Long referers

dstiles

8:43 pm on Sep 1, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Only noticed this today, possibly because there is not much bad traffic around at the moment so the security logs are easier to read. Or it could be new today.

Three hits today with excessively long REFERER fields: about 260-270 bytes each, which I deem excessive.

My database (MySQL) is set to a Notes field length of TINYTEXT (255 bytes) which so far, in over five years, has not, as far as I know, recorded an error. Today these three REFERERs caused database errors due to length.
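
One way to avoid that kind of insert error is to trim the value to the column's byte limit before writing it. The following is only a minimal sketch in Python, not the logging code described above: the function name `fit_tinytext` is hypothetical, and it assumes the value is stored as UTF-8.

```python
TINYTEXT_MAX_BYTES = 255  # a MySQL TINYTEXT column holds at most 255 bytes


def fit_tinytext(value: str, limit: int = TINYTEXT_MAX_BYTES) -> str:
    """Return value trimmed so that its UTF-8 encoding fits in `limit` bytes."""
    raw = value.encode("utf-8")
    if len(raw) <= limit:
        return value
    # errors="ignore" drops a multi-byte character that the slice cut in half
    return raw[:limit].decode("utf-8", errors="ignore")
```

Whether to truncate, or to widen the column instead, is a design choice; truncation keeps the schema unchanged but loses the tail of the referer.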

The accesses were reasonably spaced in time and do not appear to have been concerted "bot" activity; they came from 3 different broadband ranges (Zen, BT and Opal/Carphone, all UK). The source machines could, of course, have been compromised but the target sites seemed appropriate. They were all logged by my system as "Bad Referer" although it will take me a while to figure out what was bad about them.

Listed below as UA and REFERER (domains and actual pages obscured by me, which has reduced the actual byte-count slightly)...

Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko
http://www.google.co.uk/url?sa=t&rct=j&q=&esrc=s&source=web&cd=5&ved=0CDMQFjAEahUKEwiK6tfR0dXHAhWHiw0KHVqiDaI&url=http%3A%2F%2Fwww.example1.co.uk%2Fpage.asp&ei=VX_lVcrrHIeXNtrEtpAK&usg=AFQjCNEyhi5S8TW-bMTqCJ11FwHWnlggBg&sig2=pGBe9At0GNC3YroQyk--2Q/page.asp

Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko
http://www.google.co.uk/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&ved=0CDsQFjAAahUKEwjD7fmg3dXHAhUEcRQKHQ24BlM&url=http%3A%2F%2Fwww.example2.com%2Fpage.asp&ei=hIvlVcObHoTiUY3wmpgF&usg=AFQjCNEclI--EmO7YfdvSDutp_oMdO2OyQ/page.asp

Mozilla/5.0 (Windows NT 6.3; WOW64; Trident/7.0; rv:11.0) like Gecko
http://www.google.co.uk/url?sa=t&rct=j&q=&esrc=s&source=web&cd=11&ved=0CGIQFjAKahUKEwixuoCYr9bHAhXqF9sKHfF_AVc&url=http%3A%2F%2Fwww.example3.com%2Fpage.asp&ei=beHlVfHPGuqv7Abx_4W4BQ&usg=AFQjCNEclI--EmO7YfdvSDutp_oMdO2OyQ&sig2=CZkS3Fh-0F32SsOlc_EAIw/page.asp

All Trident 7 browsers from two different Windows OS's (but three actual machines judging by IP).
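
To see what such a referer actually carries, it can be split into its query parameters. This is a rough sketch using Python's standard `urllib.parse` on the first (obscured) referer listed above; the helper name `referer_params` is made up for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# First referer from the list above, copied as posted (already obscured)
REFERER = (
    "http://www.google.co.uk/url?sa=t&rct=j&q=&esrc=s&source=web&cd=5"
    "&ved=0CDMQFjAEahUKEwiK6tfR0dXHAhWHiw0KHVqiDaI"
    "&url=http%3A%2F%2Fwww.example1.co.uk%2Fpage.asp"
    "&ei=VX_lVcrrHIeXNtrEtpAK"
    "&usg=AFQjCNEyhi5S8TW-bMTqCJ11FwHWnlggBg"
    "&sig2=pGBe9At0GNC3YroQyk--2Q/page.asp"
)


def referer_params(referer: str):
    """Split a referer into host, path and a dict of decoded query parameters."""
    parts = urlsplit(referer)
    # keep_blank_values=True so that the empty q= parameter is not dropped
    return parts.netloc, parts.path, parse_qs(parts.query, keep_blank_values=True)


host, path, params = referer_params(REFERER)
print(host, path)
print(params["url"])   # percent-decoded target page
print(params["q"])     # search terms (blank here)
print(params["sig2"])  # last parameter, with the trailing /page.asp attached
```

Parsed this way, the `url` parameter decodes to the target page, `q` is blank, and the trailing `/page.asp` ends up as part of the last parameter's value rather than as a path.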

Does anyone know if G has extended the REFERER byte-count? Note they are not HTTPS; had it been I doubt I would have seen ANY proper REFERER.

[edited by: Ocean10000 at 12:21 am (utc) on Sep 2, 2015]
[edit reason] Unlinked [/edit]

lucy24

1:09 am on Sep 2, 2015 (gmt 0)

The page you were on is trying to send you to [example3.com...]
(Whoops! There's that autolinking bug again. Guess that goes on the Beta Checklist.)
Yes, OK, so I overlooked where you'd edited the middle of the query string. But what, exactly, is google trying to tell me-- and why do they even have a "return to previous page" option if they don't know for a fact that there was a previous page?

Know what's even more unnerving? In the course of looking up something else recently, I discovered that certain unwelcome robots have been sending enormously long cookies-- to a site they've never visited before, and which rarely uses cookies at all apart from (optional) piwik. I guess the idea is that if a WP site thinks you've been there before, it will let you see more stuff. But, hey, if they're going to go sending cookies that I never set, that's just one more way to block 'em ;)

dstiles

6:45 pm on Sep 2, 2015 (gmt 0)

example3.com was the site on my server - same for all example domains in my examples. Is that really a "return to page..." in the referer?

I only set Temp (Session) cookies on IIS but some of those are amazingly long. I would like to dispense with cookies completely but some sites need state information and that's the only way, thanks to the amazingly stupid way web protocol was originally designed. In general I don't bother to look at cookies, though, so I can't comment on bot cookies.

lucy24

7:28 pm on Sep 2, 2015 (gmt 0)

> Is that really a "return to page..." in the referer?

I pasted the entire referer (from http through page.asp) into my browser's address bar. The "return to page" part is the second line in Google's minimalist "Redirect" response screen. You can try it for yourself ;) (harmless, since all you get is a google page)

> I can't comment on bot cookies

It was news to me too, but prompted me to add
RewriteCond %{HTTP_COOKIE} .
RewriteCond %{HTTP_COOKIE} !^(goodcookie|othergoodcookie|finalgoodcookie)
RewriteRule (^|\.html|/)$ - [F]
Unfortunately I can't figure out how to put in a lockout for null cookies, where they send the header but it's empty-- another robotic behavior.

If you're suspicious of super-long referers, you could say something like
RewriteCond %{HTTP_REFERER} .{255,}

where 255 is your minimum unacceptable length. I don't remember mod_rewrite's exact syntax for the {a,b} construction, though I should because I once experimented on that very point. I don't suppose {255,65536} would hurt.
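
The open-ended `{255,}` quantifier can be checked outside Apache. Below is a small sketch in Python's `re` module, which is an assumption of convenience: mod_rewrite uses PCRE, not Python, although the two agree on this construction. The function name `referer_too_long` is hypothetical. As far as I know, PCRE documents that the numbers in a `{min,max}` quantifier must be less than 65536, so the open-ended form is the safer choice over an explicit upper bound of 65536.

```python
import re

# Same pattern as in the RewriteCond: any run of 255 or more characters
TOO_LONG = re.compile(r".{255,}")


def referer_too_long(referer: str) -> bool:
    """True when the referer is 255 characters or longer."""
    return TOO_LONG.search(referer) is not None
```

A referer of 254 characters passes, while one of 255 or more is flagged, so 255 is the first length that gets blocked.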

dstiles

6:55 pm on Sep 3, 2015 (gmt 0)

> harmless, since all you get is a google page

Oh, come on, Lucy - it's google, of course it's not harmless! :)

If that domain/page is a return page, how come it's in the first access of a fresh referer straight from an SE page? Looks very dodgy to me.

Isn't an empty field check just ^$ - of course not or you would have used it! :) I'm going on regex, of course, since sadly I do not have a linux web server.

lucy24

6:42 pm on Sep 4, 2015 (gmt 0)

> Looks very dodgy to me.

I know from past experimentation that you can paste the URL from a SERP into a different browser, and it will behave exactly as if you'd done the searching in that second browser. For that matter, you can do the same thing with WebmasterWorld links ;)

> Isn't an empty field check just ^$

Or !. ("no content") as the case may be. But that would also return "true" if the field was absent entirely. In the case of a User-Agent header it's a distinction without a difference-- that is, we don't care if the header is empty or absent, because it's no good either way. But in the case of cookies it's perfectly legitimate for a request to send no cookie header at all; it's only when they send an empty one that it's hinky.
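
The absent-versus-empty distinction is easy to state in code wherever the raw request headers are available as a mapping. This is only an illustrative sketch in Python with a hypothetical `cookie_state` helper and a plain dict standing in for the headers; it says nothing about how (or whether) mod_rewrite or IIS can expose the same distinction.

```python
def cookie_state(headers: dict) -> str:
    """Classify the Cookie header as 'absent', 'empty' or 'present'."""
    # Header names are case-insensitive, so normalise before looking up
    lowered = {name.lower(): value for name, value in headers.items()}
    if "cookie" not in lowered:
        return "absent"   # legitimate: no Cookie header was sent at all
    if lowered["cookie"].strip() == "":
        return "empty"    # header sent with no content: the suspicious case
    return "present"
```

For example, `cookie_state({"User-Agent": "x"})` gives "absent", while `cookie_state({"Cookie": ""})` gives "empty"; a regex test such as `^$` on its own cannot tell these two apart.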