Referer (sic) question

         

tangor

8:05 am on Sep 12, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



First time I've seen this. Something new or anything to be concerned with? Referer from the raw logs. Looks like this:

Referer: http://example.com/somefolder/somefile.html

or image or css ...

All logical, but I have never seen "Referer: " in that log column before. Single IP, one time only (this month). Polish ISP, a comparatively normal "human" visit with an appropriate number of requests for html, images, css, and pdf.

UA is rather normal:

Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:71.0) Gecko/20100101 Firefox/71.0

Posted as a query and a chuckle that referers are now tacking that appellation along the way!

lucy24

3:46 pm on Sep 12, 2020 (gmt 0)

Do you mean that the literal string “Referer: ” occurs as part of the referer? But only for some requests, not all of them? (If it were universal, you could point to someone at your host for being a dope.)

It could be a robot; I've occasionally met ones whose UA starts with “User-Agent: ” indicating that they didn’t spend enough time studying the Your New Robot manual. Or it could be a human experimenting with their browser’s Referer settings.

fwiw, the cited UA seems to be more popular with robots than humans.

:: poring over raw logs ::

On closer inspection, almost all occurrences of that UA are (a) from a specific IP in Poland, probably a compromised machine, and (b) requests to the http site, receiving a redirect that is not followed up. Huh.

:: search for literal string “Referer” ::

Oh, willya look at that. Same IP mentioned above.
95.160.35.abc - - [03/Sep/2020:09:08:11 -0700] "GET /ebooks/horn/ HTTP/1.1" 200 11678 "Referer: http://example.com/ebooks/horn/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:71.0) Gecko/20100101 Firefox/71.0" 
That seems to be the only literal Referer within this calendar year.
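For anyone who wants to repeat that search across a whole log, here is a minimal Python sketch. It assumes the Apache/NCSA "combined" log format shown in the line above; the names `COMBINED`, `LEAKED_NAME` and `has_leaked_header_name` are made up for illustration, and the pattern does not try to handle escaped quotes inside a field.

```python
import re

# Apache/NCSA "combined" format (assumed):
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# A header name pasted into the header value itself, e.g. "Referer: http://..."
LEAKED_NAME = re.compile(r'^(referer|user-agent)\s*:', re.IGNORECASE)


def has_leaked_header_name(line: str) -> bool:
    """True if the referer or UA field begins with a literal header name."""
    m = COMBINED.match(line)
    if m is None:
        return False  # not a combined-format line; ignore it
    return bool(
        LEAKED_NAME.match(m.group("referer"))
        or LEAKED_NAME.match(m.group("agent"))
    )
```

Feeding each line of the raw log through `has_leaked_header_name` would catch both the “Referer: ” case here and the “User-Agent: ” robots mentioned earlier.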

tangor

3:13 am on Sep 13, 2020 (gmt 0)

For fun I might let it have another bite, just to see what it actually does. This was, to all appearances, a "human-like" visit that took 167 hits (including css, images, html, pdf).

No strain, just STRANGE. :)

iamlost

2:51 pm on Sep 13, 2020 (gmt 0)

There used to be (a couple of years ago; I'm not sure whether it still exists) a bug in Google Chrome such that, instead of (1) excluding the Referer header field or (2) sending a value of [ about:blank ] as per RFC 7231, it reflected the page URL, much as in your examples.

However, as lucy24 reports FF 71, it could mean that the visitor uses the ‘Referer Modifier’ extension (enhancement: fake target domain referer [github.com]) or similar.

Else it could be signs of site AI incipience... self referential == self awareness...

lucy24

4:47 pm on Sep 13, 2020 (gmt 0)

I feel confident in saying that the specific request I cited is a robot: an IP with other attested robotic activity--most with the identical UA--and no requests other than the page itself. Since it appears to originate from a human ISP, it is possible the botrunner took the easy way out and is sending the actual UA of the compromised human. I don't know how FF numbering works on Linux. If it aligns with Mac and Windows, 71 would have to be called oldish.

I just went and checked the headers for this lone request. (Most requests from this IP get a 301, for which I don't log headers. This one was straight to https, not redirected; maybe they were experimenting with robot options.) Nothing blatantly robotic in the headers ... but I do note
Accept: */*
which is more characteristic of robots, including some-but-not-all legitimate ones.

While looking this up, I found that one (unrelated) robot consistently and dimwittedly says
Accept: */*;q=0.9,*/*;q=0.8
--but that particular robot would be blocked anyway.
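A rough Python sketch of both Accept checks, in case it is useful. The function names and the labels "wildcard-only" and "duplicate-range" are invented for this example, and, as noted above, a bare */* is only a hint, since some legitimate robots send it too.

```python
def accept_ranges(value: str) -> list:
    """Media ranges from an Accept header, lowercased, parameters dropped."""
    ranges = []
    for part in value.split(","):
        media_range = part.split(";", 1)[0].strip().lower()
        if media_range:
            ranges.append(media_range)
    return ranges


def accept_oddities(value: str) -> set:
    """Return labels for the two patterns described in the post."""
    ranges = accept_ranges(value)
    found = set()
    if ranges == ["*/*"]:
        # Accept: */* and nothing else
        found.add("wildcard-only")
    if len(ranges) != len(set(ranges)):
        # the same range listed twice, e.g. */*;q=0.9,*/*;q=0.8
        found.add("duplicate-range")
    return found
```

The result is a set so that a caller can weigh the hints alongside other signals instead of blocking on any single one.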

dstiles

9:44 am on Sep 14, 2020 (gmt 0)

> don't know how FF numbering works on Linux.

Same version numbers as on other platforms. My current Firefox (which I almost never use) is 80.0.1.

blend27

5:51 pm on Sep 21, 2020 (gmt 0)

> ... self referential == self awareness...

or == self denial

lucy24

7:51 pm on Sep 21, 2020 (gmt 0)

Infuriatingly, there exists one generally law-abiding robot--I forget who, not a major search engine--that insists on putting the original request into the Referer slot when getting redirected (as from http to https). This is a flat violation of The Rules, which say you are supposed to preserve the original referer when making a redirected request. Hmph.

Another law-abiding robot gives your own root as referer when requesting that same root, requiring me to poke a hole.

I can't block all auto-referers--that is, I could, but it would require rewriting all page requests to php because mod_rewrite can’t do it--but I definitely bar the ones fitting some common patterns.
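For log analysis (as opposed to live blocking), the auto-referer test is easy to sketch in Python. This is only an illustration: `is_auto_referer` is a hypothetical helper, and ignoring the scheme is an assumption made here so that the http-to-https robot described above is caught as well.

```python
from urllib.parse import urlsplit


def is_auto_referer(referer: str, host: str, path: str) -> bool:
    """True if the referer names the very URL being requested.

    host is the request's Host header, path is the request target
    (including any query string). The scheme is deliberately ignored.
    """
    if not referer or referer == "-":
        return False
    try:
        parts = urlsplit(referer)
    except ValueError:
        return False  # malformed referer, e.g. broken IPv6 brackets
    if parts.scheme not in ("http", "https") or parts.hostname is None:
        return False
    ref_path = parts.path or "/"
    if parts.query:
        ref_path += "?" + parts.query
    # urlsplit already lowercases hostname; lowercase ours to match
    return parts.hostname == host.lower() and ref_path == path
```

Any robot that has earned a hole, such as the one requesting the root with the root as referer, would need to be allowed for before acting on the result.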

blend27

10:14 pm on Sep 21, 2020 (gmt 0)

New Rules! (love his show btw): two colons (:) in the Referer gets you the boot! Simple as that.
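As a sketch, that rule is a one-liner in Python. The name `too_many_colons` and the default limit of 2 are just this example's reading of the rule; note that a legitimate referer with an explicit port, or with a colon in its query string, would be booted too.

```python
def too_many_colons(referer: str, limit: int = 2) -> bool:
    """True if the referer contains `limit` or more colons."""
    return referer.count(":") >= limit
```

A plain `http://example.com/` referer has one colon and passes; `Referer: http://example.com/ebooks/horn/` has two and does not.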

blend27

10:28 pm on Sep 21, 2020 (gmt 0)

The only thingy, as far as I could tell, with the word Referer in my logs would be this (someone f..ing around):
 () { Referer; }; echo -e "Content-Type: text/plain\n"; echo -e "\0141\0142\0165\0156\0245\0464\0151\0170\0162\0150\0145\0153\0158\0163\0150\0157\0144\0153"

Sometime in Aug 2016.

That line was the actual Referer field value.
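A header value that opens with "() {" has the shape of a Shellshock (CVE-2014-6271) probe: a bash function definition followed by commands. A minimal Python sketch for flagging such values follows; `looks_like_shellshock` is a name invented for this example, and it checks only the opening signature without trying to decode the payload.

```python
import re

# Bash function-definition prefix "() {" at the start of a header value,
# allowing for leading whitespace and optional spacing before the brace.
SHELLSHOCK_PREFIX = re.compile(r'^\s*\(\)\s*\{')


def looks_like_shellshock(header_value: str) -> bool:
    """True if a header value starts like a bash function definition."""
    return SHELLSHOCK_PREFIX.match(header_value) is not None
```

The same check can be applied to the User-Agent and Cookie fields, since a probe can put the string in any request header.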

lucy24

10:30 pm on Sep 21, 2020 (gmt 0)

Yeah, some people really should spend more time reading the Your New Robot manual.

tangor

6:41 am on Sep 22, 2020 (gmt 0)

> Yeah, some people really should spend more time reading the Your New Robot manual.


WHAT? And spoil all our fun?

Happy part is we can spot 'em ... most of the time. Whew!

blend27

8:33 pm on Sep 23, 2020 (gmt 0)

> ..most of the time. Whew!..

That is in line with my daily outing in my garden. That darn dark-ash colored bird which goes after my, my, my fig(sh) tree, horrible. And a senile alcoholic neighbor who keeps talking to his 4 make-believe dogs in a combination of Polish, Arabic and 2 other unearthly gigs he learned....