Forum Moderators: phranque


those nonexistent # links

         

lucy24

11:29 pm on Jun 9, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



In our last episode, we'd established that the # fragment identifier will never ever show up in logs, except possibly in referers (depending apparently on UA). It's used internally by the browser; the server never even sees it.

So what am I to say when this

79.41.nn.nn - - [date] "GET /fonts/custom_greek.html#download HTTP/1.1" 200 11775 "-" "Mozilla/5.0 (Windows NT 6.1; rv:11.0) Gecko/20100101 Firefox/11.0"

or similarly this (same person-- and it's definitely a human person)

[same IP] - - [later date] "GET /fonts/custom_greek.html#background HTTP/1.1" 200 11775 "-" "[same UA]"

shows up? These are "cold" visits, not preceded by the unmarked page.

Some kind of glitch in Firefox 11 bookmarking? If so, it's platform-specific; I tried it myself (with a different fragment on a different page) and logs only showed the page.

Where's that "wtf" favicon?

g1smd

12:07 am on Jun 10, 2012 (gmt 0)




"and it's definitely a human person"

Sure? Really sure?

lucy24

12:33 am on Jun 10, 2012 (gmt 0)




For starters, it's an Italian visiting a page that's popular with Italians. The GET requests containing fragments came after a couple of fragmentless visits to the same page several hours earlier in the day-- earlier in their time zone, that is-- with google.it as referer. Different and more plausible UA (Chrome for Mac visiting a Mac-specific page) but identical IP.

:: detour to re-inspect raw logs ::

Oh, now that's interesting. The first fragmented request was accompanied by requests for the images only (no css or js) from the same IP but UA "Java/1.6.0_31", which is blocked. The second one was immediately followed (not preceded) by a request for the fragmentless page only, with the Firefox-for-Windows UA.

They're real fragments, if it matters.
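For anyone who wants to hunt for these in their own logs, here is a rough Python sketch that pulls out every request whose target contains a "#" and groups the hits by IP and UA. It assumes the usual combined log format; the regex, the function name fragment_requests and the second sample line (the fragmentless one) are made up for illustration, not taken from a real log.

```python
import re
from collections import defaultdict

# Assumed layout: Apache/nginx "combined" log format.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<when>[^\]]*)\] '
    r'"(?P<method>\S+) (?P<target>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<ua>[^"]*)"$'
)

def fragment_requests(lines):
    """Group request targets that contain '#' by (ip, user agent)."""
    hits = defaultdict(list)
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m and "#" in m.group("target"):
            hits[(m.group("ip"), m.group("ua"))].append(m.group("target"))
    return dict(hits)

# First line is the one quoted in this thread; the second is a
# hypothetical fragmentless request added for contrast.
sample = [
    '79.41.nn.nn - - [date] "GET /fonts/custom_greek.html#download HTTP/1.1" '
    '200 11775 "-" "Mozilla/5.0 (Windows NT 6.1; rv:11.0) Gecko/20100101 Firefox/11.0"',
    '79.41.nn.nn - - [date] "GET /fonts/custom_greek.html HTTP/1.1" '
    '200 11775 "-" "Mozilla/5.0 (Windows NT 6.1; rv:11.0) Gecko/20100101 Firefox/11.0"',
]

for (ip, ua), targets in fragment_requests(sample).items():
    print(ip, ua, targets)
```

Grouping by IP plus UA makes it easier to see whether the fragment requests travel with a second, less browser-like UA from the same address.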

incrediBILL

5:13 pm on Jun 12, 2012 (gmt 0)




OK, using curl I can run "curl example.com/this#that" and it shows up in the log file like this:

[12/Jun/2012:17:09:20 +0000] "GET /this#that HTTP/1.1" 404 1217 "-" "curl/7.15.5 (i386-redhat-linux-gnu) libcurl/7.15.5 OpenSSL/0.9.8b zlib/1.2.3 libidn/0.6.5"

Now if I change the user agent option to mimic a browser and use an Italian proxy IP address...

So it's not definitely a person. It could be anything, most likely a bot IMO, because browsers don't send the # stuff but other tools that open sockets directly do send it.
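To make the difference concrete, here is a small Python sketch of the two behaviours: a client that drops the fragment before building the request line, and a naive tool that pastes everything after the host into the request as-is. The function names are hypothetical and nothing is actually sent over the network; it only builds the first line of the request.

```python
from urllib.parse import urlsplit

def browser_style_request_line(url: str) -> str:
    """Request line as a client that strips the fragment would build it."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return f"GET {target} HTTP/1.1"

def naive_request_line(url: str) -> str:
    """Request line from a tool that keeps the fragment in the target."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    if parts.fragment:
        target += "#" + parts.fragment
    return f"GET {target} HTTP/1.1"

url = "http://example.com/this#that"
print(browser_style_request_line(url))  # GET /this HTTP/1.1
print(naive_request_line(url))          # GET /this#that HTTP/1.1
```

Whatever ends up in that request line is what the server writes to its log, so a "#" in the logged path points at a client of the second kind.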

g1smd

8:13 pm on Jun 12, 2012 (gmt 0)




I've never seen a real browser transmit the #fragment to a server in an HTTP request.

There was some discussion here a few years back when one person thought that either Chrome or Safari (I can't remember which) had sent such a request to their server.