lucy24 - 5:04 am on Dec 8, 2012 (gmt 0)
:: further bump ::
Stop the bleepin presses. This just in. Verbatim from logs with a minimum of snips. The Forums will probably eat the distinctive double-spaces in each UA.
131.253.36.202 - - [07/Dec/2012:11:31:28 -0800] "GET /fonts/naamajut.html HTTP/1.1" 200 4774 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; SLCC1; .NET CLR 1.1.4325; .NET CLR 2.0.50727)"
131.253.36.202 - - [07/Dec/2012:11:31:29 -0800] "GET /piwik/piwik.js HTTP/1.1" 200 21927 "http://www.example.com/fonts/naamajut.html" {same}
131.253.36.202 - - [07/Dec/2012:11:31:30 -0800] "GET /sharedstyles.css HTTP/1.1" 200 2984 {et cetera}
131.253.36.206 - - [07/Dec/2012:11:31:31 -0800] "GET /fonts/fontstyles.css HTTP/1.1" 200 3191 {et cetera}
131.253.36.205 - - [07/Dec/2012:11:31:33 -0800] "GET /piwik/piwik.php?action_name=Naamajut& {et cetera}
131.253.26.244 - - [07/Dec/2012:13:08:06 -0800] "GET /fonts/legacy.html HTTP/1.1" 200 10247 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.40607)"
131.253.26.244 - - [07/Dec/2012:13:08:06 -0800] "GET /piwik/piwik.js HTTP/1.1" 200 21927 "http://www.example.com/fonts/legacy.html" {same}
131.253.26.244 - - [07/Dec/2012:13:08:07 -0800] "GET /sharedstyles.css HTTP/1.1" 200 2984 {et cetera}
131.253.26.244 - - [07/Dec/2012:13:08:07 -0800] "GET /fonts/fontstyles.css HTTP/1.1" 200 3190 {et cetera}
65.55.212.65 - - [07/Dec/2012:13:08:13 -0800] "GET /piwik/piwik.php?action_name=Legacy%20Fonts& {et cetera}
Note the dutiful collection of all associated files-- except images (7 on one page, 31 on the other).
I checked previous days: this is brand-new. If it hadn't been for that sudden IP swap at the end I wouldn't even have noticed-- all that css and js activity was enough to make it pass for human. I'd never got around to blocking the plainclothes bingbot from this address. Never thought of it, in fact.
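For anyone who wants to hunt for the same handoff in their own logs, here is a rough sketch, not a polished tool. It only reads the front of each line (IP, timestamp, request path), so it copes with snipped lines like the ones above. The function names and the 10-second window are my own assumptions, and on a busy site you would first need to filter down to the UA in question, since otherwise every pair of neighboring visitors looks like a "swap".

```python
import re
from datetime import datetime

# Matches only the start of an Apache-style line: IP, [timestamp], "METHOD path
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)')

def parse(line):
    """Return (ip, datetime, path), or None if the line doesn't match."""
    m = LINE_RE.match(line)
    if not m:
        return None
    ip, ts, _method, path = m.groups()
    # %b expects English month abbreviations (the default C locale).
    when = datetime.strptime(ts, "%d/%b/%Y:%H:%M:%S %z")
    return ip, when, path

def ip_swaps(lines, window=10):
    """Yield (previous, current) where the IP changes within `window` seconds.

    `window` is an arbitrary threshold chosen for illustration.
    """
    prev = None
    for line in lines:
        cur = parse(line)
        if cur is None:
            continue
        if prev and cur[0] != prev[0]:
            if (cur[1] - prev[1]).total_seconds() <= window:
                yield prev, cur
        prev = cur

# Three of the lines quoted above, snipped the same way.
sample = [
    '131.253.26.244 - - [07/Dec/2012:13:08:07 -0800] "GET /sharedstyles.css HTTP/1.1" 200 2984',
    '131.253.26.244 - - [07/Dec/2012:13:08:07 -0800] "GET /fonts/fontstyles.css HTTP/1.1" 200 3190',
    '65.55.212.65 - - [07/Dec/2012:13:08:13 -0800] "GET /piwik/piwik.php?action_name=Legacy%20Fonts&',
]

swaps = list(ip_swaps(sample))
for before, after in swaps:
    gap = (after[1] - before[1]).total_seconds()
    print(before[0], "->", after[0], "after", gap, "seconds")
```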
Notice the piwik queries? Normally with robots it's simply "idsite=1" identifying the domain. This was a full-blown query string in every detail, meaning that the visitor executed the preceding JavaScript and sent humanoid information. Most of it is so much Hungarian to me, but the very last item is:
&res=800x600
Is it now. Fancy that. And what are we to make of that six-second pause between the last two requests? Is that Robot A (131.253.x.x) signing off and handing its unfinished jobs to Robot B (65.55.x.x)?
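If the query string is Hungarian to you too, the standard library will at least split it into named pieces. A minimal sketch follows. The example request is stitched together from only the three fragments quoted in this post (action_name, idsite, res); it is not the real logged string, which had far more in it. The helper names are made up, and "anything beyond idsite means the script ran" is my rule of thumb, not something piwik documents.

```python
from urllib.parse import urlsplit, parse_qs

def piwik_params(request_path):
    """Split the query string of a logged request into a flat dict."""
    query = urlsplit(request_path).query
    # parse_qs returns lists of values; keep the first value of each key.
    return {key: values[0] for key, values in parse_qs(query).items()}

def ran_javascript(params):
    """Heuristic (assumption): a bare robot hit carries nothing but idsite."""
    return bool(set(params) - {"idsite"})

# Illustrative only: assembled from the fragments quoted above.
example = "/piwik/piwik.php?action_name=Naamajut&idsite=1&res=800x600"

params = piwik_params(example)
print(params.get("res"), ran_javascript(params))
```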
Tangential discovery made while investigating this one: around the end of August, msnbot-media abruptly dropped its old tripartite system-- the one I described at the beginning of this thread that goes "robots.txt, one image, page". It went on a couple of robots.txt binges-- my record seems to be 34 in a 24-hour period, still no match for the ordinary bingbot-- and has now settled into one or two robots.txt alternating with one image. Generally a very minor gif.
File under: wtf?
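On the "34 in a 24-hour period" figure: that is a rolling 24 hours, not a calendar day. Here is a sketch of one way to count it, assuming you have already pulled the timestamps of one robot's robots.txt requests out of the log. The function name is mine and the sample times are invented for illustration, not taken from my logs.

```python
from datetime import datetime, timedelta

def max_in_window(times, window=timedelta(hours=24)):
    """Largest number of timestamps falling inside any rolling `window`."""
    times = sorted(times)
    best = 0
    start = 0
    for end, t in enumerate(times):
        # Drop hits that are a full window (or more) older than this one.
        while t - times[start] >= window:
            start += 1
        best = max(best, end - start + 1)
    return best

# Invented timestamps: hits at these hour offsets from an arbitrary start.
base = datetime(2012, 8, 30, 0, 0, 0)
hits = [base + timedelta(hours=h) for h in (0, 1, 5, 23, 24, 30, 50)]

busiest = max_in_window(hits)
print(busiest)
```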