Msg#: 4641996 posted 1:25 am on Feb 3, 2014 (gmt 0)
I put "now what?" in the header because I've only just noticed this behavior while --stop me if you've heard this one-- looking for something else. Poring over raw logs tells me it's been happening since April 2013.
What it's doing: The BingPreview user-agent is picking up selected supporting files belonging to specific pages, which it helpfully identifies in the referer slot. Not all supporting files, and never the page itself. For that you have to look back 61 minutes (really) earlier in logs, where you find the ordinary bingbot getting the page. It seems to focus on one page for a while, and then turns its attention to a different one.
Here is the earliest specimen I can find in logs. This particular page continued into May, but by then it was branching out into others. Note the timestamps. It isn't coincidence; it's always like that.
"bingbot" = Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
"BingPreview" = Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/534+ (KHTML, like Gecko) BingPreview/1.0b
Further notes: Piwik is analytics. Ordinary robots are barred; previews require brute force. The image /list_tape.png is visible on the page but is never mentioned in the page's own HTML; it's only referenced in CSS. (I had to look this up. I thought it would give .css as referer, but it really does come through as the page itself.) The page uses one other image file, which was never requested.
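For anyone who wants to check their own logs for the same pattern, here is a rough sketch of the matching logic: for each BingPreview request, take the page named in the referer and look for the most recent earlier bingbot fetch of that page. It assumes the log lines have already been parsed into (timestamp, user-agent, path, referer) tuples. The function name `pair_previews`, the host example.com, the page path /page.html and the sample timestamps are all invented for illustration; only the two user-agent strings and /list_tape.png come from the notes above.

```python
from datetime import datetime
from urllib.parse import urlparse

BINGBOT_UA = "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
PREVIEW_UA = ("Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/534+ "
              "(KHTML, like Gecko) BingPreview/1.0b")


def pair_previews(records):
    """Match each BingPreview request to the latest earlier bingbot fetch
    of the page named in its referer.

    records: iterable of (timestamp, user_agent, path, referer) tuples.
    Returns a list of (requested_path, page_path, lag) tuples.
    """
    last_bingbot_fetch = {}  # page path -> time of most recent bingbot fetch
    pairs = []
    for when, agent, path, referer in sorted(records, key=lambda r: r[0]):
        if "BingPreview" in agent:
            page = urlparse(referer).path  # referer is a full URL; keep the path
            fetched = last_bingbot_fetch.get(page)
            if fetched is not None:
                pairs.append((path, page, when - fetched))
        elif "bingbot" in agent:
            last_bingbot_fetch[path] = when
    return pairs


# Invented sample records (not real log data), spaced 61 minutes 20 seconds apart.
sample = [
    (datetime(2014, 1, 1, 10, 0, 0), BINGBOT_UA, "/page.html", "-"),
    (datetime(2014, 1, 1, 11, 1, 20), PREVIEW_UA, "/list_tape.png",
     "http://example.com/page.html"),
]

pairs = pair_previews(sample)
for requested, page, lag in pairs:
    print(requested, "for", page, "lag", lag)
```

A preview request with no earlier bingbot fetch of its referer page is simply skipped, so anything that doesn't fit the pattern won't show up in the output.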
Msg#: 4641996 posted 4:40 am on Feb 3, 2014 (gmt 0)
Yes, there were pages of discussion here about that back then. Bing Preview shows a cached thumbnail version (that they create) of images in image search, and if someone clicks to see it larger, they politely deliver the image, the page, and its CSS and JS files. Isn't that special? Saves you all that actual visitor bandwidth, you know. Best part? Because the visitor is not actually on your site, you never get to know what their IP or activity is. |Preview| is in the list of blocked UAs for some of my sites where the images are not for 'borrowing'.
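A minimal sketch of that kind of UA check, for illustration only. The real block would normally live in the server configuration, and how it is actually set up on those sites isn't stated here; the name `is_blocked` and the one-entry fragment list are assumptions. It is a plain case-sensitive substring match on the user-agent string.

```python
# Assumed blocklist: substrings that mark a user-agent as unwelcome.
BLOCKED_FRAGMENTS = ("Preview",)

BINGBOT_UA = "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
PREVIEW_UA = ("Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/534+ "
              "(KHTML, like Gecko) BingPreview/1.0b")


def is_blocked(user_agent, fragments=BLOCKED_FRAGMENTS):
    """Return True if any blocked fragment appears in the user-agent string."""
    return any(fragment in user_agent for fragment in fragments)


print(is_blocked(PREVIEW_UA))   # BingPreview contains "Preview"
print(is_blocked(BINGBOT_UA))   # plain bingbot does not
```

Note that a match on "Preview" catches any agent with that word in its UA string, not just Bing's.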
Msg#: 4641996 posted 7:29 am on Feb 3, 2014 (gmt 0)
I remember when Bing Preview first came out in 2012 there was lots of discussion. Did anyone ever explain the 61 minutes and 20 seconds part? The most recent incident-- the one I finally noticed-- was:
21:43:03 page request by bingbot
22:44:43 BingPreview requests for supporting files
Oops, that one's 61 minutes and 40 seconds. Guess they're losing speed.
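For the record, a quick check of that arithmetic, assuming both timestamps fall on the same day (the variable names are just for this sketch):

```python
from datetime import datetime

FMT = "%H:%M:%S"
page_request = datetime.strptime("21:43:03", FMT)     # bingbot gets the page
preview_request = datetime.strptime("22:44:43", FMT)  # BingPreview gets the files

lag = preview_request - page_request
minutes, seconds = divmod(int(lag.total_seconds()), 60)
print(f"{minutes} minutes {seconds} seconds")  # prints "61 minutes 40 seconds"
```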