lucy24 - 10:58 pm on Aug 21, 2011 (gmt 0)
Holy ###. I was all set to start a thread on the Bingbot's new clothes, and here it's been going on for a year and I'm just late to the party. Here's my version anyway (cut&paste from draft because I had to pull in and edit a bunch of logs):
____
I just noticed this today; checked back and found a few more over the last couple of days. The first occurrence I found was on the 10th, and then nothing else back to mid-July when I got tired of looking. (It's too complicated for Spotlight, so you have to open the individual log files. Yawn.)
I have to put this first attempt in full:
65.52.32.124 - - [10/Aug/2011:16:49:10] "GET /fun/AlonzoMelissa.html HTTP/1.1" 403 2303 "-" "-"
65.52.32.124 - - [10/Aug/2011:16:49:20] "GET /fun/AlonzoMelissa.html HTTP/1.1" 403 2247 "-" "-"
65.52.32.124 - - [10/Aug/2011:16:49:30] "GET /fun/AlonzoMelissa.html HTTP/1.1" 403 2247 "-" "-"
65.52.32.124 - - [10/Aug/2011:16:49:40] "GET /fun/AlonzoMelissa.html HTTP/1.1" 403 2247 "-" "-"
65.52.32.124 - - [10/Aug/2011:16:49:51] "GET /fun/AlonzoMelissa.html HTTP/1.1" 403 2247 "-" "-"
65.52.32.124 - - [10/Aug/2011:16:50:01] "GET /fun/AlonzoMelissa.html HTTP/1.1" 403 2247 "-" "-"
65.52.32.124 - - [10/Aug/2011:16:50:11] "GET /fun/AlonzoMelissa.html HTTP/1.1" 403 2247 "-" "-"
65.52.32.124 - - [10/Aug/2011:16:50:21] "GET /fun/AlonzoMelissa.html HTTP/1.1" 403 2247 "-" "-"
A bit slow on the uptake, are we, bingbot? (So was I, because at the time I must have just glanced at the series of 403s and at the requested title and assumed it was my Ukrainians, forgetting that they now get a 301.) Blank UAs get slapped with an automatic 403. No use saying "But I'm from Bing! Honest! I just didn't make it to the laundromat in time!"
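For anyone who would rather not open each log file by hand, here is a minimal Python sketch that pulls out the blank-UA requests. It assumes every line has the same layout as the ones quoted above (note: no timezone inside the brackets); the names LOG_LINE, parse_line and blank_ua_hits are made up for illustration, and the two sample lines are copied from this post.

```python
import re

# Assumed layout, matching the lines quoted in this post:
# ip - - [timestamp] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<ua>[^"]*)"$'
)

def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    m = LOG_LINE.match(line.strip())
    return m.groupdict() if m else None

def blank_ua_hits(lines):
    """Yield parsed entries whose user-agent is empty or "-"."""
    for line in lines:
        entry = parse_line(line)
        if entry and entry["ua"] in ("", "-"):
            yield entry

sample = [
    '65.52.32.124 - - [10/Aug/2011:16:49:10] "GET /fun/AlonzoMelissa.html HTTP/1.1" 403 2303 "-" "-"',
    '207.46.199.193 - - [19/Aug/2011:21:50:55] "GET /robots.txt HTTP/1.1" 200 806 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"',
]

for hit in blank_ua_hits(sample):
    print(hit["ip"], hit["time"], hit["request"], hit["status"])
```

To run it over a real file, pass the open file object to blank_ua_hits instead of the sample list. Lines in some other format are simply skipped.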
After that, they must have got the message, because when they tried it again there was a pattern:
207.46.199.193 - - [19/Aug/2011:21:50:55] "GET /robots.txt HTTP/1.1" 200 806 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.55.50.11 - - [19/Aug/2011:21:57:53] {robots.txt} "-" {bingbot}
65.52.108.24 - - [19/Aug/2011:22:09:09] {robots.txt} "-" {bingbot}
65.52.108.24 - - [19/Aug/2011:22:09:46] {one html file} "-" "-"
65.52.108.24 - - [19/Aug/2011:23:26:46] {another html file} "-" "-"
157.55.16.87 - - [20/Aug/2011:18:31:57] {robots.txt} "-" {bingbot}
157.55.16.87 - - [20/Aug/2011:18:32:29] {one html file} "-" "-"
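The pattern above (robots.txt fetched as bingbot, then a page fetched from the same IP with a blank UA) can be checked mechanically. This is only a rough sketch: the function name blank_after_robots and the two-hour window are my own assumptions, the entries are the abbreviated lines from this post with "bingbot" standing in for the full UA string, and the html paths are left as the placeholders used above.

```python
from datetime import datetime, timedelta

# Timestamp layout used in the quoted lines (no timezone).
# %b expects English month abbreviations such as "Aug".
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S"

def blank_after_robots(entries, window=timedelta(hours=2)):
    """Return (robots_entry, blank_entry) pairs from the same IP.

    entries: iterable of (ip, timestamp, path, user_agent) tuples,
    assumed to be in chronological order.
    """
    last_robots = {}  # ip -> (time, entry) of latest bingbot robots.txt fetch
    pairs = []
    for entry in entries:
        ip, stamp, path, ua = entry
        when = datetime.strptime(stamp, TIME_FORMAT)
        if path == "/robots.txt" and "bingbot" in ua.lower():
            last_robots[ip] = (when, entry)
        elif ua in ("", "-") and ip in last_robots:
            seen, robots_entry = last_robots[ip]
            if when - seen <= window:
                pairs.append((robots_entry, entry))
    return pairs

# Abbreviated entries taken from the lines above.
entries = [
    ("157.55.50.11", "19/Aug/2011:21:57:53", "/robots.txt", "bingbot"),
    ("65.52.108.24", "19/Aug/2011:22:09:09", "/robots.txt", "bingbot"),
    ("65.52.108.24", "19/Aug/2011:22:09:46", "{one html file}", "-"),
    ("65.52.108.24", "19/Aug/2011:23:26:46", "{another html file}", "-"),
    ("157.55.16.87", "20/Aug/2011:18:31:57", "/robots.txt", "bingbot"),
    ("157.55.16.87", "20/Aug/2011:18:32:29", "{one html file}", "-"),
]

for robots_entry, blank_entry in blank_after_robots(entries):
    print(blank_entry[0], robots_entry[1], "->", blank_entry[1])
```

Shrinking the window to a minute keeps only the requests that follow robots.txt almost immediately and drops the 23:26:46 one.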
Then, to show that they still know how to do it right:
65.52.104.21 - - [20/Aug/2011:21:22:07] {robots.txt} 301 "-" {bingbot}
65.52.104.21 - - [20/Aug/2011:21:22:08] {robots.txt} "-" {bingbot}
65.52.104.21 - - [20/Aug/2011:21:22:46] {one html file} 301 "-" {bingbot}
The logs don't say, but the 301 here is because they were aiming for a without-www URL. Note that they didn't follow the 301 and pick up the actual file. This is normal for the bingbot on my site. (I got curious. They picked up the correctly named file back in early July, so presumably they said "Naah, not worth the trouble" this time around.)
157.55.16.87 - - [21/Aug/2011:02:15:02] {robots.txt} "-" {bingbot}
157.55.16.87 - - [21/Aug/2011:02:15:59] {one html file} "-" "-"
157.55.16.87 - - [21/Aug/2011:06:11:14] {robots.txt} "-" {bingbot}
157.55.16.87 - - [21/Aug/2011:06:12:11] {one html file} "-" "-"
157.55.16.87 - - [21/Aug/2011:12:30:36] {robots.txt} "-" {bingbot}
157.55.16.87 - - [21/Aug/2011:12:31:29] {one html file} "-" "-"
Does this make any sense? At all? Whatsoever?