65.55.105.11 - - [03/Jul/2008:16:57:02 +0100] "GET /robots.txt HTTP/1.1" 200 4830 "-" "msnbot/1.1 (+http://search.msn.com/msnbot.htm)"
65.55.105.11 - - [03/Jul/2008:16:57:02 +0100] "GET /MyFolder/MyPage.html HTTP/1.1" 200 13114 "-" "msnbot/1.1 (+http://search.msn.com/msnbot.htm)"
65.55.109.zzz - - [03/Jul/2008:16:57:50 +0100] "GET /SameFolder/SamePage.html HTTP/1.0" 200 13114 "http://search.live.com/results.aspx?On-Topic-keyword" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; .NET CLR 1.1.4322)"
Interesting...
I'd like to spot check my logs and look for something similar, but MSN/Live banned my site after I complained to them about robots.txt violations... Nice policy, there... :(
Jim
Three subsequent page requests immediately followed.
It should be noted that this "same page" is part of a frame-set.
The next three requests were a result of the frame-set main page request. Image requests were still absent.
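For anyone who wants to spot check their own logs for the same pattern, here is a rough sketch of how I'd look for it, assuming Apache combined log format. The function names, the /16 prefix match, and the five-minute window are my own assumptions, not anything MSN documents, and the sample lines use made-up private addresses. It flags any non-bot request that arrives shortly after an msnbot fetch of the same path from the same first two octets.

```python
import re
from datetime import datetime, timedelta

# Apache "combined" log format; field names here are illustrative.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse(line):
    """Return a dict for one log line, or None if it does not match."""
    m = LOG_RE.match(line)
    if not m:
        return None
    rec = m.groupdict()
    rec["ts"] = datetime.strptime(rec["ts"], "%d/%b/%Y:%H:%M:%S %z")
    return rec

def prefix16(ip):
    """First two octets, so masked addresses like 65.55.109.zzz still work."""
    return ".".join(ip.split(".")[:2])

def find_followups(lines, bot_token="msnbot", window=timedelta(minutes=5)):
    """Flag non-bot requests that follow a bot fetch of the same path
    from the same /16 within `window` (an assumed threshold)."""
    last_bot = {}  # (prefix, path) -> time of most recent bot fetch
    hits = []
    for line in lines:
        rec = parse(line)
        if rec is None:
            continue
        key = (prefix16(rec["ip"]), rec["path"])
        if bot_token in rec["agent"].lower():
            last_bot[key] = rec["ts"]
        elif key in last_bot and timedelta(0) <= rec["ts"] - last_bot[key] <= window:
            hits.append(rec)
    return hits

# Made-up sample lines (private addresses), shaped like the ones in this thread.
SAMPLE = [
    '10.1.105.11 - - [03/Jul/2008:16:57:02 +0100] "GET /page.html HTTP/1.1" 200 13114 "-" "msnbot/1.1 (+http://search.msn.com/msnbot.htm)"',
    '10.1.109.20 - - [03/Jul/2008:16:57:50 +0100] "GET /page.html HTTP/1.0" 200 13114 "http://search.live.com/results.aspx?q=example" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; .NET CLR 1.1.4322)"',
    '10.9.1.1 - - [03/Jul/2008:16:58:10 +0100] "GET /page.html HTTP/1.1" 200 13114 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    for rec in find_followups(SAMPLE):
        print(rec["ip"], rec["ts"], rec["path"], rec["agent"])
```

To run it against a real log, pass an open file object instead of SAMPLE. A hit only means the timing and address range line up; whether image requests followed still has to be checked by eye.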
but MSN/Live banned my site after I complained to them about robots.txt violations.
I've had some communication problems with SEs on my few inquiries: rather than seeking a solution to a particular problem, the respondent (at least as a general rule) interprets the inquiry as a request to cease all spidering.
As a result, I'm reluctant to even say hello as they pass by ;)
Don
Otherwise, the loss in traffic has been negligible.
I told MSN/Live what was wrong, and asked them to fix the problem. Instead, they just removed all my pages. These pages still get spidered, just not listed in the results. And as you know, I don't allow spiders to waste my bandwidth for nothing in return so I blocked them, too. Fair is fair.
Anyway, I don't want to hijack this thread, so enough about that.
There's a fair chance that they may be running a cloaking detector -- A good thing to look out for if they ever deign to list my site again. Since I use "user-agent delivery" to serve some custom content to different browsers, I need to be sure they can figure out that I'm not "cloaking to deceive" anybody.
Jim