http://www.webmasterworld.com
Home / Forums Index / Microsoft / Microsoft Search Live

Microsoft Search Live

  
MSN Bot gone crazy?
Loads of 404s
barns101


#:1532210
 12:44 am on June 7, 2006 (utc 0)

In the past week msnbot/0.9 (+http://search.msn.com/msnbot.htm) has increasingly been showing up in my 404 error log for bizarre requests for nonexistent files like http://www.example.com/forums/templates/Aeolus/images/lang_german/post.gif (which has never existed on my site, nor does the directory path).

At first I thought the user agent was spoofed but the IP address is 65.55.246.91 and this is in fact a genuine Microsoft IP.

The bot requests seemingly random files that have never existed on my site and the frequency is increasing. Can anyone explain what's going on? :)

tantalus


#:1532211
 10:00 am on June 9, 2006 (utc 0)

"At first I thought the user agent was spoofed but the IP address is 65.55.246.91 and this is in fact a genuine Microsoft IP. "

This UA had been spoofed, coming from various IPs that belonged to BT and NTL in the UK. So, doh! I banned it.

I'm now wondering if this is in part the cause of loss of rankings with the recent update?

larryhatch


#:1532212
 10:12 am on June 9, 2006 (utc 0)

I saw the same thing. Over and over, the same set of non-existent pages / files,
all requested from an apparently genuine MS IP address.

I copied the files requested, all images, and traced those to a totally unrelated website.
[ My site is about UFOs. Other site was all about backyard pools and spas. ]

Anyhow, the "MS crawler" came back several times, each time calling the same set of bad files.
Then, suddenly, it just gave up and went away.

None of this really hurts me, I'm just curious what somebody is trying to accomplish here, and why.
It seems like a lot of work for no understandable reason.

Is this just log-spam? Just to get traffic to a pool and spa site?
If so, it's counter-productive. -Larry

barns101


#:1532213
 1:13 pm on June 9, 2006 (utc 0)

>>Anyhow, the "MS crawler" came back several times, each time calling the same set of bad files. Then, suddenly, it just gave up and went away.

Yes, it seems to have slowed/stopped now (although I have banned the UA from my forum directory).

>>Is this just log-spam? Just to get traffic to a pool and spa site?

A lot of the files requested from my website were to do with online gaming (my site is a directory of pubs in my city). I'm sure that there are no links out there pointing to all of these non-existent files on my website, so why would MSN try to retrieve them?
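The post does not say how the UA was banned from the forum directory. One option that works for a crawler that honours it is a robots.txt rule. The sketch below is only an illustration under that assumption: the rule text and the `/forums/` path are made up for the example, and Python's standard `urllib.robotparser` is used to check what the rule would allow. It does nothing against a spoofed UA that ignores robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep msnbot out of /forums/, allow everything else.
ROBOTS_TXT = """\
User-agent: msnbot
Disallow: /forums/

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

MSNBOT_UA = "msnbot/0.9 (+http://search.msn.com/msnbot.htm)"
FORUM_URL = ("http://www.example.com/forums/templates/Aeolus/"
             "images/lang_german/post.gif")

# True if the given user agent may fetch the given URL under the rules above.
msnbot_forum_allowed = parser.can_fetch(MSNBOT_UA, FORUM_URL)
msnbot_home_allowed = parser.can_fetch(MSNBOT_UA, "http://www.example.com/")
other_forum_allowed = parser.can_fetch("SomeOtherBot/1.0", FORUM_URL)
```

Under these rules msnbot is kept out of the forum directory only; the rest of the site and all other crawlers are unaffected.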

abates


#:1532214
 3:34 am on June 12, 2006 (utc 0)

I'm getting a lot of these too, always hits on graphics which I don't have and have never had on my site. It seems like this is a major bug in msnbot which has cropped up recently.

abates


#:1532215
 4:13 am on June 12, 2006 (utc 0)

In addition to the above: the earliest such hit I can find in my logs is dated the 13th of May, and the hits on my site always come from 65.55.246.76, though msnbot visits me from other IP addresses as well.

There was an initial series of hits from 13/05 to 17/05, and a second series which appears to have started on the 6th of June and is still going.

larryhatch


#:1532216
 4:46 am on June 12, 2006 (utc 0)

I've asked this before and never got a good answer.

Is it possible to spoof an IP address like 65.55.246.76?

If so, it may have nothing to do with MSN and their crawlers. -Larry

barns101


#:1532217
 11:01 am on June 12, 2006 (utc 0)

Yes it is possible, but it's a lot of work for something that seems to have no benefit to the perpetrator (there is never a referring URL so no log spam).
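For anyone who wants to check whether a hit claiming to be msnbot really comes from Microsoft, a common approach is a forward-confirmed reverse DNS lookup: resolve the IP to a hostname, check the hostname's domain, then resolve that hostname back and confirm it returns the same IP. The following is a minimal sketch, not an official method. The `.search.msn.com` suffix is an assumption about how the crawler's hosts are named, and the function name and the injectable lookup parameters are made up for this example.

```python
import socket

def is_genuine_crawler(ip, allowed_suffixes=(".search.msn.com",),
                       reverse_lookup=None, forward_lookup=None):
    """Forward-confirmed reverse DNS check for a crawler IP.

    allowed_suffixes is an assumption; adjust it to whatever the
    search engine documents for its crawler hostnames.
    """
    # Default to real DNS lookups; callers can inject fakes for testing.
    reverse_lookup = reverse_lookup or (
        lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (
        lambda host: socket.gethostbyname_ex(host)[2])

    try:
        host = reverse_lookup(ip)           # IP -> hostname (PTR record)
    except OSError:
        return False                        # no reverse DNS entry

    if not host.lower().rstrip(".").endswith(tuple(allowed_suffixes)):
        return False                        # hostname is in another domain

    try:
        return ip in forward_lookup(host)   # hostname -> IPs must include ip
    except OSError:
        return False
```

Calling `is_genuine_crawler("65.55.246.76")` performs live DNS lookups, so the result depends on the DNS records at the time of the call.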

BillyS


#:1532218
 12:09 pm on June 12, 2006 (utc 0)

>>I'm getting a lot of these too, always hits on graphics which I don't have and have never had on my site. It seems like this is a major bug in msnbot which has cropped up recently.

That's the interesting thing about WebmasterWorld: I just noticed this in my logs yesterday. MSN bot is looking for these non-existent gifs in a non-existent directory.

Hey, at least my server is returning a 404 like it's supposed to.

msndude


#:1532219
 1:11 pm on June 21, 2006 (utc 0)

Is anyone still seeing this problem?

barns101


#:1532220
 11:02 pm on June 21, 2006 (utc 0)

It seems to have stopped. Can you tell us what the problem was?

msndude


#:1532221
 12:57 am on June 22, 2006 (utc 0)

Unfortunately not. If anyone sees it again, we'd be grateful if someone sent us the actual log data. (Just send me a sticky.) I realize you can't post the real thing in the forum, but without the actual records, we couldn't track it.

abates


#:1532222
 1:42 am on June 22, 2006 (utc 0)

The last hit I had of this kind was on the 13th. If I see it happening again, I'll send along some lines from my log. :)

Kelowna


#:1532223
 3:40 am on June 22, 2006 (utc 0)

Is this the kind of thing you are talking about? I see a few in my log files that look odd...

65.54.188.18 - - [21/Jun/2006:19:58:23 -0400] "GET /index/www.commnet.edu/it/security/freecreditreport.asp HTTP/1.0" 404 248 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"

65.54.188.20 - - [21/Jun/2006:19:52:44 -0400] "GET /index/www.frbsf.org/publications/consumer/credit.html HTTP/1.0" 404 247 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"

I don't have any files that look anything like that, so I'm not sure where msnbot found that link.
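To pull hits like the two above out of a larger log, something along these lines may help. It is a rough sketch that assumes the combined log format shown in the two sample lines. The regular expression and the `msnbot_404s` helper are made up for this example, and the third sample line is an invented browser request added only to show that non-bot hits are skipped.

```python
import re
from collections import Counter

# Matches access-log lines in the combined format used in the samples.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def msnbot_404s(lines):
    """Return (ip, time, path) for every 404 whose user agent mentions msnbot."""
    hits = []
    for line in lines:
        m = LOG_RE.match(line)
        if m and m["status"] == "404" and "msnbot" in m["agent"].lower():
            hits.append((m["ip"], m["time"], m["path"]))
    return hits

sample = [
    '65.54.188.18 - - [21/Jun/2006:19:58:23 -0400] "GET /index/www.commnet.edu/it/security/freecreditreport.asp HTTP/1.0" 404 248 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"',
    '65.54.188.20 - - [21/Jun/2006:19:52:44 -0400] "GET /index/www.frbsf.org/publications/consumer/credit.html HTTP/1.0" 404 247 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"',
    # Invented line: an ordinary browser 404, which should be ignored.
    '192.0.2.10 - - [21/Jun/2006:20:01:00 -0400] "GET /missing.html HTTP/1.1" 404 250 "-" "Mozilla/5.0"',
]

hits = msnbot_404s(sample)
per_ip = Counter(ip for ip, _, _ in hits)  # number of msnbot 404s per source IP
```

Grouping by IP this way makes it easy to see whether the bad requests all come from one address, as reported earlier in the thread.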

barns101


#:1532224
 4:39 pm on June 22, 2006 (utc 0)

That's similar to what I was seeing, although mine was with msnbot/0.9

 

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld ® and PubCon ® are a Registered Trademarks of WebmasterWorld Inc.
© WebmasterWorld Inc. / SearchEngineWorld 1996-2008 all rights reserved