
Forum Moderators: DixonJones & mademetop


Bots Picking up weird files

Google, MSN mainly, but Slurp sometimes too

     
6:49 pm on Dec 27, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Jan 13, 2004
posts:67
votes: 0


Googlebot and MSNbot have both been crawling my site looking for files that do not exist. I had been hoping they would give up after some time, but they haven't. I have set up a redirect for one file, but it still isn't taking effect. I wish I had more to say, but that is all I have for now.

I'll update with better examples.

7:08 pm on Dec 27, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


Have you seen any with almost random-letter filenames?

Or any with 404probe as part of the URL?

7:22 pm on Dec 27, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 30, 2004
posts:712
votes: 0


Google, MSN mainly, but Slurp sometimes too

That looks as if there are links pointing to those files. And as long as those links exist, bots won't give up asking.
7:27 pm on Dec 27, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Jan 13, 2004
posts:67
votes: 0


Yeah, I am thinking there are links out there, but I can't find them. Any tips on how to locate these links? I promise that as soon as I get another weird one, I will post it here.
7:42 pm on Dec 27, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 30, 2004
posts:712
votes: 0


You could try searching for www.example.com strangefilename.html
2:26 pm on Dec 28, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Jan 13, 2004
posts:67
votes: 0


Perfect, I found one BIG reason for my problems. Some research team in Mannheim was using my robots.txt file for research purposes without asking! They put the links online ... but incorrectly! Geez!

Okay ... but then I have things like this ...

They were trying to access http://www.example.com/Folder name/'name of client' from (Direct Request)

They were using the following browser:
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; YPC 3.0.0; .NET CLR 1.0.3705)

First, notice that the requested URL contains literal spaces rather than the %20 that the original file's URL has (and the original file name looks nothing like this).
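To make the space versus %20 point concrete, here is a minimal Python sketch of how a path with spaces would normally be percent-encoded, and how to compare a requested path against the real one regardless of encoding. The helper names encode_path and same_resource are made up for illustration, and the path is just the placeholder from the 404 report above.

```python
from urllib.parse import quote, unquote


def encode_path(path: str) -> str:
    """Percent-encode a URL path, leaving the '/' separators untouched."""
    return quote(path, safe="/")


def same_resource(requested: str, real: str) -> bool:
    """Compare two paths after decoding any percent-escapes in either one."""
    return unquote(requested) == unquote(real)


# Placeholder path taken from the 404 report (hypothetical, not a real file).
raw = "/Folder name/'name of client'"
print(encode_path(raw))
# prints /Folder%20name/%27name%20of%20client%27
```

A well-behaved client sends the encoded form, so a request that arrives with raw spaces suggests the URL was typed or pasted by hand, or built by a tool that skips encoding.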

Any ideas?

2:41 pm on Jan 6, 2006 (gmt 0)

Junior Member

10+ Year Member

joined:Jan 13, 2004
posts:67
votes: 0


A visitor to your site just got a 404 error.

They were trying to access http://www.example.com/services_custom_reporting.htm from (Direct Request)

They were using the following browser:
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
---------------------------------------------
This page has never existed, and I can't find a link to it anywhere on the web by searching. Does anyone know how I can find out where Googlebot is getting this link from?
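One thing that might help is going through the raw access log instead of relying on the 404 notification mails. Below is a rough Python sketch that collects the referrers recorded for every 404. It assumes the server writes the Apache/NCSA "combined" log format, and the function name referrers_for_404s and the sample lines are invented for illustration. Crawlers such as Googlebot typically send no referrer, so this will not show where the bot found the link, but a human visitor who follows the same bad link usually does leave one.

```python
import re
from collections import Counter, defaultdict

# Assumption: Apache/NCSA "combined" log format, e.g.
# 1.2.3.4 - - [06/Jan/2006:14:41:00 +0000] "GET /x.htm HTTP/1.1" 404 512 "referer" "agent"
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>.*?) (?P<proto>[^" ]+)" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def referrers_for_404s(lines):
    """Map each path that returned 404 to a Counter of the referrers seen for it."""
    hits = defaultdict(Counter)
    for line in lines:
        m = LINE_RE.match(line)
        if not m or m.group("status") != "404":
            continue  # skip unparseable lines and anything that is not a 404
        referer = m.group("referer")
        if referer in ("", "-"):
            referer = "(direct request)"
        hits[m.group("path")][referer] += 1
    return hits


if __name__ == "__main__":
    # Invented sample lines; in practice, pass an open log file instead.
    sample = [
        '1.2.3.4 - - [06/Jan/2006:14:41:00 +0000] "GET /services_custom_reporting.htm HTTP/1.1" 404 512 "-" "Googlebot/2.1"',
        '5.6.7.8 - - [06/Jan/2006:14:45:00 +0000] "GET /services_custom_reporting.htm HTTP/1.1" 404 512 "http://other.example.org/links.html" "Mozilla/4.0"',
    ]
    for path, refs in referrers_for_404s(sample).items():
        print(path, dict(refs))
```

Any referrer other than "(direct request)" that shows up for the missing page is a candidate for the page carrying the broken link.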

 
