Moderators: Receptional & mademetop

Website Analytics - Tracking and Logging Forum

    
Bots Picking up weird files
Google, MSN mainly, but Slurp sometimes too
Kate82




msg:889154
 6:49 pm on Dec 27, 2005 (gmt 0)

Googlebot and MSNbot have both been crawling my site looking for files that do not exist. I had hoped they would give up after some time, but they haven't. I set up a redirect for one file, but it doesn't seem to be taking effect. I wish I had more to say, but that is all I have for now.

I'll update with better examples.

 

g1smd




msg:889155
 7:08 pm on Dec 27, 2005 (gmt 0)

Have you seen any with almost random-letter filenames, or any with 404probe as part of the URL?

Span




msg:889156
 7:22 pm on Dec 27, 2005 (gmt 0)

"Google, MSN mainly, but Slurp sometimes too"

That looks as if there are links pointing to those files. And as long as those links exist, bots won't give up asking.

Kate82




msg:889157
 7:27 pm on Dec 27, 2005 (gmt 0)

Yeah, I am thinking there are links out there, but I can't find them. Any tips on how to locate these links? I promise as soon as I get another weird one I will post it here.

Span




msg:889158
 7:42 pm on Dec 27, 2005 (gmt 0)

You could try a web search for your domain together with the odd filename, for example: www.example.com strangefilename.html

Kate82




msg:889159
 2:26 pm on Dec 28, 2005 (gmt 0)

Perfect, I found one BIG reason for my problems. Some research team in Mannheim was using my robots.txt file for research purposes without asking! They put the links online ... but incorrectly! Geez!

Okay ... but then I have things like this ...

They were trying to access http://www.example.com/Folder name/'name of client' from (Direct Request)

They were using the following browser:
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; YPC 3.0.0; .NET CLR 1.0.3705)

Notice that the requested URL contains literal spaces rather than the %20 that the original file's URL has (and the original file name doesn't look anything like this anyway).

Any ideas?
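
The space versus %20 difference above can be illustrated with a minimal Python sketch. It assumes the path shown in the 404 report is the decoded form of the request; some reporting tools may decode URLs before displaying them, so literal spaces in a report do not prove the client sent them that way. The helper name same_resource is hypothetical and only for illustration.

```python
from urllib.parse import quote, unquote

# Path modeled on the placeholder request from the 404 report above.
raw_path = "/Folder name/'name of client'"

# Percent-encode the path the way it would normally appear in a URL.
# quote() leaves "/" alone by default and encodes spaces and apostrophes.
encoded = quote(raw_path)
print(encoded)  # /Folder%20name/%27name%20of%20client%27


def same_resource(path_a: str, path_b: str) -> bool:
    """Hypothetical helper: treat two paths as equal if they decode to the same string."""
    return unquote(path_a) == unquote(path_b)


# A logged path with spaces and a linked path with %20 can refer to the same URL.
print(same_resource(raw_path, encoded))  # True
```

Comparing the decoded forms is one way to check whether a strange-looking request is really just a differently encoded copy of a URL that does exist.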

Kate82




msg:889160
 2:41 pm on Jan 6, 2006 (gmt 0)

A visitor to your site just got a 404 error.

They were trying to access http://www.example.com/services_custom_reporting.htm from (Direct Request)

They were using the following browser:
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
---------------------------------------------
This page has never existed, and I can't find a link to it anywhere by searching the web. Does anyone know how I can find out where Googlebot is getting this link from?
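
One place to look is the server's raw access log, which may record a referrer for each failed request. The following is a rough Python sketch, assuming the server writes Apache-style "combined" log lines. The sample lines, IP addresses, the /old-page.html path and the research.example.org referrer are all invented for illustration. Crawlers often send no referrer at all (logged as "-"), so this may only reveal link sources for requests made by ordinary browsers.

```python
import re
from collections import Counter

# Pattern for one line of Apache-style combined log format (an assumption
# about the server setup; adjust it if your log format differs).
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>.*?) HTTP/[\d.]+" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def find_404s(lines, agent_contains=None):
    """Count 404 responses per (path, referrer), optionally for one user agent."""
    hits = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        if match.group("status") != "404":
            continue
        agent = match.group("agent").lower()
        if agent_contains and agent_contains.lower() not in agent:
            continue
        hits[(match.group("path"), match.group("referrer"))] += 1
    return hits


# Invented sample data using documentation-only IP addresses.
SAMPLE_LINES = [
    '192.0.2.10 - - [06/Jan/2006:14:41:00 +0000] "GET /services_custom_reporting.htm HTTP/1.1" 404 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.10 - - [06/Jan/2006:15:02:13 +0000] "GET /services_custom_reporting.htm HTTP/1.1" 404 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.20 - - [06/Jan/2006:15:05:40 +0000] "GET /index.html HTTP/1.1" 200 2048 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"',
    '192.0.2.30 - - [06/Jan/2006:15:09:02 +0000] "GET /old-page.html HTTP/1.1" 404 512 "http://research.example.org/links.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
]

for (path, referrer), count in find_404s(SAMPLE_LINES).most_common():
    print(count, path, referrer)
```

To run it against a real log, pass an open file object instead of SAMPLE_LINES, and use agent_contains="googlebot" to narrow the results to one crawler.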

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved