Forum Moderators: phranque

robots.txt

logs show the SEs can't see it - but it's there

         

lorax

1:22 am on Oct 2, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I've seen a significant number of errors lately that indicate Yahoo! and Inktomi (among others) cannot find my robots.txt file.

I can see it fine. I can call it up in my browser fine. I've tested it with a robots.txt validator and it passes with flying colors.

I even checked my .htaccess file to see if I've banned the IPs or user agents of the SEs, and I haven't. I can't for the life of me think why I'm seeing these errors in my logs, but they're there.

Does anyone have an idea why this might be happening?

leadegroot

7:00 am on Oct 2, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



What sort of errors do you see?

keyplyr

7:04 am on Oct 2, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



One reason might be that these bots are following incoming links to both www.domain.com and domain.com. If you think this might be the case and need help fixing it, try a WebmasterWorld search for the mod_rewrite code that will compensate for that, most likely in the Apache forum.

<added>
If the site you are speaking of is the one in the email address in your profile, I was able to access both www.domain.com/robots.txt and domain.com/robots.txt, so there must be another reason.

lorax

11:26 am on Oct 2, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Thanks for the replies.

>> What sort of errors do you see?

File not found.

>> domain.com vs www.domain.com

Not the site in my profile but I can get to the file through both.

leadegroot

11:55 am on Oct 2, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



So you see a 404 in your logs?
Try checking the headers. Googling for "Check Server Headers Tool" will return tools that will do it.
It's really unlikely but possible that you are returning the content and a 404, so it looks fine to you.

But if not, hopefully it is transient - I often see weird things in my logs. 404s and 301s on pages that I know are not returning that code. It doesn't last.
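
For anyone who would rather script the header check than use an online tool, here is a minimal Python sketch. It reports the status code and body size the server actually sends, which is what would expose the "content plus a 404" case. The function name `check_status`, the `opener` parameter, and the default User-Agent string are made up for illustration and are not from any tool mentioned in this thread.

```python
import urllib.error
import urllib.request


def check_status(url, user_agent="header-check-sketch/0.1",
                 opener=urllib.request.urlopen):
    """Return (status_code, body_length) for url without raising on 4xx/5xx.

    `opener` is a hypothetical hook so the function can be exercised
    without touching the network; by default it is urllib's urlopen.
    """
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with opener(request, timeout=10) as response:
            return response.status, len(response.read())
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses, but the error object still
        # carries the status code and whatever body the server sent.
        return err.code, len(err.read())
```

Calling it once for `http://www.domain.com/robots.txt` and once for `http://domain.com/robots.txt` covers both hostnames. Passing a different `user_agent` shows whether the server answers differently depending on the client, although it cannot reproduce a block based on the crawler's IP address.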

lorax

12:16 pm on Oct 2, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Tried that and it returns fine. I checked my logs this morning, and out of the 27 times that robots.txt was requested, 7 returned a 404. It just doesn't make sense. I've put in a trouble ticket with the host just to see if they might have some insight.
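
A tally like the one above can be broken down by status code and user agent to see whether the 404s cluster on particular crawlers. The following is a rough Python sketch that assumes the access log uses the Apache "combined" format; the function name `tally_robots` and the regular expression are illustrative and would need adjusting for other log layouts.

```python
import re
from collections import Counter

# Assumes Apache "combined" log lines: request, status, size, then an
# optional quoted referer and quoted user agent.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
    r'(?: "[^"]*" "(?P<agent>[^"]*)")?'
)


def tally_robots(lines):
    """Count robots.txt requests, keyed by (status, user agent)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that do not look like GET/HEAD requests
        # Ignore any query string when comparing the requested path.
        if match.group("path").split("?")[0] == "/robots.txt":
            counts[(match.group("status"), match.group("agent") or "-")] += 1
    return counts
```

Feeding it an open log file, for example `tally_robots(open("access.log"))` with a hypothetical file name, gives a counter that can be printed or sorted. If the 404s all fall within a narrow time window or on one virtual host, that is worth passing along to the hosting company.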