

Sitemaps, Meta Data, and robots.txt Forum

    
Is a robots.txt file necessary?
if not trying to block bots?
Jakes Redding




msg:3316369
 1:11 am on Apr 20, 2007 (gmt 0)

Concerning the robots.txt file: this is by far the #1 error page in my stats, I guess because the file doesn't exist. I haven't decided to block any bots yet, so I didn't create it. Should I have this file nevertheless? If so, is this the correct code for not blocking any bots?

User-agent: *
Disallow:

Thanks for any advice.

 

jdMorgan




msg:3316383
 1:32 am on Apr 20, 2007 (gmt 0)

Yes, you should have this standard resource on your site -- and for the precise reason you mention. Robots will request it, and if it's not there, their requests may pollute your access and error logs to the point of marginal usability. The error log file should have only real, unexpected errors in it, not be filled with errors that are easy to prevent.

These errors may also skew the results of your 'stats' program, if you use one.
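
To see how much of the error count comes from missing standard resources, a tally along these lines may help. It is only a rough sketch: it assumes the server writes Common Log Format, and the function name `count_404s` and the sample entries are made up for illustration.

```python
import re
from collections import Counter

# Pulls the request path and status code out of a Common Log Format line
# (assumed format; adjust the pattern if your server logs differently).
LOG_LINE = re.compile(r'"[A-Z]+ (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')

def count_404s(lines):
    """Return a Counter mapping each requested path to its number of 404 responses."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("status") == "404":
            counts[match.group("path")] += 1
    return counts

# Made-up sample entries, for illustration only.
sample_log = [
    '192.0.2.1 - - [20/Apr/2007:01:11:00 +0000] "GET /robots.txt HTTP/1.1" 404 209',
    '192.0.2.2 - - [20/Apr/2007:01:12:00 +0000] "GET /index.html HTTP/1.1" 200 5120',
    '192.0.2.3 - - [20/Apr/2007:01:13:00 +0000] "GET /robots.txt HTTP/1.0" 404 209',
    '192.0.2.4 - - [20/Apr/2007:01:14:00 +0000] "GET /favicon.ico HTTP/1.1" 404 209',
]

# Paths sorted from most to fewest 404 responses.
top_missing = count_404s(sample_log).most_common()
print(top_missing)
```

If robots.txt and favicon.ico dominate a list like this, providing those files should remove most of the noise.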

favicon.ico, w3c/p3p.xml, and labels.rdf are three more standard resources you might consider providing.

The code you posted looks fine. Put a blank line after the "Disallow:" line for maximum compatibility. (Every "record" in a robots.txt file should be followed by a blank line, and there was one (European?) 'bot a few years ago that insisted on its presence, even for the last record.)
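
One way to check that the two-line file really blocks nothing is to run it through a parser. The sketch below uses Python's standard `urllib.robotparser`. The helper name `allows`, the agent name "ExampleBot", and the example.com URL are assumptions for illustration, and real robots may interpret the file differently from this parser.

```python
from urllib.robotparser import RobotFileParser

def allows(robots_txt, url, agent="ExampleBot"):
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

# The file from the question, with the trailing blank line after the record.
ALLOW_ALL = "User-agent: *\nDisallow:\n\n"

# For contrast: a single slash after Disallow shuts out every robot.
BLOCK_ALL = "User-agent: *\nDisallow: /\n\n"

allow_result = allows(ALLOW_ALL, "http://www.example.com/any/page.html")
block_result = allows(BLOCK_ALL, "http://www.example.com/any/page.html")
print(allow_result, block_result)
```

An empty "Disallow:" value means nothing is disallowed, so the first check should come back allowed and the second blocked.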

There have also been unconfirmed reports that having a robots.txt file increases the number of pages spidered by MSNbot on your site. So far, not enough data has been collected for me to conclude that this is true.

Jim

Jakes Redding




msg:3316529
 5:52 am on Apr 20, 2007 (gmt 0)

Hello Jim,

Thanks for your advice. Yes, there have been a few favicon.ico errors too. Don't know why that is, because favicon.ico should be linked on all my pages and resides in the root directory. Maybe I'm missing one or two. I'll check.

One other question if you will permit. A large number of errors (over 100 each in last two months) are requests for "mysite.com/index.htm/" and "mysite.com/defaultsite". Surely an error 404 page is served in these instances because the pages don't exist. But should I direct the bots not to look for them?

WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved