Forum Library, Charter, Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

    
Robots.txt necessary
Is the robots.txt file absolutely necessary?
camchoice




msg:1529504
 7:21 am on Jul 6, 2006 (gmt 0)

I found many errors in the crawl check from Google (you can run this via Google Sitemaps).

My 404 handler redirects back to my main page, and because I did not have a robots.txt file, Googlebot received the main index page when it requested robots.txt. Of course, this returns an error from Googlebot.

I have now put a totally empty robots.txt file in the root. But is it correct to leave it totally empty, or should I add a rule that all pages may be crawled?
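To make the failure concrete: when a request for a missing robots.txt is redirected to the home page, the crawler gets HTML where it expected plain-text rules. Below is a minimal Python sketch of a sanity check for that situation. The function name `looks_like_robots_txt` and its heuristics are illustrative assumptions, not a description of what Googlebot actually does.

```python
def looks_like_robots_txt(status: int, content_type: str, body: str) -> bool:
    """Rough heuristic (an assumption, not a standard): is this response
    plausibly a real robots.txt rather than an HTML page served in its place?"""
    # Anything other than 200 means no usable file was served.
    if status != 200:
        return False
    # A home page served via a 404 redirect is normally labelled as HTML.
    if "html" in content_type.lower():
        return False
    # Even with a wrong content type, an HTML page tends to start with a tag.
    return not body.lstrip().lower().startswith(("<!doctype", "<html"))


# The situation described above: the home page comes back for /robots.txt.
print(looks_like_robots_txt(200, "text/html", "<html><body>Welcome</body></html>"))

# An empty plain-text file, like the one now placed in the root.
print(looks_like_robots_txt(200, "text/plain", ""))
```

Under these assumptions the first call reports a problem and the second does not, which matches the experience that an empty file in the root makes the error go away.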

 

Teacake23




msg:3007117
 1:18 pm on Jul 14, 2006 (gmt 0)

You could simply add:

User-agent: *
Disallow:

This allows all robots full access. Googlebot, Slurp and MSN have no problem with it.
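For anyone who wants to check the rule locally, here is a small sketch using Python's standard-library `urllib.robotparser`. It contrasts the empty `Disallow:` with `Disallow: /`, which differ by a single character but mean the opposite. The helper name `parse_rules` and the example.com URL are placeholders, and this only shows how Python's parser reads the rules, not how any particular search engine does.

```python
from urllib.robotparser import RobotFileParser


def parse_rules(text: str) -> RobotFileParser:
    """Parse robots.txt rules from a string, without any network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


# Empty Disallow value: nothing is blocked.
allow_all = parse_rules("User-agent: *\nDisallow:")

# A single slash: everything is blocked.
block_all = parse_rules("User-agent: *\nDisallow: /")

url = "http://example.com/any/page.html"  # placeholder URL
print(allow_all.can_fetch("Googlebot", url))  # True
print(block_all.can_fetch("Googlebot", url))  # False
```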

MisterT




msg:3015259
 12:31 am on Jul 20, 2006 (gmt 0)


Doesn't having no robots.txt at all give the same result as having this:

User-agent: *
Disallow:
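As far as Python's standard-library parser goes, the two cases do behave the same. The sketch below feeds it an empty rule set and the allow-all rule and compares the answers for a few made-up paths. The paths and the `parse_rules` helper are assumptions for illustration, and the result says nothing definitive about a specific crawler. Note that the error in the first post came from the 404 redirect serving the home page, not from the absence of rules as such.

```python
from urllib.robotparser import RobotFileParser


def parse_rules(text: str) -> RobotFileParser:
    """Parse robots.txt rules from a string, without any network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


no_rules = parse_rules("")  # stands in for an empty robots.txt
allow_all = parse_rules("User-agent: *\nDisallow:")

# Hypothetical paths, chosen only for the comparison.
paths = ["/", "/index.html", "/images/logo.gif", "/private/notes.txt"]

same = all(
    no_rules.can_fetch("*", "http://example.com" + p)
    == allow_all.can_fetch("*", "http://example.com" + p)
    for p in paths
)
print(same)  # True
```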

Jim Catanich




msg:3023230
 9:39 pm on Jul 26, 2006 (gmt 0)

Although the robots.txt file is not needed, it is a good idea to use it. My site's robots.txt file is:

# Robots.txt file created by 7/21/06
# For domain: [catanich.com...]

# All other robots will spider the domain
User-agent: *
Disallow: */_vti_cnf/ #created directories by the dev tool
Disallow: /_common/ #common source code
Disallow: /_holdit/ #a junk folder
Disallow: /_private/ #
Disallow: /_ScriptLibrary/ #common script folder
Disallow: /rLinks/ #reciprocal link folder

But when I FTP my site up to the server, the "_vti_cnf/" directories are sent too. This means that the search engines will index these directories as well (thank you, Python Site Map) and generate a great many Google errors.

In addition, I don't want my source code directories to be indexed.

And just for information purposes, by adding "Disallow: /rLinks/" to the robots.txt, all my reciprocal links are blocked from being indexed. So no PR bleed.

So as you can see, there are real reasons to use the robots.txt file.
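A quick way to confirm which paths such rules block is to run them through Python's standard-library `urllib.robotparser`. The sketch below uses three of the prefix rules from the file above. The `*/_vti_cnf/` line is left out on purpose, because this parser does plain prefix matching and does not interpret `*` inside a path. The `is_allowed` helper, the example.com host and the file names are placeholders.

```python
from urllib.robotparser import RobotFileParser

# A subset of the rules quoted above; inline "#" comments are ignored by the parser.
RULES = """\
User-agent: *
Disallow: /_common/ #common source code
Disallow: /_private/ #
Disallow: /rLinks/ #reciprocal link folder
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def is_allowed(path: str) -> bool:
    """True if a generic robot may fetch the given path (placeholder host)."""
    return parser.can_fetch("*", "http://example.com" + path)


print(is_allowed("/index.html"))            # True: no rule matches
print(is_allowed("/rLinks/partners.html"))  # False: under /rLinks/
print(is_allowed("/_common/header.inc"))    # False: under /_common/
print(is_allowed("/rlinks/partners.html"))  # True: matching is case-sensitive
```

The last line is worth noting: a rule for `/rLinks/` does not cover `/rlinks/`, so the Disallow path has to match the directory name on the server exactly.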

Jim Catanich

AjiNIMC




msg:3036455
 6:57 pm on Aug 6, 2006 (gmt 0)

I have seen websites doing well even without a robots.txt.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved