
Sitemaps, Meta Data, and robots.txt Forum

    
Is robots.txt necessary?
Is the robots.txt file absolutely necessary?
camchoice

5+ Year Member



 
Msg#: 945 posted 7:21 am on Jul 6, 2006 (gmt 0)

I found many errors in the crawl check from Google (you can run this via Google Sitemaps).

I had a 404 redirect back to my main page, and because I did not have a robots.txt file, Googlebot received the main index page in place of robots.txt (the 404 redirect serves the index page for any missing URL). Of course, Googlebot reports this as an error.

I have now put a totally empty robots.txt file in the root. But is it correct to leave it totally empty, or should I add a rule that all pages may be crawled?
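The symptom described above (an HTML index page served where robots.txt should be) can be spotted with a small check. This is only a sketch: the function name `classify_robots_response` and its labels are made up for illustration, and the rules it applies are rough heuristics, not what any particular crawler does.

```python
def classify_robots_response(status, content_type, body):
    """Roughly classify what a server returned for /robots.txt.

    Hypothetical helper: returns one of "missing", "error",
    "html-instead" or "ok".
    """
    if status == 404:
        # A plain 404 is generally read by crawlers as "no restrictions".
        return "missing"
    if status != 200:
        return "error"
    # A custom 404 handler that redirects to the home page ends up
    # answering 200 with HTML, which is not a valid robots.txt.
    starts_like_html = body.lstrip().lower().startswith(("<!doctype", "<html"))
    if "html" in content_type.lower() or starts_like_html:
        return "html-instead"
    return "ok"


if __name__ == "__main__":
    print(classify_robots_response(200, "text/html", "<html><body>Home</body></html>"))
    print(classify_robots_response(200, "text/plain", "User-agent: *\nDisallow:\n"))
```

The three arguments could come from any HTTP client, for example the status, `Content-Type` header and decoded body of a `urllib.request.urlopen` response for the site's `/robots.txt` URL.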

 

Teacake23

5+ Year Member



 
Msg#: 945 posted 1:18 pm on Jul 14, 2006 (gmt 0)

You could simply add:

User-agent: *
Disallow:

This allows all robots full access. Googlebot, Slurp and MSN have no problem with this.
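For anyone who wants to try this locally, here is a minimal sketch using Python's standard `urllib.robotparser`. It compares the two-line allow-all file with a totally empty file. The helper name `is_allowed` and the example.com URL are placeholders, and this parser is only a stand-in for how a real crawler behaves.

```python
from urllib.robotparser import RobotFileParser

# The allow-all rules quoted above: an empty Disallow value blocks nothing.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
# A totally empty robots.txt, as in the original question.
EMPTY = ""


def is_allowed(rules, user_agent, url):
    """Parse robots.txt text and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "http://www.example.com/some/page.html"  # placeholder URL
    for label, rules in (("allow-all", ALLOW_ALL), ("empty", EMPTY)):
        print(label, is_allowed(rules, "Googlebot", url))
```

Under this parser both files permit every URL, so treat the result as a local sanity check and not as a statement about any specific search engine.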

MisterT

5+ Year Member



 
Msg#: 945 posted 12:31 am on Jul 20, 2006 (gmt 0)


Doesn't having no robots.txt at all give the same result as having this:

User-agent: *
Disallow:

Jim Catanich

5+ Year Member



 
Msg#: 945 posted 9:39 pm on Jul 26, 2006 (gmt 0)

Although the robots.txt file is not needed, it is a good idea to use it. My site's robots.txt file is:

# Robots.txt file created by 7/21/06
# For domain: [catanich.com...]

# All other robots will spider the domain
User-agent: *
Disallow: */_vti_cnf/ # directories created by the dev tool
Disallow: /_common/ # common source code
Disallow: /_holdit/ # a junk folder
Disallow: /_private/
Disallow: /_ScriptLibrary/ # common script folder
Disallow: /rLinks/ # reciprocal link folder

But when I FTP my site up to the server, the "_vti_cnf/" directories are sent too. This means that the search engines will index these directories as well (thank you, Python Site Map) and generate a great many Google errors.

In addition, I don't want my source code directories to be indexed.

And just for information purposes, by adding "Disallow: /rLinks/" to the robots.txt, all my reciprocal links are blocked from being indexed. So no PR bleed.

So as you can see, there are real reasons to use a robots.txt file.
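A quick way to see which URLs rules like these block is to run them through Python's standard `urllib.robotparser`. This is a sketch under some assumptions: the `*/_vti_cnf/` line is left out because wildcard handling differs between parsers, the file names and the www.example.com host are invented for the example, and `blocked_paths` is a hypothetical helper.

```python
from urllib.robotparser import RobotFileParser

# The plain-prefix rules from the file quoted above (wildcard line omitted).
RULES = """\
User-agent: *
Disallow: /_common/
Disallow: /_holdit/
Disallow: /_private/
Disallow: /_ScriptLibrary/
Disallow: /rLinks/
"""


def blocked_paths(rules, user_agent, paths, base="http://www.example.com"):
    """Return the paths that user_agent may NOT fetch under the given rules."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, base + p)]


if __name__ == "__main__":
    # Invented sample paths, used only to exercise the rules.
    sample = ["/index.html", "/rLinks/partners.html", "/_private/notes.txt"]
    print(blocked_paths(RULES, "Googlebot", sample))
```

Matching here is by simple path prefix, so `/rLinks/partners.html` is blocked while `/index.html` is not.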

Jim Catanich

AjiNIMC

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 945 posted 6:57 pm on Aug 6, 2006 (gmt 0)

I have seen websites doing well even without a robots.txt.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved