Forum Moderators: goodroi
I had 404 errors redirecting back to my main page, and because I did not have a robots.txt file, Googlebot took the main index page as the robots.txt file, since that is what the 404 redirect served up. Of course this returned an error from Googlebot.
I have now put a completely empty robots.txt file in the root. But is it correct to leave it totally empty, or should I add a rule saying that all pages may be crawled?
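For what it's worth, an empty robots.txt and an explicit allow-all file are treated the same way by crawlers: nothing is disallowed. If you want the intent to be obvious, the allow-all version is just a sketch like this (nothing in it is specific to any site):

User-agent: *
Disallow:

An empty value after Disallow: means "block nothing", so every page may be crawled.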
# Robots.txt file created 7/21/06
# For domain: [catanich.com...]
# All robots may spider the domain except for the directories below
User-agent: *
Disallow: */_vti_cnf/ # directories created by the dev tool
Disallow: /_common/ # common source code
Disallow: /_holdit/ # a junk folder
Disallow: /_private/
Disallow: /_ScriptLibrary/ # common script folder
Disallow: /rLinks/ # reciprocal link folder
But when I FTP my site up to the server, the "_vti_cnf/" directories get uploaded too. This means that the search engines will index those directories as well (thank you, Python Site Map) and generate a great many Google errors.
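One caveat on the "*/_vti_cnf/" line: the original robots.txt standard only does simple prefix matching from the root, so a leading * is ignored by crawlers that follow it strictly. Googlebot does understand * wildcards, so a sketch that covers both cases (assuming the _vti_cnf folders appear at the root as well as inside subdirectories) could look like this:

User-agent: *
Disallow: /_vti_cnf/ # root-level _vti_cnf, plain prefix match every crawler understands
Disallow: /*/_vti_cnf/ # nested _vti_cnf folders; the * wildcard is supported by Googlebot

Crawlers that don't support wildcards will simply treat the second line as a literal prefix that matches nothing, so it does no harm there.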
In addition, I don't want my source code directories to be indexed.
And just for information purposes, by adding "Disallow: /rLinks/" to the robots.txt, all my reciprocal links are blocked from being indexed. So no PR bleed.
So as you can see, there are real reasons to use the robots.txt file.
Jim Catanich