Forum Moderators: goodroi
I had 404 errors redirecting back to my main page, and because I did not have a robots.txt file, Googlebot took the main index page as the robots.txt file, since that is what the 404 redirect served up. Of course this returned an error from Googlebot.
I have now put a completely empty robots.txt file in the root. But is it correct to leave it totally empty, or should I add a rule saying that all pages may be crawled?
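For what it's worth, an empty robots.txt and an explicit allow-all file are treated the same way by crawlers: nothing is disallowed. If you want the intent to be obvious, the allow-all version is just a sketch like this (nothing in it is specific to any site):

User-agent: *
Disallow:

An empty value after Disallow: means "block nothing", so every page may be crawled.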
# Robots.txt file created 7/21/06
# For domain: [catanich.com...]
# All robots may spider the domain except for the directories below
User-agent: *
Disallow: */_vti_cnf/ # directories created by the dev tool
Disallow: /_common/ # common source code
Disallow: /_holdit/ # a junk folder
Disallow: /_private/
Disallow: /_ScriptLibrary/ # common script folder
Disallow: /rLinks/ # reciprocal link folder
But when I FTP my site up to the server, the "_vti_cnf/" directories get uploaded too. This means that the search engines will index those directories as well (thank you, Python Site Map) and generate a great many Google errors.
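One caveat on the "*/_vti_cnf/" line: the original robots.txt standard only does simple prefix matching from the root, so a leading * is ignored by crawlers that follow it strictly. Googlebot does understand * wildcards, so a sketch that covers both cases (assuming the _vti_cnf folders appear at the root as well as inside subdirectories) could look like this:

User-agent: *
Disallow: /_vti_cnf/ # root-level _vti_cnf, plain prefix match every crawler understands
Disallow: /*/_vti_cnf/ # nested _vti_cnf folders; the * wildcard is supported by Googlebot

Crawlers that don't support wildcards will simply treat the second line as a literal prefix that matches nothing, so it does no harm there.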
In addition, I don't want my source code directories to be indexed.
And just for information purposes, by adding "Disallow: /rLinks/" to the robots.txt, all my reciprocal links are blocked from being indexed. So no PR bleed.
So as you can see, there are real reasons to use the robots.txt file.
Jim Catanich