I've lurked for a while and tried to find the answer to this question. If you want everything in your site indexed, does it even matter whether you have a robots.txt file? I realize that you'll get a 404 error each time a spider requests it, but do the spiders care?
Thanks for helping out a webmaster-wannabe!
Larry
Some bots choose to ignore robots.txt altogether, though. When this happens, the usual fix is to bar the bot from your server totally by using your .htaccess file. This is the internet equivalent of saying "you ain't getting nothing here".
Hope this helps.
... it is also widely used for keeping the bad bots out altogether.
Only if the bot requests the robots.txt file, and not all of them do.
The robots.txt file is good for those bots that do request it and in those cases you can tell them which directories to stay out of - like an image directory or your cgi directory.
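To make that concrete, here is a rough sketch of how a well-behaved bot might read such a file, using Python's standard urllib.robotparser module. The rules, the example.com URLs, the "ExampleBot" name and the allowed() helper are all made up for illustration; a real bot may apply the rules differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ask every compliant bot to stay out of
# the image directory and the cgi directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /images/
Disallow: /cgi-bin/
"""


def allowed(url, agent="ExampleBot"):
    """Return True if a compliant bot may fetch url under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


print(allowed("http://www.example.com/index.html"))       # True
print(allowed("http://www.example.com/images/logo.gif"))  # False
print(allowed("http://www.example.com/cgi-bin/form.pl"))  # False
```

Keep in mind this only models a bot that bothers to check. A bot that never requests robots.txt never runs anything like this.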
>> but if you want to keep *most* spiders from accessing certain files - it's best to have one
If you really want to protect certain files/directories or keep rogue bots out then you'll need to use something a bit stronger like .htaccess (Apache Server). Do a search here and you'll find plenty of reading to get you started.
And no, I don't believe the spiders will penalize you for not having a robots.txt file. While I'm not completely sure, I can't think of any way having one could improve the spidering of your website, other than telling the spider what not to include.
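As a small illustration of the "no rules means nothing is off limits" point, here is a sketch with Python's standard urllib.robotparser. The allowed_without_rules() helper, the "ExampleBot" name and the example.com URL are hypothetical, and this shows only how that one parser behaves, not what any particular search engine does.

```python
from urllib.robotparser import RobotFileParser


def allowed_without_rules(url, agent="ExampleBot"):
    """Check a URL against an empty rule set (no robots.txt directives)."""
    parser = RobotFileParser()
    parser.parse([])  # no User-agent or Disallow lines at all
    return parser.can_fetch(agent, url)


print(allowed_without_rules("http://www.example.com/any/page.html"))  # True
```

With nothing disallowed, every path comes back as fetchable, which is the outcome you want if the goal is to have the whole site indexed.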