I've lurked for a while and tried to find the answer to this question. If you want everything in your site indexed, does it even matter if you have a robots.txt file? I realize that you'll get a 404 error each time one is requested, but do the spiders care?
Robots.txt is a good tool for controlling what gets spidered and what doesn't. It is also widely used for keeping bad bots out altogether. It is often necessary to ban a bot that abuses your server by hogging bandwidth or fetching things you would rather it didn't.
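For example, a minimal robots.txt that lets everything be crawled while turning away one misbehaving bot might look like this ("BadBot" is just a placeholder for whatever name the bot announces in its User-agent line):

    # placeholder name - substitute the bot's actual User-agent string
    User-agent: BadBot
    Disallow: /

    # everyone else may crawl the whole site
    User-agent: *
    Disallow:

Keep in mind this only works for bots polite enough to read the file in the first place.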
Some bots choose to ignore robots.txt altogether, and when that happens it is usually a case of barring the bot from your server totally by using your .htaccess file. This is the internet equivalent of saying "you ain't getting nothing here".
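As a rough sketch, the .htaccess approach can look something like this on Apache (2.2-style directives; "BadBot" again stands in for whatever string the offender sends as its User-Agent):

    # flag any request whose User-Agent contains "BadBot" (case-insensitive)
    SetEnvIfNoCase User-Agent "BadBot" bad_bot
    # let everyone in except flagged requests
    Order Allow,Deny
    Allow from all
    Deny from env=bad_bot

Unlike robots.txt, which is only a request, this is enforced by the server itself, so even a bot that ignores robots.txt gets turned away with a 403.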
Thanks everyone for all the info. So I guess the final answer is that the spiders don't penalize you for NOT having a robots.txt, but if you want to keep *most* spiders from accessing certain files, it's best to have one (and get rid of the 404s).
>> but if you want to keep *most* spiders from accessing certain files - it's best to have one
If you really want to protect certain files/directories or keep rogue bots out, then you'll need to use something a bit stronger like .htaccess (on an Apache server). Do a search here and you'll find plenty of reading to get you started.
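As a quick example, putting something like this in a directory's .htaccess denies all web access to it, rogue bots included (again assuming Apache 2.2-style directives and that your host allows overrides):

    # refuse all HTTP requests to this directory
    Order Allow,Deny
    Deny from all

For files that legitimate visitors still need to reach, HTTP basic authentication is the usual alternative.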
And yes, I don't believe the spiders will penalize you for not having a robots.txt file. While I'm not completely sure, I can't think of any way having one could improve the spidering of your website other than telling the spider what not to include.