I got this back from somebody else here: "There was (maybe is) an Apache-derived server returning 403 instead of 404. This was a big enough problem for Google to switch their behaviour and crawl domains where /robots.txt returned 403. That was some time ago, but if it's still the case then 403 for /robots.txt should be no problem."
I apologize for putting this question in different forums, but it was really worrying me that the site might not get listed because of the 403 errors... I uploaded a blank robots.txt file just in case.
... And that response is incorrect, and you should find out why. Just because Google has a work-around doesn't mean that others do. And maybe their work-around will be accidentally removed some day... Unlikely, but do you want to bet your business on it?
I strongly suggest you find out why your server is incorrectly returning a 403 server response and fix it.
If your site does not meet the HTTP specification, it is a disaster waiting to happen. 403 has a meaning and 404 has a meaning, and they are definitely not interchangeable! Also, put up at least a blank robots.txt file and avoid all those 404s in your logs!
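To make the 403-vs-404 distinction concrete, here's a rough sketch (Python; the function and wording are mine, not any particular crawler's code) of how a well-behaved crawler might interpret the status it gets back for /robots.txt:

```python
# Rough sketch of crawler policy on the /robots.txt status code.
# A 404 conventionally means "no rules, crawl freely", while a 403 is
# ambiguous: a conservative crawler may treat the whole site as
# off-limits -- which is exactly why the two codes are not interchangeable.
def robots_policy(status):
    """Return a human-readable crawl decision for a /robots.txt status."""
    if status == 200:
        return "parse the file and obey its rules"
    if status == 404:
        return "no robots.txt present: crawl everything"
    if status == 403:
        return "access forbidden: a cautious crawler may skip the whole site"
    return "unexpected status %d: behaviour varies by crawler" % status
```

So a blank robots.txt (a clean 200) is the safest answer you can give a crawler.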
Thanks again, JD... I did put up a blank robots.txt file early this morning when I started to question what was happening. This is a site I moved off my main servers to another host to satisfy the search engines; it is a virtual host, but I have "tested" many hosting services and never seen this before. The hosting company told me this, referring to www.mydomain.com/ as the "control" folder:
"This is because the folder is control. If you look for the same file in an uncontrolled folder you will get the 404 error. www.mydomain.com/images/robots.txt "
And I completely agree that an apple is an apple and an orange is an orange: the server configuration should return the correct error for each case.
The operating system is: Linux 2.2.19-6.2.11
The web server is: Apache/1.3.9 (Unix) (Red Hat/Linux) PHP/4.3.1
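For what it's worth, a short script like this would let you test the host's "control folder" explanation directly (a sketch using only the Python standard library; www.mydomain.com is just the placeholder used in this thread, substitute your own domain). If the host is right, the first request should come back 403 and the second 404:

```python
# Fetch robots.txt from the document root and from a subdirectory and
# compare the status codes the server actually returns.
import urllib.error
import urllib.request

def status_of(url):
    """Return the HTTP status for a GET of url, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.getcode()
    except urllib.error.HTTPError as err:
        return err.code   # 4xx/5xx responses arrive as exceptions
    except OSError:
        return None       # DNS failure, refused connection, timeout

if __name__ == "__main__":
    # Substitute your own domain here.
    for path in ("/robots.txt", "/images/robots.txt"):
        print(path, status_of("http://www.mydomain.com" + path))
```

If both URLs return the same status, the "control folder" story doesn't hold up and the 403 is coming from somewhere else in the Apache config.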