Forum Moderators: goodroi
Furthermore, "Allow" is only supported by Google and Yahoo!, and possibly a few other search engines, and so cannot be used with "User-agent: *".
Google is promoting their Webmaster Tools, and so has made somewhat of a fragmented mess of their robots.txt information pages, but this page [google.com] describes their proprietary "Allow" directive.
The similar page at Yahoo! is here [help.yahoo.com], showing that they support that directive as well.
MSN/Windows Live Search does not support "Allow" -- or at least, it's not mentioned on their robots.txt page here [search.msn.com].
Be sure to check the robots.txt-related pages at all of the search engines that you are concerned with -- it's probably fair to say that no two of them support exactly the same feature set. If you assume that they all handle a particular directive the same way and get it wrong, you could make a mess of your search rankings.
Jim
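One way to act on the advice above is to keep "Allow" inside a record for a crawler that documents support for it, and use only "Disallow" in the catch-all record. The sketch below checks such a file with Python's standard-library urllib.robotparser. The file contents and the "SomeOtherBot" name are made up for illustration, and Python's parser is only one implementation, so this shows how the records are selected, not what any particular search engine will actually do.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: "Allow" appears only in the record for a crawler
# that documents support for it; the catch-all record uses "Disallow" alone.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /index.html
Disallow: /

User-agent: *
Disallow: /
"""


def build_parser(text):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# The Googlebot record applies to Googlebot; everyone else falls back to "*".
googlebot_ok = parser.can_fetch("Googlebot", "/index.html")
other_ok = parser.can_fetch("SomeOtherBot", "/index.html")  # hypothetical bot name

print("Googlebot may fetch /index.html:", googlebot_ok)
print("SomeOtherBot may fetch /index.html:", other_ok)
```

A crawler that has its own record ignores the "*" record entirely, so anything meant for every bot has to be repeated in each named record.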
Everything not explicitly disallowed is considered fair game to retrieve.
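If you want to allow only index.html for all bots (this only works for crawlers that support "Allow"; the path needs a leading slash, and "Allow" goes first because some parsers stop at the first matching rule):
User-agent: *
Allow: /index.html
Disallow: /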
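Rules like these can be sanity-checked locally before they go live. The sketch below is a minimal test using Python's standard-library urllib.robotparser; the helper name and the "ExampleBot" agent are hypothetical. It compares an "Allow" path written without a leading slash against one written with it. Keep in mind this only shows how Python's parser reads the rules, and real crawlers may resolve Allow/Disallow conflicts differently.

```python
from urllib.robotparser import RobotFileParser


def can_fetch(rules, path, agent="ExampleBot"):
    """Return whether `agent` may fetch `path` under a "User-agent: *" record.

    `rules` is a list of Allow/Disallow lines; "ExampleBot" is a made-up name.
    """
    parser = RobotFileParser()
    parser.parse(["User-agent: *", *rules])
    return parser.can_fetch(agent, path)


# Allow path without a leading slash: it cannot match "/index.html".
no_slash = ["Disallow: /", "Allow: index.html"]

# Allow path with a leading slash, listed before the blanket Disallow.
allow_first = ["Allow: /index.html", "Disallow: /"]

index_no_slash = can_fetch(no_slash, "/index.html")
index_allow_first = can_fetch(allow_first, "/index.html")
other_allow_first = can_fetch(allow_first, "/other.html")

print("no leading slash, /index.html:", index_no_slash)
print("Allow first, /index.html:", index_allow_first)
print("Allow first, /other.html:", other_allow_first)
```

Listing the "Allow" line first is the conservative choice: it gives the intended result both for parsers that stop at the first matching rule and for those that prefer the most specific match.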
here are a couple of important references.
web server admin guide:
[robotstxt.org...]
the protocol:
[robotstxt.org...]