
Forum Moderators: goodroi


ban all except index.htm



7:46 pm on Feb 9, 2005 (gmt 0)

Inactive Member
Account Expired


Hi, I was wondering whether the best way to ban robots from
spidering an entire site except the index.html page would be to put
META NAME="robots" CONTENT="noindex, nofollow"
on the index.html page, or if there is a better way.

3:08 pm on Feb 12, 2005 (gmt 0)

New User

10+ Year Member

joined:Aug 26, 2004
votes: 0

That would tell the robots not to index the index.html page and not to follow any links from it, so in theory they wouldn't crawl anything via that page. However, if there are external links pointing to other pages within the site, that meta tag would only exclude the index.html page itself; the other pages could still be reached and indexed.

The correct way would be to exclude everything else in the robots.txt file, add the "noindex" and "nofollow" meta tags to all the other pages in the site (as a safety measure in case robots.txt is ignored), and put just a "nofollow" tag on the index.html page.
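The meta tags described above would look something like this (a sketch; each tag goes in the page's head section):

```html
<!-- On every page EXCEPT index.html: keep it out of the index
     and don't follow its links -->
<meta name="robots" content="noindex, nofollow">

<!-- On index.html only: let it be indexed, but don't follow links -->
<meta name="robots" content="nofollow">
```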

Sample of robots.txt:

User-agent: *
Disallow: /images/
Disallow: /cgi-bin/
Disallow: /each-and-every-directory
Disallow: /page-name.html
Disallow: /another-page-name.html
Disallow: /still-another-page-name.html
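One way to sanity-check a robots.txt file like the sample above is Python's standard robotparser module; a minimal sketch using a subset of the sample rules (the page names are the placeholders from the sample, not real pages):

```python
from urllib import robotparser

# A subset of the sample rules above; index.html is allowed by omission.
rules = """\
User-agent: *
Disallow: /images/
Disallow: /cgi-bin/
Disallow: /page-name.html
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("*", "/index.html"))       # True  - no rule matches
print(rp.can_fetch("*", "/page-name.html"))   # False - explicitly disallowed
print(rp.can_fetch("*", "/cgi-bin/test.pl"))  # False - directory disallowed
```

Note that robots.txt rules are prefix matches, so a well-behaved crawler obeying "Disallow: /cgi-bin/" will skip everything under that directory.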

You would need to list each page in the root directory and each directory.
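If the site has many pages, that list can be generated rather than typed by hand; a minimal Python sketch (the function name and the "keep" list are illustrative, not a real tool):

```python
import os

def disallow_lines(docroot, keep=("index.html",)):
    """Emit a robots.txt body with one Disallow line per entry in the
    document root, skipping the pages listed in `keep`."""
    lines = ["User-agent: *"]
    for entry in sorted(os.listdir(docroot)):
        if entry in keep:
            continue
        path = "/" + entry
        # Directories get a trailing slash, matching robots.txt convention.
        if os.path.isdir(os.path.join(docroot, entry)):
            path += "/"
        lines.append("Disallow: " + path)
    return "\n".join(lines)
```

Run against the web root, this prints a block you can paste into robots.txt; remember to regenerate it whenever pages are added.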

Robot Manager is an excellent tool to use if you have trouble writing the robots.txt file by hand (hope that's OK to list that resource). There is also an excellent validator here at SEW > [searchengineworld.com...]

Remember, robots/crawlers/spiders have been known to ignore all of these safeguards. If you have something sensitive that you don't want showing up in a search engine, password-protect the directory and/or page.
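On an Apache server, for example, a directory can be password-protected with HTTP basic authentication (a sketch; the AuthUserFile path is illustrative, and the .htpasswd file must be created separately with the htpasswd utility):

```apache
# .htaccess in the directory you want to protect
AuthType Basic
AuthName "Private area"
AuthUserFile /path/to/.htpasswd
Require valid-user
```

Unlike robots.txt or meta tags, this actually blocks access rather than politely asking crawlers to stay away.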

Hope that helps.