jwolthuis - 4:36 pm on Dec 31, 2006 (gmt 0)
Besides the "normal" search engines, there are specialty search engines you may want to allow or block. At a minimum, it's good to be aware of them.

Internet Archive Wayback Machine: Takes periodic snapshots of your site, making them available to browse and search years after the pages may have been taken down. To block it, put these lines in your robots.txt file:

User-agent: ia_archiver
Disallow: /

Google Images, Yahoo Image Search, PicSearch: These crawlers look for images on your site, make a best guess as to their content, and make it easy for everyone to view or download them. Depending on whether you think this is good or bad, you may want to block them. Add these lines to your robots.txt file:

User-agent: Googlebot-Image
Disallow: /

User-agent: Yahoo-MMCrawler
Disallow: /

User-agent: psbot
Disallow: /
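
If you want to sanity-check rules like these before uploading them, here is a minimal sketch using Python's standard urllib.robotparser module. The user-agent tokens are the ones given in this post; the helper name blocked_agents and the example.com URL are made up for illustration. It only shows how a parser that follows the robots.txt convention would read the file, not how any particular crawler actually behaves.

```python
from urllib.robotparser import RobotFileParser

# The rules from this post, one User-agent/Disallow pair per crawler.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: Googlebot-Image
Disallow: /

User-agent: Yahoo-MMCrawler
Disallow: /

User-agent: psbot
Disallow: /
"""


def blocked_agents(robots_txt, agents, url="http://www.example.com/images/photo.jpg"):
    """Return the agents that the given robots.txt text forbids from fetching url.

    blocked_agents is a hypothetical helper; the URL is a placeholder.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


# "Googlebot" is included as a control: no rule above names it,
# so it should not show up in the blocked list.
agents = ["ia_archiver", "Googlebot-Image", "Yahoo-MMCrawler", "psbot", "Googlebot"]
print(blocked_agents(ROBOTS_TXT, agents))
```

Only the four specialty crawlers should be reported as blocked. The regular Googlebot is unaffected because none of the records names it and there is no "User-agent: *" record.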