hybrid6studios - 8:45 am on Jan 11, 2007 (gmt 0)
Many bad bots will actually go to robots.txt to see what you are trying to hide or protect, and then go straight to the restricted content. For this reason I use a dynamic robots.txt page. Through .htaccess and mod_rewrite, every time a client requests robots.txt, my server invisibly serves a PHP page instead (it looks the same to the viewer), and that page detects which bot or browser is asking. For search engine spiders I serve the real robots.txt content so the site gets indexed properly, and for all others I simply disallow everything.
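To make the idea concrete, here is a minimal sketch of the decision logic only. The setup described above uses a PHP page behind mod_rewrite; this is the same logic written in Python for illustration, not the actual script. The names `TRUSTED_CRAWLERS`, `REAL_RULES`, `DENY_ALL` and `robots_txt_for`, the crawler substrings, and the `/private/` path are all assumptions made up for the example.

```python
# Hypothetical allow-list of User-Agent substrings for crawlers you trust.
TRUSTED_CRAWLERS = ("googlebot", "slurp", "msnbot")

# Hypothetical "real" rules, shown only to trusted crawlers.
REAL_RULES = "User-agent: *\nDisallow: /private/\n"

# What everyone else sees: no paths revealed, everything disallowed.
DENY_ALL = "User-agent: *\nDisallow: /\n"


def robots_txt_for(user_agent):
    """Return the robots.txt body to serve for a given User-Agent header."""
    ua = (user_agent or "").lower()  # a missing header is treated as untrusted
    if any(token in ua for token in TRUSTED_CRAWLERS):
        return REAL_RULES
    return DENY_ALL


if __name__ == "__main__":
    # A request claiming to be a search engine spider gets the real rules.
    print(robots_txt_for("Mozilla/5.0 (compatible; Googlebot/2.1)"))
    # An ordinary browser (or unknown bot) gets the deny-all version.
    print(robots_txt_for("Mozilla/5.0 (Windows NT 10.0) Firefox/120.0"))
```

Keep in mind that the User-Agent header is supplied by the client, so a bot that pretends to be a search engine spider would still see the real rules with a check like this one.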
Be aware that not every spider obeys robots.txt. There are some nasty bots out there, including but not limited to: 1) spambots that harvest email addresses from your contact forms or guestbook pages; 2) scrapers that scrape your site for free content to use in their spammy doorway pages; 3) downloader programs that suck up your bandwidth by downloading your entire site; 4) programs that crawl the web looking for copyright infringements so their owners can sue people; 5) viruses and worms; 6) data mining programs; 7) hackers; 8) DDoS attacks, etc.