Robots.txt


biggles - 2:40 am on Dec 2, 2002 (gmt 0)


I'm thinking about extending my robots.txt file to exclude mail-harvesting agents because of the amount of email spam I've been getting. I'll also take the opportunity to exclude content harvesters and other bandwidth-stealing agents.

I've looked at the WebmasterWorld robots.txt [webmasterworld.com] for inspiration and guidance, and I'm puzzled by some of the exclusions, such as WebmasterWorld Extractor.

Also, this file appears to differ from the comprehensive robots.txt file on the tutorial page, robots4.txt [searchengineworld.com], which only allows known "nice guy" spiders. This features some different agents, like BlackWidow.
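
For anyone comparing the two styles, here is a minimal sketch of how they behave, using Python's standard urllib.robotparser. The rule sets are illustrative only and are not copied from either site's file: BlackWidow is the agent named above, while EmailCollector, Googlebot, SomeNewHarvester and the example.com URL are assumptions chosen for the demonstration. Keep in mind that robots.txt is advisory, so this only shows what a well-behaved agent would conclude.

```python
from urllib.robotparser import RobotFileParser

# Style 1 (exclusion list): name the unwanted agents, allow everyone else.
# Agent names other than BlackWidow are assumptions for illustration.
BLOCKLIST = """\
User-agent: BlackWidow
Disallow: /

User-agent: EmailCollector
Disallow: /

User-agent: *
Disallow:
"""

# Style 2 ("nice guy" list): name the wanted agents, block everyone else.
ALLOWLIST = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""


def can_fetch(robots_txt, agent, url="http://example.com/index.html"):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ("BlackWidow", "Googlebot", "SomeNewHarvester"):
        print(agent,
              "| exclusion list:", can_fetch(BLOCKLIST, agent),
              "| nice-guy list:", can_fetch(ALLOWLIST, agent))
```

The practical difference is in the unnamed agent: the exclusion list lets it through until someone adds it, while the "nice guy" list shuts it out by default at the cost of also shutting out any legitimate spider that was not listed.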

Any suggestions, please, for a list of "must exclude" agents for a robots.txt file?

Thanks


Thread source: http://www.webmasterworld.com/ppc/333.htm
Brought to you by WebmasterWorld: http://www.webmasterworld.com