How important is the Robots.txt file now?


gethan - 11:32 am on Jan 25, 2002 (gmt 0)


Just because you have a robots.txt file doesn't mean robots will obey it. It's like a "keep off the grass" sign - to some it's an invitation.
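
To make the "sign on the grass" point concrete, here is a minimal sketch in Python of the check a well-behaved robot runs on its own side before fetching a page. The robots.txt contents, the bot name "ExampleBot" and the example.com URLs are made up for illustration. Nothing on the server enforces the answer; an abusive robot just skips the check.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def polite_fetch_allowed(user_agent: str, url: str) -> bool:
    """Return True if robots.txt permits this user agent to fetch the URL.

    This runs inside the robot, not on the server: compliance is voluntary.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/index.html", "/private/report.html"):
        url = "http://example.com" + path
        print(path, polite_fetch_allowed("ExampleBot", url))
```

A polite crawler stops when this returns False. A bad one never asks, which is why the blocking has to happen on your end.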

Here are two sure ways to stop abusive robots:

1) Get the IP addresses of the bad bots and block them at the router or firewall (or with ipchains) - not an option on a hosting package.

2) mod_rewrite - block the user agents. See toolman's close-to-perfect bad-bot blocker here [webmasterworld.com]
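
The logic behind both options can be sketched in a few lines of Python. This is only an illustration of the decision being made, not a firewall rule set and not toolman's mod_rewrite rules. The network ranges are reserved documentation addresses and the agent names "ExampleHarvester" and "SampleSiteSucker" are hypothetical placeholders, so substitute what you actually see in your logs.

```python
import ipaddress
import re

# Assumed blocklists, for illustration only (documentation address ranges).
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("203.0.113.7/32"),
]

# Hypothetical bad-bot names; matched case-insensitively anywhere in the header.
BLOCKED_AGENT_PATTERN = re.compile(
    r"ExampleHarvester|SampleSiteSucker", re.IGNORECASE
)


def is_blocked(remote_ip: str, user_agent: str) -> bool:
    """Return True if the request should be refused.

    Option 1: the client address falls inside a blocked network.
    Option 2: the User-Agent header matches a known bad-bot pattern.
    """
    address = ipaddress.ip_address(remote_ip)
    if any(address in network for network in BLOCKED_NETWORKS):
        return True
    return bool(BLOCKED_AGENT_PATTERN.search(user_agent or ""))


if __name__ == "__main__":
    print(is_blocked("192.0.2.55", "Mozilla/5.0"))             # blocked by IP
    print(is_blocked("198.51.100.1", "ExampleHarvester/1.0"))  # blocked by agent
    print(is_blocked("198.51.100.1", "Mozilla/5.0"))           # allowed
```

Keep in mind that the user-agent check only catches robots that announce themselves honestly; one that fakes a browser string gets through, and then the IP block is the one that still works.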

Then there is the Scooter issue: it's a Jekyll-and-Hyde bot, so maybe not a candidate for a permanent ban - here's [webmasterworld.com] a recent example. (I almost fell off my chair hearing that Alta's crawler support team actually did something ;))


Thread source: http://www.webmasterworld.com/robots_txt/11.htm
Brought to you by WebmasterWorld: http://www.webmasterworld.com