Every day I find 2-3 bots that I add to my blocklist after seeing them crawl around, looking them up on Google, and finding references to spam/harvesting.
WHY do these bots not fake MSIE's User-agent? That would make me unable to block them.
vincevincevince
11:07 am on Sep 6, 2007 (gmt 0)
By the time you found them, they'd got what they wanted
mikomido
11:21 am on Sep 6, 2007 (gmt 0)
Well, they can't ever come back (and expect me to return content). So why not fake MSIE's UA? It would fool me for sure.
vincevincevince
11:23 am on Sep 6, 2007 (gmt 0)
How big is the web? Even if a few thousand sites block them, they still have plenty more content to scrape.
mikomido
11:43 am on Sep 6, 2007 (gmt 0)
Again... why expose themselves at all? It just makes no sense.
vincevincevince
1:09 pm on Sep 6, 2007 (gmt 0)
My thoughts: 1) They are too lazy or couldn't be bothered to read the manual. 2) Letting you stop them easily makes you less likely to complain about them to their ISP, etc.
jatar_k
1:19 pm on Sep 6, 2007 (gmt 0)
It also makes it easy for them to see who actively blocks. They can come back later with an MSIE agent if needed.
Plus, 'rogue' is not the same thing as 'bad'.
A rogue bot could be a bot from a good source that is having a problem. I think you are talking mainly about 'bad' bots.
mikomido
1:42 pm on Sep 6, 2007 (gmt 0)
Based on the User-agent I can tell whether it's a known bot with "problems". I don't want any bots other than the large search engines.
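A minimal sketch of that kind of policy in Python, for illustration only. The function name `is_allowed`, the marker lists, and the `examplescraper` blocklist entry are all hypothetical; the search engine tokens (`googlebot`, `slurp`, `msnbot`) are assumed to be the substrings those crawlers send, and a real setup would more likely live in the web server config.

```python
# Hypothetical User-agent filter: allow ordinary browsers and large
# search engines, refuse anything else that identifies itself as a bot.

# Assumed substrings sent by the large search engine crawlers.
ALLOWED_BOTS = ("googlebot", "slurp", "msnbot")

# Generic words that self-identifying crawlers tend to include (assumption).
BOT_MARKERS = ("bot", "crawler", "spider", "harvest")

# Hand-maintained blocklist; "examplescraper" is a made-up entry.
BLOCKLIST = ("examplescraper",)


def is_allowed(user_agent):
    """Return True if a request with this User-agent should get content."""
    ua = (user_agent or "").lower()
    # Explicitly blocked agents lose first, whatever else they claim.
    if any(name in ua for name in BLOCKLIST):
        return False
    # Large search engines are the only bots let through.
    if any(name in ua for name in ALLOWED_BOTS):
        return True
    # Any other agent that announces itself as a bot is refused.
    if any(marker in ua for marker in BOT_MARKERS):
        return False
    # Everything else is treated as an ordinary browser.
    return True


if __name__ == "__main__":
    for ua in (
        "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
        "ExampleScraper/1.0",
        "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    ):
        print(is_allowed(ua), ua)
```

Note that the last line of the demo is exactly the loophole I'm asking about: anything that sends an MSIE string falls through to the final `return True`, so a check like this only catches bots that choose to identify themselves.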