208.115.111.*** - - [28/Jul/2008:14:55:06 -0400] "GET /robots.txt HTTP/1.0" 301 243 "-" "DotBot/1.0.1 (http://www.dotnetdotcom.org/, crawler@dotnetdotcom.org)"
Now banned.
They have an info page that shows how to deny their UA via the robots.txt standard, but since they don't obey it, what's the point? They also don't explain what the data is being used for.
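For anyone who wants to check what such a robots.txt rule would mean for a compliant crawler, here is a minimal sketch using Python's standard `urllib.robotparser`. The `User-agent: DotBot` token is an assumption taken from the UA string in the log line above, not copied from their info page, and `example.com` is a placeholder. It only shows what a bot that honors the rule would do; it does nothing against one that ignores robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Assumed rule: the "DotBot" token is a guess based on the logged UA string.
ROBOTS_TXT = """\
User-agent: DotBot
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if a compliant crawler with this UA may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://example.com/page.html"  # placeholder URL
    print("DotBot allowed:", is_allowed(ROBOTS_TXT, "DotBot/1.0.1", page))
    print("Other bot allowed:", is_allowed(ROBOTS_TXT, "SomeOtherBot/2.0", page))
```

If the bot keeps fetching pages after a rule like this is in place, a server-side ban is the only thing left.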
They took 1,000 pages a day for 8 days in August on one of my sites. It would be nice if they were a little more open about what they are planning.
Ian,
During my time here at Webmaster World, and from running my own websites, I've learned that reputable bots and search engines do NOT use colos for their crawling.
Harvesting, on the other hand, is an entirely different issue (pun).
Don