Had three visits in the last week. It grabs about a third of my pages, including pages linked via form action= and pages in directories disallowed in robots.txt.
And of course it ignores the robots.txt file.
Any ideas? Thanks
edit_g
10:14 am on Aug 27, 2003 (gmt 0)
The IP 66.227.29.211 resolves to a broadband ISP in Northern Colorado.
Staffa
3:26 pm on Aug 27, 2003 (gmt 0)
Thanks edit_g, I know; I did a whois already.
I just don't understand why an ISP would crawl part of my site, and in an uncivilized fashion at that.
Any other ideas? Much appreciated.
edit_g
3:40 pm on Aug 27, 2003 (gmt 0)
Could be some sort of testing? Hard to know really.
Send them an email and ask them, and remind them what robots.txt is for while you're at it.
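If it helps to have evidence for that email, here is a minimal Python sketch that lists which requests from the IP landed on paths robots.txt disallows. The robots.txt rules and the log lines below are made-up placeholders, and the log is assumed to be in Common Log Format; only the IP comes from this thread. Swap in your real file and logs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules; substitute the site's real file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /private/
"""

# Hypothetical access-log lines, assumed to be in Common Log Format.
LOG_LINES = [
    '66.227.29.211 - - [27/Aug/2003:10:01:02 +0000] "GET /index.html HTTP/1.0" 200 1043',
    '66.227.29.211 - - [27/Aug/2003:10:01:05 +0000] "GET /private/notes.html HTTP/1.0" 200 512',
    '10.0.0.5 - - [27/Aug/2003:10:02:00 +0000] "GET /private/notes.html HTTP/1.0" 200 512',
    '66.227.29.211 - - [27/Aug/2003:10:01:09 +0000] "GET /cgi-bin/mail.cgi HTTP/1.0" 200 77',
]


def disallowed_hits(log_lines, robots_txt, ip, user_agent="*"):
    """Return the paths requested by `ip` that robots.txt disallows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse rules without fetching anything

    hits = []
    for line in log_lines:
        fields = line.split()
        # In Common Log Format, field 0 is the client IP and field 6 the path.
        if len(fields) < 7 or fields[0] != ip:
            continue
        path = fields[6]
        if not parser.can_fetch(user_agent, path):
            hits.append(path)
    return hits


if __name__ == "__main__":
    for path in disallowed_hits(LOG_LINES, ROBOTS_TXT, "66.227.29.211"):
        print(path)
```

With the placeholder data above this prints /private/notes.html and /cgi-bin/mail.cgi. The user agent defaults to "*" because a bot that ignores robots.txt may not send a meaningful name; pass the logged agent string instead if your rules are agent-specific.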
Staffa
5:47 pm on Aug 27, 2003 (gmt 0)
Thanks again, good idea. I'll do just that.
engine
5:48 pm on Aug 27, 2003 (gmt 0)
Might not be the ISP - likely to be one of their customers testing something or perhaps harvesting.