| 4:53 pm on Feb 28, 2001 (gmt 0)|
It also does NOT honor the robots meta tag (it deep-crawled a page I had set as "noindex,nofollow").
| 4:52 pm on Mar 17, 2001 (gmt 0)|
This one hit my site twice yesterday and, looking back in my logs, it had also been around about the time you posted your message.
I do have pages that require authorization on this site. The funny thing is, it never got robots.txt, but it stays away from the directory that requires authorization. It "acts" like it has seen the robots.txt file, because it gets everything else on the site.
he.net is Hurricane Electric in Fremont, CA.
Anybody else seen this one?
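For anyone who wants to confirm the "never got robots.txt" observation in their own logs, here is a minimal sketch. It assumes the access log is in Common Log Format with hostname lookups turned on; the function name `robots_fetches`, the hostnames, and the sample lines are all hypothetical and not taken from any real log.

```python
import re

# Assumed Common Log Format:
# host ident user [time] "METHOD path PROTO" status size
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(\S+) (\S+)[^"]*"')

def robots_fetches(log_lines, host_suffix):
    """Map each client host ending in host_suffix to whether it requested /robots.txt."""
    seen = {}
    for line in log_lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        host, _method, path = match.groups()
        if not host.endswith(host_suffix):
            continue
        # Ignore any query string when comparing the path.
        fetched = path.split("?", 1)[0] == "/robots.txt"
        seen[host] = seen.get(host, False) or fetched
    return seen

# Hypothetical sample lines for illustration only.
sample = [
    'crawler.example.net - - [17/Mar/2001:16:40:01 +0000] "GET /index.html HTTP/1.0" 200 5120',
    'crawler.example.net - - [17/Mar/2001:16:40:03 +0000] "GET /about.html HTTP/1.0" 200 2048',
    'other.example.org - - [17/Mar/2001:16:41:10 +0000] "GET /robots.txt HTTP/1.0" 200 64',
]
print(robots_fetches(sample, ".example.net"))
```

A host that maps to False crawled pages without ever asking for robots.txt during the period the log covers. It could still have fetched the file earlier or from a different host, so treat a False as a hint rather than proof.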
| 4:59 pm on Mar 17, 2001 (gmt 0)|
I was getting ready to nuke 'em in .htaccess. I thought they were messin' with me. Sorta glad to see it's not just me.
I wonder if it's one of the mods here at WmW checking up to see if we're all doing our part in applying the techniques learned here. ;)
| 2:16 am on Apr 4, 2001 (gmt 0)|
Well, h.e.'s back again. Nobody knows?
I just continue to let it rape my site without knowing whether to .htdisallow it or not.
Comes around about once every 2 months, just like Google. Grabs everything.
| 4:40 pm on Apr 5, 2001 (gmt 0)|
I had a visit from cypress.he.net with the user agent Pizilla++ ver 2.45
| 9:06 pm on Apr 28, 2001 (gmt 0)|
Can't one block this IP?
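Blocking by IP comes down to checking each client address against a list of denied ranges, which is what `Deny from` lines in .htaccess do on Apache. Below is a rough sketch of that check in Python. The crawler's actual addresses are not given in this thread, so the range used here (192.0.2.0/24, reserved for documentation) is a stand-in, and `BLOCKED_NETWORKS` and `is_blocked` are hypothetical names.

```python
import ipaddress

# Hypothetical blocklist: 192.0.2.0/24 is a reserved documentation range,
# standing in for whatever range the crawler actually uses.
BLOCKED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]

def is_blocked(client_ip, networks=BLOCKED_NETWORKS):
    """Return True if client_ip falls inside any blocked network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in networks)

print(is_blocked("192.0.2.45"))    # inside the blocked range
print(is_blocked("198.51.100.7"))  # outside the blocked range
```

Keep in mind that a crawler running from a hosting provider can move to another address in the provider's space, so blocking a single IP may only work for a while.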