
Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
new bot?
analysis.he.net/216.218.130.79
skirril

10+ Year Member



 
Msg#: 419 posted 4:50 pm on Feb 28, 2001 (gmt 0)

Comes from 216.218.130.79 (shows up as: analysis.he.net)
ua: Mozilla/4.0 (compatible;MSIE 5.5;Windows NT 5.0)

Does not seem to honor robots.txt; deep crawls;

up to 5 requests per second!!

ideas?

Skirril
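A rough way to confirm a figure like "5 requests per second" is to count one client's hits per timestamp in the access log. The sketch below assumes Common Log Format, which the thread does not confirm. The function name and the sample lines are made up for illustration and are not real log data.

```python
import re
from collections import Counter

# Captures the client address and the timestamp of a Common Log Format line, e.g.
# 216.218.130.79 - - [28/Feb/2001:16:50:01 +0000] "GET / HTTP/1.0" 200 1234
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\]')


def peak_requests_per_second(lines, ip):
    """Return (timestamp, count) for the busiest second of `ip`, or None."""
    per_second = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group(1) == ip:
            # Log timestamps have one-second resolution, so the raw
            # timestamp string serves as the bucket key.
            per_second[match.group(2)] += 1
    if not per_second:
        return None
    return per_second.most_common(1)[0]


# Illustrative lines only (hypothetical paths and sizes).
sample = [
    '216.218.130.79 - - [28/Feb/2001:16:50:01 +0000] "GET /a.html HTTP/1.0" 200 512',
    '216.218.130.79 - - [28/Feb/2001:16:50:01 +0000] "GET /b.html HTTP/1.0" 200 512',
    '10.0.0.7 - - [28/Feb/2001:16:50:01 +0000] "GET /a.html HTTP/1.0" 200 512',
    '216.218.130.79 - - [28/Feb/2001:16:50:01 +0000] "GET /c.html HTTP/1.0" 200 512',
    '216.218.130.79 - - [28/Feb/2001:16:50:02 +0000] "GET /d.html HTTP/1.0" 200 512',
]

if __name__ == "__main__":
    print(peak_requests_per_second(sample, "216.218.130.79"))
```

To run it against a real log, pass an open file object as `lines`.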

 

skirril

10+ Year Member



 
Msg#: 419 posted 4:53 pm on Feb 28, 2001 (gmt 0)

Forgot:

also does NOT honor robots meta tag (deep-crawled a page I had set as "noindex,nofollow")

bobriggs

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 419 posted 4:52 pm on Mar 17, 2001 (gmt 0)

This one hit my site twice yesterday, and looking back through my logs, it had also been around at about the time you posted your message.

I do have pages that require authorization on this site. The funny thing is, it never got robots.txt, but it stays away from the directory that requires authorization. It "acts" like it has seen the robots.txt file, because it gets everything else on the site.

he.net is Hurricane Electric in Fremont, CA.

Anybody else seen this one?
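The two observations above (never fetching robots.txt, yet avoiding a directory) can be checked mechanically from the logs. The following is a minimal sketch using Python's standard `urllib.robotparser`. It assumes Common Log Format; the function name, the robots.txt rules, and the sample lines are hypothetical and not taken from anyone's site.

```python
import re
from urllib.robotparser import RobotFileParser

# Captures the client address and the requested path of a Common Log Format line.
REQUEST = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(?:GET|HEAD|POST) (\S+)')


def audit_client(lines, ip, robots_txt, user_agent="*"):
    """Return (fetched_robots, violations) for one client address.

    fetched_robots: True if the client ever requested /robots.txt.
    violations: requested paths that robots_txt disallows for user_agent.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    fetched_robots = False
    violations = []
    for line in lines:
        match = REQUEST.match(line)
        if not match or match.group(1) != ip:
            continue
        path = match.group(2)
        if path == "/robots.txt":
            fetched_robots = True
        elif not parser.can_fetch(user_agent, path):
            violations.append(path)
    return fetched_robots, violations


# Hypothetical rules and log lines, for illustration only.
ROBOTS = "User-agent: *\nDisallow: /private/\n"
sample = [
    '216.218.130.79 - - [16/Mar/2001:10:00:00 +0000] "GET /index.html HTTP/1.0" 200 900',
    '216.218.130.79 - - [16/Mar/2001:10:00:01 +0000] "GET /private/a.html HTTP/1.0" 401 0',
    '10.0.0.7 - - [16/Mar/2001:10:00:02 +0000] "GET /robots.txt HTTP/1.0" 200 40',
]

if __name__ == "__main__":
    print(audit_client(sample, "216.218.130.79", ROBOTS))
```

An empty violations list alongside `fetched_robots == False` would match the "acts like it has seen robots.txt" pattern, though it could equally mean the crawler simply found no links into the protected directory.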

BoneHeadicus

10+ Year Member



 
Msg#: 419 posted 4:59 pm on Mar 17, 2001 (gmt 0)

I was getting ready to nuke 'em in .htaccess. I thought they were messin' with me. Sorta glad to see it's not just me.

I wonder if it's one of the mods here at WmW checking up to see if we're all doing our part in applying the techniques learned here. ;)

bobriggs

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 419 posted 2:16 am on Apr 4, 2001 (gmt 0)

Well, h.e.'s back again. Nobody knows?

I just continue to let it rape my site without knowing whether to .htdisallow it or not.

Comes around about once every 2 months, just like Google. Grabs everything.

volatilegx

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 419 posted 4:40 pm on Apr 5, 2001 (gmt 0)

I had a visit from cypress.he.net with the user agent Pizilla++ ver 2.45

Dan

digitalgirl

10+ Year Member



 
Msg#: 419 posted 9:06 pm on Apr 28, 2001 (gmt 0)

Can't one block this IP?
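Blocking by address is possible in principle. The posts above mention doing it in .htaccess; the sketch below shows the same idea in Python instead, as a small WSGI wrapper built on the standard `ipaddress` module. The names `BLOCKED`, `is_blocked`, and `BlockListMiddleware` are made up for this example, and a single-address entry would not cover visits from other he.net hosts such as the cypress.he.net one reported earlier.

```python
import ipaddress

# Addresses or networks to refuse. The single entry is the IP reported in
# this thread; a /32 network matches exactly one address.
BLOCKED = (ipaddress.ip_network("216.218.130.79/32"),)


def is_blocked(remote_addr, blocked=BLOCKED):
    """Return True if remote_addr falls inside any blocked network."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Unparseable address: do not block on that basis alone.
        return False
    return any(addr in net for net in blocked)


class BlockListMiddleware:
    """WSGI wrapper that answers 403 for blocked clients."""

    def __init__(self, app, blocked=BLOCKED):
        self.app = app
        self.blocked = blocked

    def __call__(self, environ, start_response):
        if is_blocked(environ.get("REMOTE_ADDR", ""), self.blocked):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return self.app(environ, start_response)


if __name__ == "__main__":
    print(is_blocked("216.218.130.79"))
    print(is_blocked("10.0.0.7"))
```

Widening the block to a whole range only requires a broader network in `BLOCKED`, at the cost of also shutting out any ordinary visitors in that range.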


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved