Prototype

Cloaked somethingorother from .us.ibm.com

         

Pfui

10:36 pm on Mar 13, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



bi01p1.nc.us.ibm.com
Googlebot/Nutch-1.0 (Prototype; http://en.wikipedia.org/wiki/Web_crawler; donotreply at prototype dot com)

robots.txt? Yes, but promptly ignored.

The UA string is a faked-up mess. Also, FWIW, "prototype dot com" forwards to a plastics manufacturing company.

IBM Host IP = 129.33.49.251
Range: 129.33.0.0 - 129.33.255.255
CIDR: 129.33.0.0/16
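
For anyone who wants to test log entries against that range, here is a minimal sketch using Python's standard `ipaddress` module. The name `in_range` is made up for illustration, and the /16 is simply the range quoted above, not an independently verified allocation.

```python
import ipaddress

# Range as quoted in the post above (the poster's lookup result, not re-verified).
IBM_RANGE = ipaddress.ip_network("129.33.0.0/16")


def in_range(ip: str, network: ipaddress.IPv4Network = IBM_RANGE) -> bool:
    """Return True if the dotted-quad address falls inside the network."""
    return ipaddress.ip_address(ip) in network


if __name__ == "__main__":
    # The host IP reported in the post.
    print(in_range("129.33.49.251"))
    # First and last addresses of the /16, to compare with the Range line.
    print(IBM_RANGE[0], IBM_RANGE[-1])
```

The same check works for any other CIDR block by passing a different `network` argument.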

Search results suggest "bi01p1.nc.us.ibm.com" (ditto "129.33.49.251") is a shared access point for multiple employees, showing up in everything from posts to chats. It has even been banned in at least one place for spamming.

P.S. to IBM employee(s) running bots and/or spamming...

You're fired.

Pfui

6:41 am on Apr 6, 2010 (gmt 0)


FWIW... Exact same Host/IP, UA, behavior on 03-19 and 04-05. Probably more often, but those are from quick log-tail glances.

tangor

6:58 am on Apr 6, 2010 (gmt 0)


Thanks... haven't seen this one yet, but I nuke anything "nutch" or "nu_tch" as it is. Dang few of those have ever respected robots.txt!
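
A rough sketch of that kind of user-agent filter, in Python. The pattern and the name `is_nutch` are assumptions for illustration only; the actual rules on any given server (typically .htaccess or similar) are not shown in this thread.

```python
import re

# Hypothetical pattern: matches "nutch" and the "nu_tch" variant, case-insensitively.
NUTCH_RE = re.compile(r"nu_?tch", re.IGNORECASE)


def is_nutch(user_agent: str) -> bool:
    """Return True if the user-agent string contains a Nutch-style token."""
    return NUTCH_RE.search(user_agent) is not None


if __name__ == "__main__":
    # The UA string reported earlier in this thread.
    ua = ("Googlebot/Nutch-1.0 (Prototype; "
          "http://en.wikipedia.org/wiki/Web_crawler; "
          "donotreply at prototype dot com)")
    print(is_nutch(ua))
```

A request whose UA matches would then be refused by whatever mechanism the server uses; that part is left out here.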

Pfui

11:46 pm on Apr 25, 2010 (gmt 0)


32.97.110.6*
Googlebot/Nutch-1.0 (Prototype; http://en.wikipedia.org/wiki/Web_crawler; donotreply at prototype dot com)

robots.txt? Yes

Connected to .ibm.com crawl in OP? Wouldn't surprise me. Wikipedia-related? Hmm. May simply be spoofed. Or not.

blend27

3:09 pm on Apr 26, 2010 (gmt 0)


Googlebot/Nutch-1.0 (Prototype; http://en.wikipedia.org/wiki/Web_crawler; donotreply at prototype dot com)

That UA would fail at least 7 rules my sites are running.