Pixray? Ixnay.

Pfui

2:48 pm on Nov 8, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Cloaks its crawling; asks for robots.txt, then ignores it; badly coded UA (note the space before the last slash); hits dynamic files:

THIS WEEK:

node-176-9-31-202.cluster.eu.webcrawler.pixray.com
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1 /Nutch-1.2

robots.txt? Yes BUT immediately ignored.

LAST WEEK:

static.202.31.9.176.clients.your-server.de
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1 /Nutch-1.2

robots.txt? Yes BUT immediately ignored.

IP for BOTH of the above = 176.9.31.202 [projecthoneypot.org...]
(Surprise, surprise: 176.9.0.0/16 = HETZNER)

Lots of auto-block triggers in the preceding info for most of us, but some may be deceived by the robots.txt request. Also note that, going by the PHP data, the Nutch-tagged string isn't the only UA it sends:

Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0.1) Gecko/20100101

Pfui

1:58 am on Nov 20, 2011 (gmt 0)


Variation on its Nutch-named UA:

node-188-40-65-130.cluster.eu.webcrawler.pixray.com
nutch/1.2 (nutch)

robots.txt? Yes BUT immediately ignored. Again.

IP = 188.40.65.130 [projecthoneypot.org...]
188.40.0.0/16 = HETZNER. Again. [robtex.com...]
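
For anyone who wants to turn the triggers reported in this thread into an automated check, here is a minimal sketch in Python. It is only an illustration, not a tested blocklist: the function name block_reasons, the reason labels, and the choice to treat the whole of both /16 ranges and the generic your-server.de client hostnames as triggers are assumptions made for the example, and matching that broadly will also catch unrelated traffic from the same host.

```python
import ipaddress
import re

# CIDR ranges reported in this thread (assumption: block the whole /16).
REPORTED_RANGES = [
    ipaddress.ip_network("176.9.0.0/16"),
    ipaddress.ip_network("188.40.0.0/16"),
]

# Reverse-DNS suffixes seen in the reported hits (assumption: suffix match only).
REPORTED_HOST_SUFFIXES = (".webcrawler.pixray.com", ".clients.your-server.de")

# "Nutch" anywhere in the user agent, in any letter case.
NUTCH_RE = re.compile(r"nutch", re.IGNORECASE)

# Whitespace directly before a slash, as in "Firefox/4.0.1 /Nutch-1.2".
MALFORMED_TOKEN_RE = re.compile(r"\s/\S")


def block_reasons(ip, hostname, user_agent):
    """Return a list of hypothetical trigger labels that match this request."""
    reasons = []
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in REPORTED_RANGES):
        reasons.append("ip-range")
    if hostname.lower().endswith(REPORTED_HOST_SUFFIXES):
        reasons.append("hostname")
    if NUTCH_RE.search(user_agent):
        reasons.append("nutch-ua")
    if MALFORMED_TOKEN_RE.search(user_agent):
        reasons.append("malformed-ua")
    return reasons


if __name__ == "__main__":
    print(block_reasons(
        "176.9.31.202",
        "node-176-9-31-202.cluster.eu.webcrawler.pixray.com",
        "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0.1) "
        "Gecko/20100101 Firefox/4.0.1 /Nutch-1.2",
    ))
```

Note that the plain "Gecko/20100101" UA reported above carries no Nutch marker at all, so in this sketch only the IP range and hostname checks would catch it; the robots.txt request by itself proves nothing either way.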