Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
Punk Spider
caught this punk snooping from the clouds
incrediBILL




msg:4473574
 1:22 am on Jul 8, 2012 (gmt 0)

184.106.80.222

"Punk Spider/PunkSPIDER-v0.1"

robots.txt: YES

 

wilderness




msg:4473578
 2:27 am on Jul 8, 2012 (gmt 0)

"Punk Spider"

Aren't they all ;)

lucy24




msg:4473580
 3:01 am on Jul 8, 2012 (gmt 0)

v0.1

Oi! Save your alpha testing for your friends' sites! Come back when you're ready with v. ... well, OK, at least 0.8.

dstiles




msg:4473673
 6:29 pm on Jul 8, 2012 (gmt 0)

Server range (Rackspace). Already blocked.

incrediBILL




msg:4473678
 6:42 pm on Jul 8, 2012 (gmt 0)

"Server range (Rackspace). Already blocked."


That too.

I attempt to be polite: robots.txt is set aside as a special case in the firewall, so the dynamic robots.txt code will serve up permissions to anyone who asks. However, if you ask for robots.txt and are denied, and then request any other page from that user agent or IP address, you're also denied, since the script enforces the robots.txt rules.

The reason I track the IP is that some smart ass started asking for robots.txt with one user agent to test the waters, then switched user agents when asking for pages. So now I track the IPs that make robots.txt requests if they're denied :)

The data center blocking, which includes Rackspace, applies to all other files :)

It's complicated yet so simple.
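
For readers who want to see the flow described above in code, here is a minimal sketch in Python. It is not the poster's actual script: the `Gatekeeper` class, the `DENIED_AGENT_TOKENS` list, the deny-all robots.txt body, and the in-memory `denied_ips` set are all assumptions made for illustration. It shows robots.txt always being answered, denied requesters being remembered by IP, and data center ranges being blocked for every other file.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address, ip_network

# Hypothetical policy data, for illustration only.
DENIED_AGENT_TOKENS = ("punkspider",)                    # substrings of unwelcome user agents
DATA_CENTER_RANGES = (ip_network("198.101.128.0/17"),)   # hosting range named in the thread

ALLOW_ALL = "User-agent: *\nDisallow:\n"
DENY_ALL = "User-agent: *\nDisallow: /\n"


@dataclass
class Gatekeeper:
    # IPs that asked for robots.txt and were told to stay out (assumed in-memory store).
    denied_ips: set = field(default_factory=set)

    def _agent_denied(self, user_agent: str) -> bool:
        ua = user_agent.lower()
        return any(token in ua for token in DENIED_AGENT_TOKENS)

    def handle(self, ip: str, user_agent: str, path: str):
        """Return (status_code, body) for a single request."""
        if path == "/robots.txt":
            # Special case: robots.txt is always served, even to data center IPs.
            if self._agent_denied(user_agent):
                self.denied_ips.add(ip)  # remember the IP, not just the agent
                return 200, DENY_ALL
            return 200, ALLOW_ALL

        # Any other file: enforce the robots.txt answer by agent or remembered IP.
        if self._agent_denied(user_agent) or ip in self.denied_ips:
            return 403, "Forbidden"

        # Data center blocking applies to everything except robots.txt.
        addr = ip_address(ip)
        if any(addr in net for net in DATA_CENTER_RANGES):
            return 403, "Forbidden"

        return 200, "page content"
```

With this sketch, a bot that fetches robots.txt as "Punk Spider/PunkSPIDER-v0.1" and then asks for a page under a browser-like user agent from the same IP still gets a 403, which is the agent-switching trick the IP tracking is meant to catch.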

dstiles




msg:4473919
 9:21 pm on Jul 9, 2012 (gmt 0)

Your system is far more sophisticated than mine. :)

My IIS system (sans htaccess) can only intercept IPs and headers on ASP page access. Far too late for me to change it now. :(

incrediBILL




msg:4477747
 7:23 am on Jul 21, 2012 (gmt 0)

That PUNK came from a new IP today: 198.101.170.228

Wondering if they're in the cloud.

dstiles




msg:4477914
 10:14 pm on Jul 21, 2012 (gmt 0)

Don't care. Rackspace: blocked 198.101.128.0/17 :)

Does Rackspace operate a cloud? If so, where? What IPs?
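
As a quick check on the /17 range mentioned above (written in full as 198.101.128.0/17), the following sketch uses Python's standard `ipaddress` module to test which of the IPs reported in this thread fall inside it. The `is_blocked` helper is a hypothetical name used only for this example.

```python
from ipaddress import ip_address, ip_network

# Full form of the /17 range from the post above.
blocked = ip_network("198.101.128.0/17")


def is_blocked(ip: str) -> bool:
    """True if the address falls inside the blocked range."""
    return ip_address(ip) in blocked


print(blocked[0], "-", blocked[-1])    # 198.101.128.0 - 198.101.255.255
print(is_blocked("198.101.170.228"))   # True  (the newer IP)
print(is_blocked("184.106.80.222"))    # False (the first IP is outside this /17)
```

So this one range covers the newer address but not the original one, which would need its own entry.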


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved