RatePoint

Scraping About Us pages

         

caribguy

4:53 am on Apr 14, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I noticed that some WebmasterWorld members are happily using RatePoint's services, but some of us might not be too excited about having our About Us pages scraped:

example1.com 75.125.229.nnn - - [30/Mar/2010:00:00:00 -0000] "GET /about-us HTTP/1.1" 200 12121 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.0.3705)"

example2.com 75.125.226.nnn - - [05/Apr/2010:00:00:00 -0000] "GET /about-us HTTP/1.1" 200 12345 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.0.3705)"

network:IP-Network:75.125.229.136/29
network:IP-Network-Block:75.125.229.136 - 75.125.229.143

network:IP-Network:75.125.226.152/29
network:IP-Network-Block:75.125.226.152 - 75.125.226.159

network:Organization-Name:RatePoint, Inc.
network:Organization-City:Westwood
network:Organization-State:MA
network:Organization-Zip:02090

In the past month, I've seen that UA used only once by anything resembling a human visitor.
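
For anyone who wants to keep RatePoint out without blocking the rest of the host's address space, a minimal sketch using only the two /29 blocks from the whois output above might look like this. It assumes Apache 2.2-style access control (Order/Allow/Deny) in a .htaccess file or <Directory> section, which is an assumption and not something stated in the post.

```apache
# Assumption: Apache 2.2-style access control (mod_authz_host, formerly mod_access).
# Allow everyone, then deny only the two RatePoint /29 blocks from the whois output.
Order Allow,Deny
Allow from all
Deny from 75.125.229.136/29
Deny from 75.125.226.152/29
```

This covers only the addresses seen in these logs; RatePoint may well fetch pages from other ranges.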

keyplyr

6:01 pm on Apr 14, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Blocking The Planet (as many of us have learned to do) basically takes care of RatePoint.

Planet range:
75.125.0.0 - 75.125.255.255
75.125.0.0/16

caribguy

6:44 pm on Apr 14, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Don't forget 75.126:

# 75.125/6 Colo Space incl EVServers
RewriteCond %{REMOTE_ADDR} ^(75\.12[56]\.) [OR]
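
The RewriteCond above looks like one line lifted from a longer rule set: the trailing [OR] chains it to further conditions that are not shown. As a standalone sketch, assuming mod_rewrite is enabled and that a 403 response is the desired outcome (the post does not say what action the full rule takes), it could be written as:

```apache
RewriteEngine On
# Matches 75.125.x.x and 75.126.x.x; no [OR] because this is the only condition
RewriteCond %{REMOTE_ADDR} ^75\.12[56]\.
# Leave the URL untouched and answer 403 Forbidden
RewriteRule ^ - [F]
```

The capturing parentheses from the original pattern are dropped here because nothing refers back to the match.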

keyplyr

7:08 pm on Apr 14, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I don't use mod_rewrite much anymore for IP range blocking. I cover SoftLayer ranges with mod_access:

deny from 75.126.0.0/16
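
Putting the two ranges from this thread together, a mod_access version might look like the sketch below. The Order and Allow lines are assumptions added to make the fragment complete; they are not quoted from anyone's actual configuration.

```apache
# Assumption: Apache 2.2-style directives in .htaccess or a <Directory> block
Order Allow,Deny
Allow from all
# The Planet
Deny from 75.125.0.0/16
# SoftLayer
Deny from 75.126.0.0/16
```

The two /16 blocks cannot be merged into a single /15, since 75.125 and 75.126 fall on opposite sides of a /15 boundary, so each needs its own Deny line.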