caribguy

msg:3796223 | 8:06 pm on Nov 28, 2008 (gmt 0) |
Seen 3 times on 2 sites since Nov 22 - thanks!
|
phred

msg:3796395 | 3:03 am on Nov 29, 2008 (gmt 0) |
| You can easily thwart them by blocking Peer1: OrgName: Peer 1 Dedicated Hosting; NetRange: 69.0.128.0 - 69.0.255.255; CIDR: 69.0.128.0/17 |
| Bill, any reason to not block all of Peer 1? 64.29.16.0/20 64.45.0.0/18 64.65.0.0/18 64.77.0.0/17 64.224.0.0/14 64.239.0.0/17 69.0.128.0/17 66.33.0.0/17 66.36.96.0/20 66.111.64.0/19 66.132.128.0/17 66.148.0.0/18 66.223.0.0/17 66.234.0.0/20 207.21.192.0/18 207.159.128.0/19 207.198.64.0/18 209.15.0.0/16 209.25.128.0/17 209.35.0.0/16 209.95.96.0/19 209.196.128.0/18 209.203.224.0/19 209.213.96.0/19 216.25.0.0/17 216.65.0.0/17 216.87.0.0/19 216.87.208.0/20 216.122.0.0/16 216.150.0.0/19 216.152.128.0/20 216.157.0.0/18 216.157.64.0/19 216.157.96.0/20 216.247.0.0/16
|
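For anyone who wants to test an address against a list like the one above before committing it to a firewall or .htaccess, here is a minimal sketch using Python's standard ipaddress module. The ranges are copied as posted and are not verified here (allocations change over time); the names PEER1_RANGES, BLOCKED and is_blocked are made up for this illustration.

```python
import ipaddress

# Ranges copied verbatim from the post above; treat them as the poster's
# list, not as a verified or current record of any provider's allocations.
PEER1_RANGES = """
64.29.16.0/20 64.45.0.0/18 64.65.0.0/18 64.77.0.0/17 64.224.0.0/14
64.239.0.0/17 69.0.128.0/17 66.33.0.0/17 66.36.96.0/20 66.111.64.0/19
66.132.128.0/17 66.148.0.0/18 66.223.0.0/17 66.234.0.0/20 207.21.192.0/18
207.159.128.0/19 207.198.64.0/18 209.15.0.0/16 209.25.128.0/17 209.35.0.0/16
209.95.96.0/19 209.196.128.0/18 209.203.224.0/19 209.213.96.0/19
216.25.0.0/17 216.65.0.0/17 216.87.0.0/19 216.87.208.0/20 216.122.0.0/16
216.150.0.0/19 216.152.128.0/20 216.157.0.0/18 216.157.64.0/19
216.157.96.0/20 216.247.0.0/16
""".split()

# ip_network() raises ValueError if a range has host bits set,
# which catches typos in the list at load time.
BLOCKED = [ipaddress.ip_network(cidr) for cidr in PEER1_RANGES]


def is_blocked(ip: str) -> bool:
    """Return True if the address falls inside any listed range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BLOCKED)
```

A linear scan is fine for a few dozen ranges; a much longer blocklist would probably want a sorted or tree-based lookup instead.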
incrediBILL

msg:3796397 | 3:06 am on Nov 29, 2008 (gmt 0) |
| Any reason to not block all of Peer 1? |
| Considering I host on Peer1/ServerBeach, I have to tread lightly with that.
|
phred

msg:3796407 | 3:20 am on Nov 29, 2008 (gmt 0) |
Oops! Of course, I didn’t mean to put you in an awkward situation! When blocking a server IP range from a server hosting organization, I tend to block all similarly named ranges from that organization, on the basis that they are also probably used for servers. Cheers, Phred
|
incrediBILL

msg:3796426 | 4:34 am on Nov 29, 2008 (gmt 0) |
| I tend to block all similarly named ranges from that organization |
| Same here in most cases. No need to leave gaping holes in the fence.
|
incrediBILL

msg:3799737 | 5:07 am on Dec 4, 2008 (gmt 0) |
Follow-up: I got a message from someone at WordTracker saying they don't crawl. They claim it's a lateral search tool that looks for keywords on all of the pages returned from the original search. That sounds like quibbling over semantics about what constitutes a crawl. Allowing a SE to crawl a site doesn't authorize any other automated tool to access the pages that turn up in that crawl and a subsequent search, and then to fetch those pages yet again without permission. But that's a different argument for a different day. Anyway, they claim that if you write to them, they'll remove your site from their searches. IMO, honoring robots.txt would certainly be a lot simpler for all involved.
|
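To illustrate what honoring robots.txt would involve on the tool's side, here is a rough sketch using Python's standard urllib.robotparser. The user-agent name ExampleKeywordTool, the rules in ROBOTS_TXT and the helper may_fetch are all hypothetical; nothing here describes how WordTracker's software actually works.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a site owner might publish. The agent name
# "ExampleKeywordTool" is invented for this sketch.
ROBOTS_TXT = """\
User-agent: ExampleKeywordTool
Disallow: /

User-agent: *
Disallow: /private/
"""


def may_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to request url."""
    parser = RobotFileParser()
    # parse() takes the file's lines; a real client would download
    # /robots.txt from the target host first.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

A well-behaved tool would run a check like this before every automated request and skip any URL for which it returns False, so site owners would not need to write in to be excluded.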
blend27

msg:3800157 | 7:03 pm on Dec 4, 2008 (gmt 0) |
We had a similar situation on one of the sites a few months ago and wrote to WordTracker. They replied that their customer was doing research using their services and that they had no control over it. A few of the requests were made to a URI that had no "WWW." in it and contained "/..." as well. The only place that URI was ever referenced was in an MSN SERP: "host.tld/dir/page.h....". Attempts like that date back to April of 2007. Another IP they have used on several occasions is 64.65.13.36.
REQUEST HEADERS from 66.132.220.238:
Referer: http://www.domain.tld
Connection: close
Host: www.domain.tld
User-Agent: POE-Component-Client-HTTP/0.65 (perl; N; POE; en; rv:0.650000)
------------------------
request_method: GET
server_protocol: HTTP/1.0
Notice that there is no trailing forward slash on the referer.
|
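The traits visible in that request (the POE user agent, a referer with no path at all, and HTTP/1.0) could be checked in a log filter along these lines. This is only a sketch: the function name suspicious_signals and the signal labels are invented here, the headers are assumed to arrive as a plain dict keyed exactly as shown in the post, and none of these traits proves anything on its own, since legitimate clients can share them.

```python
from urllib.parse import urlsplit


def suspicious_signals(headers: dict, server_protocol: str) -> list:
    """List which traits of the reported request a given request shares."""
    signals = []

    # User agent reported in the post: POE-Component-Client-HTTP/0.65 (...)
    if headers.get("User-Agent", "").startswith("POE-Component-Client-HTTP"):
        signals.append("poe-user-agent")

    # A referer that names a host but has an empty path, such as
    # "http://www.domain.tld" with no trailing slash.
    referer = urlsplit(headers.get("Referer", ""))
    if referer.netloc and referer.path == "":
        signals.append("bare-referer")

    # The reported request used HTTP/1.0.
    if server_protocol == "HTTP/1.0":
        signals.append("http-1.0")

    return signals
```

Fed the headers quoted above, this reports all three signals; how many matches should trigger a block is a judgment call for each site.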