WebTarantula Tools

They have their own nasty browser, toolbar, and bots

         

Angonasec

4:52 am on Oct 10, 2014 (gmt 0)



New to me.

Q/
Web tarantula is a service, that helps web masters researching and analyzing their competitor websites and gives them directions how...
/Q

WZ Communications Inc.
208.88.224.0 - 208.88.227.255 208.88.224.0/22

208.88.225.147 - - [09/Oct/2014] "GET / HTTP/1.1" 200 1107 "-" "WebTarantula.com Crawler"

Blocked on UA and on this IP prefix: 208.88.225.
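For anyone who wants to test a rule like this against their own logs before committing it to the server config, here is a minimal Python sketch. It is an illustration only, not the rule actually used above. Assumptions: the trailing-dot prefix is read as 208.88.225.0/24, the UA is matched on the case-insensitive substring "webtarantula", and the log line has the abbreviated shape quoted in this post. The names `is_blocked`, `BLOCKED_NET` and `LOG_RE` are made up for the example.

```python
import ipaddress
import re

# Assumption: the prefix "208.88.225." is interpreted as the /24 below.
BLOCKED_NET = ipaddress.ip_network("208.88.225.0/24")
# Assumption: any UA containing this substring (case-insensitive) is blocked.
BLOCKED_UA_SUBSTRING = "webtarantula"

# Matches the abbreviated log shape quoted in the post:
# ip ident user [date] "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"$'
)


def is_blocked(log_line: str) -> bool:
    """Return True if the line's IP or user agent matches the block rules."""
    m = LOG_RE.match(log_line.strip())
    if m is None:
        return False  # unparseable lines are simply not matched
    try:
        ip_hit = ipaddress.ip_address(m.group("ip")) in BLOCKED_NET
    except ValueError:
        ip_hit = False  # e.g. a hostname was logged instead of an address
    ua_hit = BLOCKED_UA_SUBSTRING in m.group("ua").lower()
    return ip_hit or ua_hit


if __name__ == "__main__":
    line = ('208.88.225.147 - - [09/Oct/2014] "GET / HTTP/1.1" 200 1107 '
            '"-" "WebTarantula.com Crawler"')
    print(is_blocked(line))
```

Matching on either condition means the bot is still caught if it moves to another address, and the address is still caught if the UA string changes.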

wilderness

9:04 am on Oct 11, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



WZCOMM-US 199.101.132.0 - 199.101.135.255 199.101.132.0/22
WZCOMM-US 199.80.52.0 - 199.80.55.255 199.80.52.0/22
WZCOMM-US 208.88.224.0 - 208.88.227.255 208.88.224.0/22
WZCOMM-US 208.94.232.0 - 208.94.235.255 208.94.232.0/22
WZCOMM-US 74.117.176.0 - 74.117.183.255 74.117.176.0/21
WZCOMM-IPV6-US 2607:FBE0:: - 2607:FBE0:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF
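The start/end pairs above can be cross-checked against their CIDR notation with Python's standard `ipaddress` module. This is only a sketch; the helper name `to_cidrs` is made up for the example. It also yields a CIDR for the IPv6 allocation, which the listing gives only as a start/end pair.

```python
import ipaddress

# Start/end pairs copied from the WZCOMM listing above.
RANGES = [
    ("199.101.132.0", "199.101.135.255"),
    ("199.80.52.0", "199.80.55.255"),
    ("208.88.224.0", "208.88.227.255"),
    ("208.94.232.0", "208.94.235.255"),
    ("74.117.176.0", "74.117.183.255"),
    ("2607:FBE0::", "2607:FBE0:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF"),
]


def to_cidrs(first: str, last: str) -> list[str]:
    """Collapse an inclusive address range into the CIDR blocks covering it."""
    start = ipaddress.ip_address(first)
    end = ipaddress.ip_address(last)
    return [str(net) for net in ipaddress.summarize_address_range(start, end)]


if __name__ == "__main__":
    for first, last in RANGES:
        print(first, "-", last, "=>", ", ".join(to_cidrs(first, last)))
```

Each IPv4 pair collapses to the single /22 or /21 shown in the listing, and the IPv6 pair collapses to 2607:fbe0::/32.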

Angonasec

12:55 am on Oct 12, 2014 (gmt 0)



Thanks Don, I didn't block all Wzcomm because, hopefully, not all their customers will be using nefarious tools.

Do you block all Wzcomm ranges?

wilderness

1:43 am on Oct 12, 2014 (gmt 0)




The only range I've currently denied is 74.117.176.0/21, from May 2012, along with a Webzilla request with UA "PycURL/7.19.7".

FWIW, I don't block all the ranges that I list on a "Related Network"; rather, I wait until each IP makes an attempt. After 2 or 3 ranges, I then consider adding the complete set.
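The wait-and-escalate approach described here could be sketched roughly as follows in Python. This is only one reading of it, not anyone's actual setup. Assumptions: the threshold is set to 2 offending ranges (the post says "2 or 3"), and the class name `RelatedNetworkTracker` and its methods are invented for the example.

```python
import ipaddress
from collections import defaultdict


class RelatedNetworkTracker:
    """Block individual ranges as they offend; escalate to the whole set
    once `threshold` distinct ranges of the same network have offended."""

    def __init__(self, networks: dict[str, list[str]], threshold: int = 2):
        # networks maps a label (e.g. "WZCOMM-US") to its CIDR strings.
        self.networks = {
            name: [ipaddress.ip_network(c) for c in cidrs]
            for name, cidrs in networks.items()
        }
        self.threshold = threshold  # assumption: 2, the low end of "2 or 3"
        self.offending = defaultdict(set)

    def record_hit(self, ip: str):
        """Note an unwanted request; return the network label, if known."""
        addr = ipaddress.ip_address(ip)
        for name, nets in self.networks.items():
            for net in nets:
                if addr.version == net.version and addr in net:
                    self.offending[name].add(net)
                    return name
        return None

    def blocklist(self) -> list[str]:
        """CIDRs to deny right now, in the order they were listed."""
        blocked = []
        for name, nets in self.networks.items():
            hit = self.offending.get(name, set())
            if len(hit) >= self.threshold:
                blocked.extend(nets)  # escalate: deny the complete set
            else:
                blocked.extend(n for n in nets if n in hit)
        return [str(n) for n in blocked]


if __name__ == "__main__":
    tracker = RelatedNetworkTracker({"WZCOMM-US": [
        "199.101.132.0/22", "199.80.52.0/22", "208.88.224.0/22",
        "208.94.232.0/22", "74.117.176.0/21",
    ]})
    tracker.record_hit("208.88.225.147")
    print(tracker.blocklist())
```

The point of the threshold is to avoid punishing every customer of a host for one bad tenant while still cutting losses once a pattern shows up.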

"helps web masters researching and analyzing their competitor websites"


There are NOT any other widget sites that approach or are even similar to mine, thus my situation is quite unique.

lucy24

2:16 am on Oct 12, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



A service is only as good as its robot. I see this over and over again. People pay money for services that will look at other sites and give you such-and-such information. But the information is only meaningful if the robot is allowed access to the widest possible range of sites. If it crawls from a bad IP range, or it habitually ignores robots.txt, then that crawler will be blocked by the very webmasters you're paying to get information about.

I just locked the door on never mind which anti-plagiarism organization. I hate to do it, because it's a worthy cause on principle-- but they never so much as asked for robots.txt, the UA gives no contact information, and the site itself makes it more-or-less impossible to make contact with any human other than a sales rep. No "about our crawler" page, of course. Grr.
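Whether a given crawler ever asked for robots.txt can be checked from the access logs. A minimal Python sketch, assuming the same abbreviated log shape as the line quoted at the top of the thread; the function name `asked_for_robots` and the sample UA strings are hypothetical.

```python
import re

# ip ident user [date] "METHOD path protocol" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"$'
)


def asked_for_robots(log_lines, ua_substring: str) -> bool:
    """True if any request from a matching UA was for /robots.txt."""
    needle = ua_substring.lower()
    for line in log_lines:
        m = LOG_RE.match(line.strip())
        if m is None or needle not in m.group("ua").lower():
            continue
        # Ignore any query string when comparing the path.
        if m.group("path").split("?", 1)[0] == "/robots.txt":
            return True
    return False


if __name__ == "__main__":
    # Hypothetical sample lines, for illustration only.
    sample = [
        '192.0.2.1 - - [11/Oct/2014] "GET /robots.txt HTTP/1.1" 200 120 "-" "PoliteBot/1.0"',
        '192.0.2.2 - - [11/Oct/2014] "GET / HTTP/1.1" 200 1107 "-" "RudeBot/2.0"',
    ]
    print(asked_for_robots(sample, "PoliteBot"))
    print(asked_for_robots(sample, "RudeBot"))
```

A robot that never shows up in the first category across a reasonable stretch of logs has no way of knowing what the site permits.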

Angonasec

2:18 am on Oct 12, 2014 (gmt 0)



Q/
FWIW, I don't block all the ranges that I list on a "Related Network"; rather, I wait until each IP makes an attempt. After 2 or 3 ranges, I then consider adding the complete set.
/Q

Ah that's reassuring, I do the same :)

Angonasec

2:23 am on Oct 12, 2014 (gmt 0)



Good points Lucy.

Anti-plagiarism is a noble cause, but when they use a bot based in China, bells ring, and the portcullis descends smartly.

I haven't visited the WebTarantula website; far too greasy a name to risk it. Their quote I lifted from the Bing snippet: sufficient unto the day.

dstiles

6:42 pm on Oct 12, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Lucy - sadly there are very few webmasters that do more than provide a rudimentary robots.txt, so discovery bots get free range of enough sites to make reasonable assessments.

As to plagiarism bots - years ago they were one of the first classes I blocked.

lucy24

7:11 pm on Oct 12, 2014 (gmt 0)




"a rudimentary robots.txt"

Well, when they don't even ask for robots.txt, they don't know if it's rudimentary or not, do they :)