Help needed identifying some kind of automated crawl from 66.122.16.

         

bobmark

8:45 pm on Feb 15, 2003 (gmt 0)

10+ Year Member



I imagine this has come up here before, but I searched diligently for about half an hour with no luck.
Does anyone know what this is, or of a thread detailing it?
66.122.16.207 - - ... "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)"

The thing hits a ton of pages, follows every link, and looks like an automated process searching for something.
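One way to confirm this kind of behaviour from the raw access log is to count how many requests each address makes inside a short time window. The following is only a rough sketch: it assumes the log is in Common Log Format with a bracketed timestamp, and the function name `find_fast_clients` and the thresholds (more than 30 hits within 60 seconds) are made up for illustration, not taken from this thread.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Captures the client address and the bracketed timestamp of a
# Common Log Format line (assumed format, not confirmed by the thread).
LOG_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\]')


def find_fast_clients(lines, max_hits=30, window=timedelta(seconds=60)):
    """Return the set of IPs with more than max_hits requests inside any window."""
    hits = defaultdict(list)
    for line in lines:
        match = LOG_RE.match(line)
        if not match:
            continue  # skip lines that do not look like log entries
        ip, stamp = match.groups()
        hits[ip].append(datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z"))

    flagged = set()
    for ip, times in hits.items():
        times.sort()
        start = 0
        # Slide a window over the sorted timestamps of this address.
        for end, current in enumerate(times):
            while current - times[start] > window:
                start += 1
            if end - start + 1 > max_hits:
                flagged.add(ip)
                break
    return flagged
```

Calling `find_fast_clients(open("access_log"))` (the file name is hypothetical) would list the addresses worth a closer look. A human visitor rarely crosses such a threshold, but well-behaved search engine spiders sometimes can, so the result is a list of candidates rather than a ban list.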

thanks

Dreamquick

9:12 pm on Feb 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



By the looks of it you have a home-made bot operating from an ADSL connection...

adsl-66-122-16-207.dsl.scrm01.pacbell.net is the name attached to that IP, and obviously the IP in question belongs to Pac(ific?) Bell. There is no website currently on the other end of that IP, so no clues there.
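For anyone wanting to repeat that lookup, a reverse DNS (PTR) query is all it takes. Here is a minimal sketch using Python's standard `socket.gethostbyaddr`; the wrapper name `reverse_lookup` and its `resolver` parameter are illustrative assumptions, added only so the function can be exercised without a network connection.

```python
import socket


def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return the PTR hostname for ip, or None when no name is registered."""
    try:
        # gethostbyaddr returns (hostname, alias_list, ip_address_list).
        hostname, _aliases, _addresses = resolver(ip)
    except (socket.herror, socket.gaierror):
        return None
    return hostname
```

`reverse_lookup("66.122.16.207")` performs a live query, so what it returns depends on how the address is assigned at the time you run it and may no longer match the name quoted above.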

It's probably someone's pet project or similar; I severely doubt it's a "professional" crawler, as it didn't follow etiquette *and* didn't try very hard to disguise itself...

A site search and even a Google search returned zilch, which suggests that it's a fairly new thing. Otherwise others would have seen it, and/or it would appear in some of the public site logs that Google spiders, unless it was ultra niche.

Sorry I can't offer much more help than that.

- Tony

bobmark

9:17 pm on Feb 15, 2003 (gmt 0)

10+ Year Member



Thanks, Tony,
I banned it just now as it took a ridiculous number of pages in a short time and repetitively followed every link there was to follow on page after page.
Mark

wilderness

11:02 pm on Feb 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



bobmark,
I've had the following PacBell ranges denied since Feb. of 2002.
For me to go to this extent I must have been one upset cookie :-(
These intrusions were the result of CGI form access attempts, however, mostly from Microsoft URL Control.

deny from 206.13.
deny from 206.170.
deny from 206.171.
deny from 63.192.
deny from 63.194.
deny from 63.195.
deny from 63.196.
deny from 63.197.
deny from 63.198.
deny from 63.199.
deny from 63.200.
deny from 63.201.
deny from 63.202.
deny from 63.203.
deny from 63.204.
deny from 63.205.
deny from 63.206.
deny from 63.207.
deny from 64.160.
deny from 64.161.
deny from 64.162.
deny from 64.163.
deny from 64.164.
deny from 64.165.
deny from 64.166.
deny from 64.167.
deny from 64.168.
deny from 64.169.
deny from 64.170.
deny from 64.171.
deny from 64.172.
deny from 64.173.
deny from 64.174.
deny from 64.175.
deny from 66.120.
deny from 66.121.
deny from 66.122.
deny from 66.123.
deny from 66.124.
deny from 66.125.
deny from 66.126.
deny from 66.127.
deny from 66.130.
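A list like this can be shortened by merging adjacent prefixes into CIDR blocks, which `deny from` also accepts. The sketch below uses Python's standard `ipaddress.collapse_addresses`; the helper name `collapse_prefixes` is invented for illustration, and it assumes every entry is a two-octet prefix in the `a.b.` form used above.

```python
import ipaddress


def collapse_prefixes(prefixes):
    """Merge two-octet 'a.b.' prefixes into the smallest equivalent CIDR list."""
    networks = [ipaddress.ip_network(p.rstrip(".") + ".0.0/16") for p in prefixes]
    return [str(net) for net in ipaddress.collapse_addresses(networks)]


# The same prefixes as the deny list above; note that 63.193. is not in it.
pacbell = ["206.13.", "206.170.", "206.171.", "63.192."]
pacbell += ["63.%d." % n for n in range(194, 208)]
pacbell += ["64.%d." % n for n in range(160, 176)]
pacbell += ["66.%d." % n for n in range(120, 128)]
pacbell += ["66.130."]

for block in collapse_prefixes(pacbell):
    print("deny from " + block)
```

This reduces the 43 lines to nine without blocking anything that was not already blocked. Because 63.193. is absent from the original list, the 63.x entries do not merge into a single block; widening them to cover it would be a separate decision.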

Since I now have URL Control covered in my SetEnv, I should be able to go back and reduce these deny ranges.
Many thanks
Don

wilderness

11:25 pm on Feb 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



bobmark,
I sent three reports to Pac Bell. The result from Pac Bell was three automated responses on 1/05, 2/15 and 2/22.
There was also a manual response on 1/17/02 informing me that faked CGI attempts were a violation of the Pac Bell TOS (which I had included a link to in my original inquiry). Pac Bell was kind enough (TIC) to provide the same URL right back, which showed me both how seriously Pac Bell took my mail inquiries and how thorough a job they did of reading them.

The mail did say "I will investigate your complaint and take appropriate action."
I just wasn't advised of the action :(

Don

bobmark

3:25 pm on Feb 16, 2003 (gmt 0)

10+ Year Member



Thanks, wilderness,
I shot an email off to PacBell too, not that I expect much response. I would think the only hope is if it's a "home account" that someone is using for 24/7 automated crawling, which might irritate them, at least to the degree that the person is not paying their corporate rate.
The worst part of this thing from my point of view is the following of every link on every page - I have translation into 6 languages on my pages, which means not only did it GET each page, it translated each page 6 times, all as fast as the server could feed it.

JuniorHarris

7:21 pm on Feb 23, 2003 (gmt 0)

10+ Year Member



I have a number of PacBell ranges blocked as well... too many to go find and list right now. But I did want to share the fact that I have been blocking a number of these now. I think some are using permanent addresses, so if you can stop them once, you'll get them for good.

Initially I would run historical reports to check for valid traffic from the offending addresses, but now all I need to know is that it's PacBell. Ma Bell would not be proud!

wilderness

8:57 pm on Feb 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



jrharris,
IMO all of the Bell affiliates should be embarrassed. SBIC is no different and neither is SWBell.

Their greed has but one goal: bandwidth with no responsibility :(