Block and Complain success

Sometimes they do clean up their act

         

jdMorgan

5:45 am on May 20, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well,

It seems that some companies DO pay attention if you block and complain...

-----
Saw this:

64.210.196.195 - - [18/May/2002:18:41:43 -0400] "GET /robots.txt HTTP/1.0" 200 855 "-" "-"
64.210.196.195 - - [18/May/2002:18:41:43 -0400] "GET / HTTP/1.0" 200 58768 "-" "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT)"
64.210.196.195 - - [18/May/2002:18:41:49 -0400] "GET /images/l_quuxco.gif HTTP/1.0" 200 3615 "http://www.quux-co.com/" "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT)"
64.210.196.195 - - [18/May/2002:18:41:49 -0400] "GET /images/bullet.gif HTTP/1.0" 200 121 "http://www.quux-co.com/" "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT)"
64.210.196.195 - - [18/May/2002:18:41:49 -0400] "GET /images/map_quux.gif HTTP/1.0" 200 6632 "http://www.quux-co.com/" "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT)"
64.210.196.195 - - [18/May/2002:18:41:49 -0400] "GET /images/l_foo.gif HTTP/1.0" 200 4615 "http://www.quux-co.com/" "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT)"
64.210.196.195 - - [18/May/2002:18:41:49 -0400] "GET /images/l_quux_s.gif HTTP/1.0" 200 1016 "http://www.quux-co.com/" "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT)"
64.210.196.195 - - [18/May/2002:18:41:50 -0400] "GET /images/map_foo.gif HTTP/1.0" 200 18930 "http://www.quux-co.com/" "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT)"
64.210.196.195 - - [18/May/2002:18:41:50 -0400] "GET /images/usequux.jpg HTTP/1.0" 200 6204 "http://www.quux-co.com/" "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT)"
64.210.196.195 - - [18/May/2002:18:41:50 -0400] "GET /images/l_foo_s.jpg HTTP/1.0" 200 4043 "http://www.quux-co.com/" "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT)"
64.210.196.195 - - [18/May/2002:18:41:50 -0400] "GET /images/get_ns.gif HTTP/1.0" 200 971 "http://www.quux-co.com/" "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT)"
64.210.196.195 - - [18/May/2002:18:41:50 -0400] "GET /images/get_ie.gif HTTP/1.0" 200 970 "http://www.quux-co.com/" "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT)"

-----
Sent this:
Your robot appears to be spidering our site using a generic User-agent string which contains no contact information. This is a violation of web etiquette, and our policy is to ban such robots. Until such time as your robot properly identifies itself, your IP block is forbidden to access our site. - Sorry, please follow the rules.

J.D. Morgan

-----
Received this:
From Yuval Y Sun May 19 07:15:59 2002

Dear JD Morgan,

Thank you for your feedback.

We have added contact info to the User-Agent string sent by our robot.

Sincerely
Yuval Y

-----

If they've actually followed through, I'll have to give them extra credit for responsiveness.

So, now I'm waiting to see if they come back with a valid UA and contact info... Anybody been spidered by this IP block since 19 May, 2002 07:16:19 -0700 (PDT) ?
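For anyone who wants to check their own logs for this, here is a minimal Python sketch. It assumes Apache combined log format as in the excerpt above. The `/24` network used in the usage note is only a guess, since the real size of the IP block is not stated in this thread, and the names `parse_line` and `hits_since` are made up for illustration.

```python
import ipaddress
import re
from datetime import datetime, timedelta, timezone

# Combined log format: ip ident user [time] "request" status size "referer" "user-agent"
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<ua>[^"]*)"'
)

# Cutoff taken from the post: 19 May 2002 07:16:19 -0700
SINCE = datetime(2002, 5, 19, 7, 16, 19, tzinfo=timezone(timedelta(hours=-7)))


def parse_line(line):
    """Return a dict of log fields plus a timezone-aware 'time', or None if unparseable."""
    match = LOG_RE.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["time"] = datetime.strptime(record["ts"], "%d/%b/%Y:%H:%M:%S %z")
    return record


def hits_since(lines, network, since=SINCE):
    """Yield parsed records from `network` (CIDR string) at or after `since`."""
    net = ipaddress.ip_network(network)
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue
        try:
            addr = ipaddress.ip_address(record["ip"])
        except ValueError:
            # Host field is a resolved hostname rather than an IP; skip it.
            continue
        if addr in net and record["time"] >= since:
            yield record
```

Calling `hits_since(open("access.log"), "64.210.196.0/24")` and printing each record's `ua` field would show whether the robot has come back with a new User-Agent; the log file name is a placeholder.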

Thanks,

Jim

wilderness

12:09 am on May 21, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



64.210.196.195 - - [10/May/2002:01:38:03 -0700] "GET /robots.txt HTTP/1.0" 403 - "-" "-"
64.210.196.195 - - [10/May/2002:01:38:04 -0700] "GET / HTTP/1.0" 403 - "-" "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT)"

This visitor was denied access as a result of ("-" "-") on the first line.

Denied on the second line for a Global Crossing block which I've had repeated problems with.

They cannot have very good intentions if they first attempt visits minus a referrer and UA, and then later without any contact information. :-(
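The two checks described here (a blank UA, and a UA with no contact info) can be sketched in Python roughly as follows. This is only a heuristic under stated assumptions: "contact info" is taken to mean a URL or an e-mail address somewhere in the UA string, and the function names are hypothetical. A real browser also sends no contact info, so the second function only flags IPs that asked for /robots.txt, which ordinary browsers do not normally do.

```python
import re

# Assumption: a URL or an e-mail address in the UA counts as contact information.
CONTACT_RE = re.compile(
    r"(https?://\S+|www\.\S+|[\w.+-]+@[\w-]+(\.[\w-]+)+)", re.IGNORECASE
)


def classify_user_agent(ua):
    """Return 'blank', 'has_contact', or 'no_contact' for a User-Agent string."""
    ua = ua.strip()
    if ua in ("", "-"):
        return "blank"
    if CONTACT_RE.search(ua):
        return "has_contact"
    return "no_contact"


def suspect_robots(records):
    """Given (ip, path, ua) tuples, return IPs that fetched /robots.txt
    but never sent a User-Agent containing contact information."""
    fetched_robots = set()
    gave_contact = set()
    for ip, path, ua in records:
        if path == "/robots.txt":
            fetched_robots.add(ip)
        if classify_user_agent(ua) == "has_contact":
            gave_contact.add(ip)
    return fetched_robots - gave_contact
```

Fed the two requests from the log lines above, `suspect_robots` would return the 64.210.196.195 address, since it asked for robots.txt with a blank UA and then posed as MSIE 5.0.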

jdMorgan

7:54 pm on May 22, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



wilderness,

These guys are a "service" similar to, but a bit more useful than, webcollage.

They show a thumbnail of web pages matching a search, similar to the Windoze thumbnail view of a folder.

As such, I'll allow them as long as I get contact info in the UA. They won't do me all that much good vis-a-vis visitors, but my personal policy is to only ban e-mail harvesters, true bandwidth hogs and server-overloaders, and those who do not honor robots.txt - or worse yet, look at it and then try to load only the disallowed files!
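As a rough illustration of that last case (a robot that reads robots.txt and then requests disallowed files anyway), here is a minimal Python sketch using the standard library's `urllib.robotparser`. The function name, the `(ip, path)` input format, and the choice to match rules for the `*` agent are assumptions made for this example, not anything described in the thread.

```python
from urllib.robotparser import RobotFileParser


def disallowed_hits(robots_txt, requests, agent="*"):
    """Return {ip: [paths]} for clients that fetched /robots.txt and
    afterwards requested paths that the file disallows for `agent`.

    `requests` is an iterable of (ip, path) tuples in log order.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    seen_robots = set()   # IPs that have already read robots.txt
    offenders = {}
    for ip, path in requests:
        if path == "/robots.txt":
            seen_robots.add(ip)
            continue
        if ip in seen_robots and not parser.can_fetch(agent, path):
            offenders.setdefault(ip, []).append(path)
    return offenders
```

Clients that never requested robots.txt are deliberately left out of the result, since they cannot be said to have "looked at it"; they would need a separate check.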

Judging by their very-quick response, I suspect they really didn't know, or didn't realize what a backlash the empty or generic UA might cause. I was really surprised to hear back from them, and I'm watching my logs to see if they followed through with valid UA contact info. If not, I'll post here again.

Have you posted in a thread here with details about your global crossing ban? If not, please expand on your comment!

Thanks,
Jim

wilderness

3:27 am on May 24, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



<snip>Have you posted in a thread here with details about your global crossing ban? If not, please expand on your comment!>

Hey Jim,
Been out of town for three days. Need a laptop with a digital phone and a bag of money :-)

I've explained portions of this previously. Sorry I can't tell you under what thread.

It took me a year to get a requested listing for my main website (which is about a specific breed of horses.) When the About listing finally arrived it was framed. The About horse site is primarily another breed although it has a few short pages of the breed of my interest.
As a result of the frame I asked About to remove my listing immediately. It was somewhat done. After a time (late September of last year) I just denied the About bot access. That bot automatically attempted most every known file extension to my main page and continues to do so some 8 months later more than 5-6 times weekly.
Almost within moments of my first About deny visit, I was besieged with visits from Global Crossing, Thunderstone and a few others which prompted my chasing down some extensive backbone searches. In the end I stopped at Yahoo.
Additionally, I have some larbin bots which come from Global IPs also.

To help folks understand my over-bearing nature?
Since my participation began on the web in 1996, my activities have remained concentrated primarily on this horse breed, with every other activity aimed at enhancing that primary one.
I started with participation in an email subscription list about the breed. Accumulated an extensive IP list from emails. Created three mail lists of my own. Added two websites. All with the primary focus of the breed. This allows me to compare multiple data sources when analyzing a website visitor's activity, and in the process decide whether the visitor is legitimate or malicious (mail grabber, commercial research, a website builder with copy intent).

I'm NOT proud to say that I likely have more IP blocks denied access than most any website. However, it is necessary, as I'm granted use of copyright materials which I agreed to protect. In addition, the majority of the horse people have very little spare time, and they become discouraged easily by some of the maliciousness (spam and such) of the internet because they are not internet or computer based people. As a result I protect a very, very narrow market.

In most instances I don't take advantage of other websites and I expect the same in return. I demand it and enforce it.

I've attempted to explain this somewhat vaguely, as I'm not sure this restriction of mine should be implemented across the board by other webmasters. Nor do I wish to encourage people to follow the direction I have chosen.
The end analysis amounts to what I have determined to be the best approach for my websites and the narrow market I serve.
In the end, most everybody should determine the plan and direction of their own website(s).

If you should have any further questions concerning this,
I'll be glad to answer them privately.

jdMorgan

5:31 am on May 24, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



wilderness,

Thanks for the reply. I've also been mostly-offline, as some recent weather knocked my satellite link down, and I'm using a SLOOOW analog modem until I get the dish readjusted. 24kbps does serve as a good reminder to keep my pages slim and trim, though. :)

Like EliteWeb's, our site got close to its bandwidth limit recently due to some rather rude activity by some site-leeches, so the goal of blocking them, and doing it efficiently, became fairly critical. I didn't want to have to shut the site down at the end of the month to avoid going over our bandwidth limit, like EW was talking about doing!

You seem to have a "rep" as a "big-time blocker", and as such, I feel I've just had some rather expert advice... So Thanks!

Jim