
Forum Moderators: Ocean10000 & incrediBILL


Lightspeed Systems

sneaky creepy greedy crawler

   
11:20 am on Aug 7, 2008 (gmt 0)

10+ Year Member



Came by for a visit tonight and immediately got its feet stuck in my bot trap. Does not check for robots.txt, nor does it call itself a bot or crawler.

69.84.207.yyy - - [07/Aug/2008:05:18:51 -0400] "GET / HTTP/1.1" 200 3020 "-" "Mozilla/4.0 (compatible; MSIE 7.0;Windows NT 5.1;.NET CLR 1.1.4322;.NET CLR 2.0.50727;.NET CLR 3.0.04506.30)"
69.84.207.yyy - - [07/Aug/2008:05:18:52 -0400] "GET /blackhole HTTP/1.1" 301 260 "-" "Mozilla/4.0 (compatible; MSIE 7.0;Windows NT 5.1;.NET CLR 1.1.4322;.NET CLR 2.0.50727;.NET CLR 3.0.04506.30)"

However, when you visit the IP address, the page there states that it is a crawler performing a very important function: downloading your entire web site... without your permission, of course.

"Because of this job we have to download and evaulate the content of every website on the Internet that children can reach. To keep an accurate database, we download and evaluate each website several times a year. We try to download web content without overly burdening any given web server.

This is not a hacking site, or a denial of service attack, or anything of that sort."

No, of course it isn't. Just a rude walk-through of my web site, and then take whatever you can get your grubby hands on. (Webmasters love that kind of stuff.)

Bot-trapped, banned, and kicked to the curb.
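
For anyone who wants to spot this pattern in their own logs, here is a minimal sketch of pulling trap hits like the two lines above out of an access log. It assumes Apache combined log format and a trap URL of /blackhole, as in the excerpt. The function name find_trapped and the regex are illustrative only, not the actual trap used here.

```python
import re

# Combined Log Format:
# host ident user [time] "METHOD path PROTO" status size "referer" "user-agent"
LOG_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

TRAP_PATH = "/blackhole"  # trap URL as seen in the log excerpt


def find_trapped(lines, trap_path=TRAP_PATH):
    """Map each host that hit the trap path to whether it ever fetched /robots.txt."""
    fetched_robots = set()
    trapped = []
    for line in lines:
        m = LOG_RE.match(line.strip())
        if not m:
            continue  # skip lines that are not in combined log format
        host, path = m.group("host"), m.group("path")
        if path == "/robots.txt":
            fetched_robots.add(host)
        elif path == trap_path and host not in trapped:
            trapped.append(host)
    return {host: host in fetched_robots for host in trapped}
```

Feeding it the lines of an access log returns a dict keyed by host. A value of False means the client walked into the trap without ever asking for robots.txt, which is the behavior described above.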

4:52 pm on Aug 7, 2008 (gmt 0)

wilderness (WebmasterWorld Senior Member, 10+ Year Member)



They've been around a while.

2005:
66.17.15.yyy - - [28/Mar/2005:20:02:03 -0800] "GET /MyFolder/MyPage.html HTTP/1.1" 206 10097 "-" "Schmozilla/v9.14 Platinum"

2006:

66.17.15.zzz - - [06/Aug/2005:13:37:36 -0700] "GET / HTTP/1.1" 403 - "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50215)"

12:06 am on Aug 10, 2008 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Bot-trapped, banned, and kicked to the curb

When Lightspeed first attracted my attention a couple of years ago I did the same as you and looked them up - and found that a satirical YouTube spoof already existed (and it is still there today).

After giving it some thought I reckoned that they probably analyze sites automatically by searching for "trigger" keywords etc., and that no human check is likely to be involved.

So I don't block them by IP; instead, I serve them a low-bandwidth "robots policy" file.

I don't really know if this works, but I now do it for all known content filters.
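
A minimal sketch of that idea, assuming a Python WSGI site: requests from listed content-filter networks get a tiny plain-text policy page instead of the real content. The network list below is a documentation-only placeholder range, and the names (policy_middleware, POLICY_BODY, is_content_filter) are hypothetical. It is not a description of any particular production setup.

```python
import ipaddress

# Hypothetical placeholder (RFC 5737 documentation range), not a real filter's network.
CONTENT_FILTER_NETS = [ipaddress.ip_network("192.0.2.0/24")]

# Small plain-text body served instead of the full page.
POLICY_BODY = b"Automated access to this site is governed by /robots.txt.\n"


def is_content_filter(remote_addr, nets=CONTENT_FILTER_NETS):
    """Return True if remote_addr parses as an IP inside one of the listed networks."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        return False  # missing or malformed address: treat as a normal visitor
    return any(addr in net for net in nets)


def policy_middleware(inner_app, nets=CONTENT_FILTER_NETS):
    """Wrap a WSGI app so listed networks receive only the policy text."""
    def wrapped(environ, start_response):
        if is_content_filter(environ.get("REMOTE_ADDR", ""), nets):
            start_response("200 OK", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(POLICY_BODY))),
            ])
            return [POLICY_BODY]
        return inner_app(environ, start_response)
    return wrapped
```

The same effect could be had with server rewrite rules. The point is only that the filter's crawler gets a few dozen bytes rather than the whole site.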

...

9:31 am on Aug 11, 2008 (gmt 0)

10+ Year Member



When any crawler, particularly one that is charging people for the privilege of looking at my widgets, comes tromping in like a bull in a china shop posing as a visitor and not a bot, and immediately gets nailed in a bot trap, I'm going to send them packing, no matter what noble cause they proclaim they're serving.

This crawler fits that definition perfectly.