Who is Whizbang! Lab?

         

Gabriel

8:47 pm on Feb 26, 2000 (gmt 0)

10+ Year Member



My site was visited quite heavily by a spider I had not seen before, called WhizBang! Lab. Does anyone know whether that is a SE, and if so who?

Also, is there a fairly comprehensive list somewhere with the names of the spiders each SE uses?

Thanks,

Gabriel

fantomaster

3:30 am on Feb 27, 2000 (gmt 0)

10+ Year Member



Whilst the UserAgent "Whizbang! Lab" is not in our
spiderSpy botbase, it might be helpful if you quoted
domain/IP as well.

As for spider ips, you can check Brett's extensive list here:
http://www.searchengineworld.com/spiders/spider_ips.htm

Gabriel

11:36 pm on Feb 28, 2000 (gmt 0)

10+ Year Member



The user agent was WhizBang! Lab, with the IP 216.250.143.102, and visited my site on 2/25/00 at 2:53 PM and hit 355 pages.

fantomaster

4:08 am on Feb 29, 2000 (gmt 0)

10+ Year Member



The UserAgent belongs to software developed by WhizBang! Labs, Inc.

Their site is at: [whizbang.com...]

From their "About Us" file on site:

WhizBang! Labs has developed software that builds application-specific databases by automatically finding and extracting user-defined content from an unlimited number of Web pages located anywhere on the internet. The company's proprietary software:
Crawls the Web, searching for and identifying new domains
Classifies pages in each domain, identifying those that contain the user defined target data
Captures the target data, extracting it from the pages it has found and classified, whether that target data is embedded in the text or stored behind forms
Compiles the extracted data, storing it in a relational database where it can then be searched, sorted, filtered, and otherwise manipulated with traditional RDBMS tools, either directly or through a public or private portal

Seems they're selling their software.

The IP will not resolve. However, it is from an entirely different range than that of
www.whizbang.com (216.160.248.170).

As this spider does not service any search engine, we categorize it "DC" (decloaking hazard).
This means that if you are cloaking your site you should not feed it with cloaked or phantom
pages.

Hope this helps.
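
For anyone who wants to repeat the range comparison above, here is a minimal Python sketch. The function names `same_network` and `reverse_dns` and the choice of prefix lengths are illustrative assumptions, not part of any tool mentioned in this thread; the IP addresses are the ones quoted in the posts.

```python
import ipaddress
import socket


def same_network(ip_a, ip_b, prefix=24):
    """Return True if ip_b falls inside the /prefix network that contains ip_a."""
    # strict=False lets us pass a host address and have the host bits masked off.
    network = ipaddress.ip_network(f"{ip_a}/{prefix}", strict=False)
    return ipaddress.ip_address(ip_b) in network


def reverse_dns(ip):
    """Return the PTR hostname for ip, or None if the address does not resolve."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # socket.herror and socket.gaierror are subclasses of OSError
        return None


spider_ip = "216.250.143.102"   # spider IP quoted earlier in the thread
site_ip = "216.160.248.170"     # www.whizbang.com as quoted above

# The two addresses do not even share a /16, let alone a /24.
print(same_network(spider_ip, site_ip, prefix=24))
print(same_network(spider_ip, site_ip, prefix=16))
```

`reverse_dns` needs network access and is left uncalled here; whatever it returns today says nothing about how these addresses resolved in 2000.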

Gabriel

5:46 am on Feb 29, 2000 (gmt 0)

10+ Year Member



Thanks for the info, fantomaster. I suppose it was crawled by someone who purchased the software.

Did you find out that information by simply trying the whizbang.com web site or is there a more sophisticated way (other than whois)?

Thanks,

Gabriel

fantomaster

5:52 am on Feb 29, 2000 (gmt 0)

10+ Year Member



It's a mix of tools, various data sources and expertise: Whois, NSLookup, our own extensive databases (literally growing by the hour), log file research all day and all night, lots of proprietary programs, spider traps, experience - it all adds up, I guess. If they creep upon you like Ninjas, you gotta grill 'em like Ninjas ...

Brett_Tabke

3:20 am on Mar 1, 2000 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Ditto on thanks for the info - I'd seen that spider before and never investigated. I get so many rogue spiders walking all over sites anymore that I rarely give them a second glance - mostly they just tick me off by flooding my logs with basically bogus entries.

fantomaster

4:52 am on Mar 13, 2000 (gmt 0)

10+ Year Member



Not to panic, but there's some disconcerting news re WhizBang:
A client of ours got spidered by both Infoseek and, some 10 hours later, by WhizBang. Here's a log excerpt:

ROBOT 2000-03-11, 20:56 -- 204.162.96.124 -- 204.162.96.124 -- /dir2/page-2.html -- -- InfoSeek Sidewinder/0.9
ROBOT 2000-03-11, 20:56 -- 204.162.96.124 -- 204.162.96.124 -- /dir2/page-2a.html -- -- InfoSeek Sidewinder/0.9
ROBOT 2000-03-11, 20:56 -- 204.162.96.124 -- 204.162.96.124 -- /dir2/page-2b.html -- -- InfoSeek Sidewinder/0.9
ROBOT 2000-03-11, 20:56 -- 204.162.96.124 -- 204.162.96.124 -- /dir2/page-2c.html -- -- InfoSeek Sidewinder/0.9
ROBOT 2000-03-11, 20:56 -- 204.162.96.124 -- 204.162.96.124 -- /dir2/page-2d.html -- -- InfoSeek Sidewinder/0.9
ROBOT 2000-03-11, 20:57 -- 204.162.96.124 -- 204.162.96.124 -- /dir2/page-2e.html -- -- InfoSeek Sidewinder/0.9
USER 2000-03-12, 06:55 -- 216.250.143.106 -- 216.250.143.106 -- /dir2/page-2a.html -- -- WhizBang! Lab
USER 2000-03-12, 06:56 -- 216.250.143.106 -- 216.250.143.106 -- /dir2/page-2b.html -- -- WhizBang! Lab
USER 2000-03-12, 06:56 -- 216.250.143.106 -- 216.250.143.106 -- /dir2/page-2e.html -- -- WhizBang! Lab
USER 2000-03-12, 06:56 -- 216.250.143.106 -- 216.250.143.106 -- /dir2/page-2ba.html -- -- WhizBang! Lab
USER 2000-03-12, 06:56 -- 216.250.143.106 -- 216.250.143.106 -- /dir2/page-2.html -- -- WhizBang! Lab
USER 2000-03-12, 06:56 -- 216.250.143.106 -- 216.250.143.106 -- /dir2/page-2b.html -- -- WhizBang! Lab
USER 2000-03-12, 06:56 -- 216.250.143.106 -- 216.250.143.106 -- /dir2/page-2c.html -- -- WhizBang! Lab
USER 2000-03-12, 06:57 -- 216.250.143.106 -- 216.250.143.106 -- /dir2/page-2d.html -- -- WhizBang! Lab

And here's what he has to say about it:

"From its quick following of Infoseek Sidewinder (..) it sure looks like a checking bot. It is unlikely that WhizBang could have found these pages any other way. These are only days old, and have never been visited by anyone else. They were submitted by e-mail to Infoseek. The only link to these pages is on a hallway page (which, because of Infoseek's recent policy of spidering only the home page, has a link on the home page), which also was only submitted to Infoseek. (Had I known that IS would spider from e-mail so quickly, I wouldn't have bothered with the home page - hallway link.) It does not appear that WhizBang visited either the home page (index.htm) or the hallway page (which it could have only gotten from Infoseek anyway if it never hit the home page). I will reexamine the logs going back a day or two & see if I missed a WhizBang visit at an earlier date where it might have picked up this link. At the moment, though, it looks VERY suspicious. Good thing Infoseek never updates their index - I think this site is screwed."

I'm inclined to agree that it looks rather fishy - anyone got similar logs to check out this issue?
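
For anyone checking their own logs for the same pattern, here is a rough Python sketch. It assumes one entry per line, fields separated by `--` in the order shown in the excerpt above (record type and timestamp, two address fields, path, an empty field, user agent), and entries in chronological order. The names `parse_line` and `followed_pages` are made up for illustration, and the sample is a subset of the excerpt.

```python
from datetime import datetime

# Subset of the log excerpt above, one entry per line.
SAMPLE = """\
ROBOT 2000-03-11, 20:56 -- 204.162.96.124 -- 204.162.96.124 -- /dir2/page-2.html -- -- InfoSeek Sidewinder/0.9
ROBOT 2000-03-11, 20:56 -- 204.162.96.124 -- 204.162.96.124 -- /dir2/page-2a.html -- -- InfoSeek Sidewinder/0.9
ROBOT 2000-03-11, 20:57 -- 204.162.96.124 -- 204.162.96.124 -- /dir2/page-2e.html -- -- InfoSeek Sidewinder/0.9
USER 2000-03-12, 06:55 -- 216.250.143.106 -- 216.250.143.106 -- /dir2/page-2a.html -- -- WhizBang! Lab
USER 2000-03-12, 06:56 -- 216.250.143.106 -- 216.250.143.106 -- /dir2/page-2ba.html -- -- WhizBang! Lab
USER 2000-03-12, 06:56 -- 216.250.143.106 -- 216.250.143.106 -- /dir2/page-2.html -- -- WhizBang! Lab
"""


def parse_line(line):
    """Split one log entry into a dict (field order is an assumption)."""
    fields = [f.strip() for f in line.split("--")]
    kind, stamp = fields[0].split(" ", 1)
    return {
        "kind": kind,
        "time": datetime.strptime(stamp, "%Y-%m-%d, %H:%M"),
        "ip": fields[2],
        "path": fields[3],
        "agent": fields[5],
    }


def followed_pages(entries, leader, follower):
    """Map each path the leader fetched to the delay before the follower's first fetch."""
    first_seen = {}
    for e in entries:
        if leader in e["agent"]:
            first_seen.setdefault(e["path"], e["time"])
    gaps = {}
    for e in entries:
        if follower in e["agent"] and e["path"] in first_seen:
            gap = e["time"] - first_seen[e["path"]]
            if gap.total_seconds() >= 0 and e["path"] not in gaps:
                gaps[e["path"]] = gap
    return gaps


entries = [parse_line(l) for l in SAMPLE.splitlines() if l.strip()]
for path, gap in followed_pages(entries, "InfoSeek", "WhizBang").items():
    print(path, gap)
```

A page requested by the follower but never by the leader (such as /dir2/page-2ba.html in the sample) is left out of the result. An overlap only shows that both agents requested the same URLs; it does not show where the second one got them.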

Fishcatcher

6:04 am on Apr 20, 2000 (gmt 0)



Was reading about the suspicious nature of Whizbang following Infoseek and the posts stopped afterward. Anyone followed up on the correlation if any? Maybe it just follows somehow but isn't relaying any info back to INFO....?

Brett_Tabke

10:21 am on Apr 20, 2000 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



As Ralph can tell you, I've just been going through a stack of back logs for spiders (I keep spamming him with new finds). I can find zero hits from Whizbang (8 million log hits).

theperlyking

11:04 pm on May 14, 2001 (gmt 0)

10+ Year Member



Wow, bringing back an old thread here. Just had a visit from the WhizBang! Lab UA, with a very strange but specific crawling pattern - it crawled the "lite" version of one of my sites and ignored the standard version. Looked back in my logs and it doesn't seem to be hot on the heels of any other SE (i.e. Infoseek, as mentioned above).

Mike Mackin in a similar thread noted the spider is used by Flipdog.com, who claim to collate job information (see [flipdog.com...] ), though this doesn't seem a terribly efficient way to find jobs!

Brett_Tabke

11:20 pm on May 14, 2001 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



They build topic specific directories.

mivox

1:17 am on Dec 14, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Couldn't find any email contact info on their site to complain to... Anyone got a contact for ol' Whizbang? They ran through our site today requesting 3-6 pages at a time. Very obnoxious, and I suppose I'll just add them to the htaccess Deny list for now....

wilderness

2:15 am on Dec 14, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



<snip>Anyone got a contact for ol' Whizbang? >
[whizbanglabs.com...]

Whiz sells software which can be configured.
[whizbanglabs.com...]
Contacting them is a waste of your time and theirs unless you desire to purchase their software.
What you need to do is have a hit man eliminate the person using the software
or
just add the IP to your
htaccess ;-)
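
As a rough illustration of the htaccess route, the Python sketch below prints the kind of lines one might paste into an .htaccess file. It assumes the older Apache `Order`/`Allow`/`Deny` access directives; Apache 2.4 and later use `Require` rules instead, so check your server version. The helper name `deny_block` is hypothetical, and the addresses are the two quoted earlier in this thread.

```python
import ipaddress


def deny_block(ips):
    """Build Apache 2.2-style access-control lines that deny the given IPs."""
    lines = ["Order Allow,Deny", "Allow from all"]
    for ip in ips:
        ipaddress.ip_address(ip)  # raises ValueError on a malformed address
        lines.append(f"Deny from {ip}")
    return "\n".join(lines)


print(deny_block(["216.250.143.102", "216.250.143.106"]))
```

Since the software is sold to third parties, other users of it may crawl from entirely different addresses, so an IP-based block only covers the visitors already seen.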