
Search Engine Spider and User Agent Identification Forum

Another Weird Spider?

 7:45 pm on Oct 26, 2000 (gmt 0)

- - [25/Oct/2000:18:53:49 -0600] "GET /robots.txt HTTP/1.0" 200 649 "-" "Bjaaland/0.5 ODP-stats (bjaaland@antarcti.ca) libwww-perl/5.44"

Which SE does this spider belong to?
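
For anyone who wants to pull the user agent out of log lines like the one above, here is a minimal Python sketch. It assumes the Apache "combined" log format. The host 192.0.2.1 is a placeholder documentation address, since the excerpt in the post does not show the requesting host, and the names COMBINED and parse_line are only illustrative.

```python
import re

# Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return the log fields as a dict, or None if the line does not match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None


# 192.0.2.1 is a placeholder, not the spider's real address.
SAMPLE = (
    '192.0.2.1 - - [25/Oct/2000:18:53:49 -0600] "GET /robots.txt HTTP/1.0" '
    '200 649 "-" "Bjaaland/0.5 ODP-stats (bjaaland@antarcti.ca) libwww-perl/5.44"'
)

entry = parse_line(SAMPLE)
print(entry["agent"])
```

The "libwww-perl/5.44" token at the end of the agent string is what identifies the LWP library.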



 5:15 am on Oct 27, 2000 (gmt 0)

Here is the NS lookup
Name: lb1.antarcti.ca
We were just discussing the graphics on their web site [antarcti.ca].

Bjaaland/0.5 ODP-stats (bjaaland@antarcti.ca) libwww-perl/5.44

It is an LWP Perl bot. But does anyone know whether this is the DMOZ link verification spider, or one of them?


 5:48 am on Oct 27, 2000 (gmt 0)

Absolutely right, littleman. [info.webcrawler.com]


 6:34 am on Oct 27, 2000 (gmt 0)

Thanks Bartek,
I thought it was, but what made me do a double take was the place it is coming out of.


 11:30 am on Oct 27, 2000 (gmt 0)

Has anyone else been visited by Robozilla/1.0?

It may be another DMOZ link verification spider. Only my indexed page was visited and nothing else.

Server: h-206-222-248-44.netscape.com
Referrer: "http://directory.mozilla.org"
UA: Robozilla/1.0

also visited with the same referring URL and User-agent.


 12:14 pm on Mar 31, 2001 (gmt 0)

I just got hit with over 350 attempts at bogus pages and directories within minutes by this guy ( ). What a pain! I got 350+ emails telling me someone got 404 errors on my site. What a mess!


 2:47 am on May 11, 2001 (gmt 0)

Bjaaland/0.6 (bjaaland@antarcti.ca)

At first I thought it was checking links, because it fetched every document with a GET request and then every image on the site with a HEAD request.

But then I went to antarcti.ca, and went to the demo here:

Basically it is a visual ODP. Quite impressive I think. But you'll need to be using something faster than 56K unless you're really patient.

You can do it 2D or 3D. Really cool.


 7:35 pm on May 11, 2001 (gmt 0)

These guys hit my site pretty hard, and because of a configuration problem with my log program, I didn't see their robots.txt request until *after* I emailed them complaining loudly about their bad manners: how dare they not request robots.txt and send their spider barreling through directories it had no business in, etc., etc.

So I got an email back from a tech support guy, who (after I realized my error and apologized) took the time to test and re-test my robots.txt with me, until it was working properly. Their customer response and service for their internet spidering is absolutely impeccable.

Although I think the actual major thrust of the corporation is to develop search/indexing solutions for large corporate intranets...
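
If you want to re-test a robots.txt yourself, as described in this post, Python's standard urllib.robotparser can check rules offline. This is only a sketch: the robots.txt content, the /private/ path, and the example.com URLs are made up for illustration, and the parser's agent matching may not be identical to what Bjaaland itself does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep Bjaaland out of /private/, allow everyone else everywhere.
ROBOTS_TXT = """\
User-agent: Bjaaland
Disallow: /private/

User-agent: *
Disallow:
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(allowed("Bjaaland/0.6", "http://example.com/private/page.html"))   # False
print(allowed("Bjaaland/0.6", "http://example.com/index.html"))          # True
print(allowed("Robozilla/1.0", "http://example.com/private/page.html"))  # True
```

Checking both a blocked and an unblocked path for each agent is a quick way to catch a rule that is too broad or too narrow.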


 1:22 pm on May 14, 2001 (gmt 0)

Robozilla is the test spider for DMOZ. It goes round seeing if the site is a 404 or not.
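
For reference, a 404 check of this kind can be sketched in a few lines of Python. This is not Robozilla's actual code, just an assumed approach: send a HEAD request and report the status code. The function name check_link, the "ExampleLinkChecker/0.1" agent string, and the opener parameter (there so the function can be exercised without network access) are all hypothetical.

```python
import urllib.error
import urllib.request


def check_link(url, timeout=10.0, opener=urllib.request.urlopen):
    """Send a HEAD request and return the HTTP status code.

    Returns None if the server cannot be reached at all.
    """
    request = urllib.request.Request(
        url,
        method="HEAD",
        # Hypothetical agent string; a real checker should identify itself.
        headers={"User-Agent": "ExampleLinkChecker/0.1"},
    )
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses are raised as HTTPError; the code is what we want.
        return err.code
    except urllib.error.URLError:
        # DNS failure, refused connection, timeout, and similar.
        return None
```

Calling check_link on a listed URL and treating 404 as dead would be the simplest policy. Note that some servers do not answer HEAD the same way they answer GET, so a real checker might fall back to GET before flagging a site.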


 5:39 pm on May 14, 2001 (gmt 0)

The Bjaaland one is just a link verification spider for antarcti. Why they don't just download the RDF dump again is a mystery...


 6:05 pm on May 14, 2001 (gmt 0)

No, I don't think Bjaaland is *only* link verification. It spidered my *entire* site on its first run through.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved