Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

Any Body?
What is this dpcatalog1.go2net.com

 12:54 am on Mar 20, 2001 (gmt 0)

Hey LM.

Any ideas? Just grabbed the root page.. 9 different domains and both new submits.
Tracing route to dpcatalog1.go2net.com []

Spider or not?



 5:19 am on Mar 20, 2001 (gmt 0)

It's looking like a yes -


oh well, I am sure better ones will come along ;)


 6:39 am on Mar 20, 2001 (gmt 0)

I am very interested to see what Thunderstone's future plans are... they have a very interesting combo directory/index system: they spider, they will (eventually) place your site in the category their software finds most appropriate, but you can give them category and description suggestions for your listing, and they will re-place your category accordingly.

Considering the number of 'come-from-nowhere' indexes and directories the web's seen, I plan on keeping track of this one.


 6:49 am on Mar 20, 2001 (gmt 0)

Timely. I am in the middle of updating a search engine history file and I'm trying to figure out what all Go2net owns.

Metacrawler, Dogpile, Infospace, Others?

I don't think Thunderstone is owned by go2net?


 10:02 pm on Mar 20, 2001 (gmt 0)

From Thunderstone's 'about us' page:

"Thunderstone is an independent R&D company that has been providing high-performance state-of-the-art solutions to intelligent information retrieval and management problems for over 19 years. Our flagship product, Texis, is the most comprehensive text retrieval and publishing software available. In one package Texis provides every full-text, SQL, multimedia management, and dynamic publishing operation needed for an enterprise search application."

Doesn't sound like anyone owns them... strange, isn't it? ;)


 10:55 pm on Mar 20, 2001 (gmt 0)

What was the agent name, DrBill?


 1:12 am on Mar 21, 2001 (gmt 0)

Here's the relevant info:

From logfiles:

www.[edited].com [20/Jan/2001:19:38:16 -0500] "/robots.txt" 404 - "-" "Mozilla/2.0 (compatible; T-H-U-N-D-E-R-S-T-O-N-E)"

Then notice in the link below that Dogpile is using Texis, which is the technology described on the Thunderstone site.


So I guess texis comes with a spider and the dogpile catalog is using it, or t-stone is spidering for them.
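For anyone who wants to check their own logs for this agent, here is a rough Python sketch. It assumes every entry has the same layout as the one line quoted above (host, bracketed timestamp, quoted path, status, size, quoted referer, quoted agent); real server logs may well differ. The names parse_line and is_thunderstone are made up for illustration.

```python
import re

# Assumed layout, based only on the single log line quoted in this thread:
# host [timestamp] "path" status size "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \[(?P<time>[^\]]+)\] "(?P<path>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def is_thunderstone(agent):
    """True if the agent string spells 'thunderstone' once non-letters are removed."""
    # Dropping hyphens and other non-letters lets "T-H-U-N-D-E-R-S-T-O-N-E" match.
    letters = re.sub(r"[^a-z]", "", agent.lower())
    return "thunderstone" in letters


if __name__ == "__main__":
    sample = (
        'www.[edited].com [20/Jan/2001:19:38:16 -0500] "/robots.txt" 404 - "-" '
        '"Mozilla/2.0 (compatible; T-H-U-N-D-E-R-S-T-O-N-E)"'
    )
    entry = parse_line(sample)
    print(entry["path"], entry["status"], is_thunderstone(entry["agent"]))
```

Matching on the letters only is a guess at how to catch the hyphenated name; it would also flag any other agent that happens to contain the word.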


 10:17 am on Mar 21, 2001 (gmt 0)

That is wild.


 7:12 pm on Mar 21, 2001 (gmt 0)

I've seen 'texis' in other search sites' file paths... So I'd say Thunderstone is licensing out their indexing/search technology like Inktomi does. Don't know if they're running the spiders for their clients or not though.

But Thunderstone itself does maintain their own index/search site. You *can't submit your site to it*... They spider, and if they happen to find you (and like you), you're in. Their description also says they're concentrating on 'quality not quantity' with the material they index, and they seem to rely almost entirely on their spiders and software to manage it. Don't know how they managed to build a spider they could trust to judge 'quality'...


 6:11 pm on Apr 13, 2001 (gmt 0)

Speaking of Thunderstone, I have been trying to figure out how they find sites to spider.

I have seen them hit many brand new sites that have no other sites linking to them... And, it began immediately after I submitted to several other engines.

Any ideas? Can anyone see any patterns in their logs?
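One way to look for a pattern is to pull the first Thunderstone hit per domain out of the logs and compare it by hand with the dates you submitted to other engines. A minimal Python sketch follows, assuming the log layout of the single entry quoted earlier in this thread; the first_hits function and the example hostnames are hypothetical.

```python
import re
from datetime import datetime

# Assumed layout: host [timestamp] ... "user-agent" (agent is the last quoted field)
ENTRY = re.compile(r'^(?P<host>\S+) \[(?P<time>[^\]]+)\] .* "(?P<agent>[^"]*)"$')


def first_hits(lines, needle="thunderstone"):
    """Map each host to the earliest time an agent containing `needle` hit it."""
    first = {}
    for line in lines:
        match = ENTRY.match(line.strip())
        if not match:
            continue  # skip lines that do not follow the assumed layout
        # Compare on letters only so the hyphenated agent name still matches.
        letters = re.sub(r"[^a-z]", "", match["agent"].lower())
        if needle not in letters:
            continue
        when = datetime.strptime(match["time"], "%d/%b/%Y:%H:%M:%S %z")
        host = match["host"]
        if host not in first or when < first[host]:
            first[host] = when
    return first


if __name__ == "__main__":
    # Made-up entries in the same layout, for illustration only.
    logs = [
        'a.example [20/Jan/2001:19:38:20 -0500] "/" 200 1234 "-" '
        '"Mozilla/2.0 (compatible; T-H-U-N-D-E-R-S-T-O-N-E)"',
        'a.example [20/Jan/2001:19:38:16 -0500] "/robots.txt" 404 - "-" '
        '"Mozilla/2.0 (compatible; T-H-U-N-D-E-R-S-T-O-N-E)"',
        'b.example [21/Jan/2001:08:00:00 -0500] "/" 200 1234 "-" '
        '"Mozilla/4.0 (compatible; MSIE 5.5)"',
    ]
    for host, when in sorted(first_hits(logs).items()):
        print(host, when.isoformat())
```

This only shows when the spider first arrived at each domain; it does not say how the spider found the site, so any link to submission dates would still be a guess.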


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved