Home / Forums Index / Search Engines / Search Engine Spider and User Agent Identification
Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
Any Body?
What is this dpcatalog1.go2net.com
drbill




msg:399450
 12:54 am on Mar 20, 2001 (gmt 0)

Hey LM.

Any ideas? Just grabbed the root page.. 9 different domains and both new submits.
**********************
tracert 63.251.4.43
Tracing route to dpcatalog1.go2net.com [63.251.4.43]
********************************************

Spider or not?

 

Air




msg:399451
 5:19 am on Mar 20, 2001 (gmt 0)

It's looking like a yes -

T-H-U-N-D-E-R-S-T-O-N-E

oh well, I am sure better ones will come along ;)

mivox




msg:399452
 6:39 am on Mar 20, 2001 (gmt 0)

I am very interested to see what Thunderstone's future plans are... they have a very interesting combo directory/index system: they spider, they will (eventually) place your site in the category their software finds most appropriate, but you can give them category and description suggestions for your listing, and they will re-place your category accordingly.

Considering the number of 'come-from-nowhere' indexes and directories the web's seen, I plan on keeping track of this one.

Brett_Tabke




msg:399453
 6:49 am on Mar 20, 2001 (gmt 0)

Timely. I am in the middle of updating a search engine history file and I'm trying to figure out what all Go2net owns.

Metacrawler, Dogpile, Infospace, Others?

I don't think Thunderstone is owned by go2net?

mivox




msg:399454
 10:02 pm on Mar 20, 2001 (gmt 0)

From Thunderstone's 'about us' page:

"Thunderstone is an independent R&D company that has been providing high-performance state-of-the-art solutions to intelligent information retrieval and management problems for over 19 years. Our flagship product, Texis, is the most comprehensive text retrieval and publishing software available. In one package Texis provides every full-text, SQL, multimedia management, and dynamic publishing operation needed for an enterprise search application."

Doesn't sound like anyone owns them... strange, isn't it? ;)

Brett_Tabke




msg:399455
 10:55 pm on Mar 20, 2001 (gmt 0)

What was the Agent name DrBill?

Air




msg:399456
 1:12 am on Mar 21, 2001 (gmt 0)

Here's the relevant info:

From logfiles:

www.[edited].com 63.251.4.43 [20/Jan/2001:19:38:16 -0500] "/robots.txt" 404 - "-" "Mozilla/2.0 (compatible; T-H-U-N-D-E-R-S-T-O-N-E)"

Then notice in the link below that Dogpile is using Texis, which is the technology described on the Thunderstone site.

[dpcatalog.dogpile.com...]

So I guess texis comes with a spider and the dogpile catalog is using it, or t-stone is spidering for them.
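For anyone who wants to pull these hits out of their own logs, here is a rough Python sketch. It assumes every line has the same field layout as the one quoted above (virtual host, IP, bracketed timestamp, quoted request, status, size, quoted referer, quoted agent); that layout is inferred from a single line, and the names `parse_line` and `is_thunderstone` are made up for illustration.

```python
import re

# Field layout inferred from the one log line quoted in this thread.
# This is an assumption; adjust the pattern to your own server's format.
LOG_LINE = re.compile(
    r'(?P<vhost>\S+) (?P<ip>\S+) \[(?P<when>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def is_thunderstone(agent):
    """True if the agent string contains the hyphenated name seen above."""
    return "T-H-U-N-D-E-R-S-T-O-N-E" in agent.upper()


# The log line from the post above, split only to keep the source readable.
sample = ('www.[edited].com 63.251.4.43 [20/Jan/2001:19:38:16 -0500] '
          '"/robots.txt" 404 - "-" '
          '"Mozilla/2.0 (compatible; T-H-U-N-D-E-R-S-T-O-N-E)"')

hit = parse_line(sample)
print(hit["ip"], hit["request"], is_thunderstone(hit["agent"]))
```

Matching on the agent string alone is only a first pass, since any client can send that string; checking the IP against a reverse lookup is a sensible second step.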

Brett_Tabke




msg:399457
 10:17 am on Mar 21, 2001 (gmt 0)

That is wild.

mivox




msg:399458
 7:12 pm on Mar 21, 2001 (gmt 0)

I've seen 'texis' in other search sites' file paths... So I'd say Thunderstone is licensing out their indexing/search technology like Inktomi does. Don't know if they're running the spiders for their clients or not though.

But Thunderstone itself does maintain their own index/search site. You *can't submit your site to it*... They spider, and if they happen to find you (and like you), you're in. Their description also says they're concentrating on 'quality not quantity' with the material they index, and they seem to rely almost entirely on their spiders and software to manage it. Don't know how they managed to build a spider they could trust to judge 'quality'...

blue2




msg:399459
 6:11 pm on Apr 13, 2001 (gmt 0)

Speaking of Thunderstone, I have been trying to figure out how they find sites to spider.

I have seen them hit many brand new sites that have no other sites linking to them... And, it began immediately after I submitted to several other engines.

Any ideas? Can anyone see any patterns in their logs?
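One way to start looking for a pattern is to pull the first Thunderstone hit per site out of the logs and line those dates up against when each site was submitted elsewhere. The Python sketch below does only the first half. It assumes the log layout quoted earlier in this thread (virtual host, IP, bracketed timestamp, agent as the last quoted field); the function name `first_hits` is hypothetical, and the sample lines are invented for illustration, not real log data.

```python
import re
from datetime import datetime

AGENT_MARK = "T-H-U-N-D-E-R-S-T-O-N-E"

# Assumed layout: vhost, ip, [timestamp], ..., "agent" as the last field.
LINE = re.compile(
    r'(?P<vhost>\S+) (?P<ip>\S+) \[(?P<when>[^\]]+)\] .*"(?P<agent>[^"]*)"\s*$'
)


def first_hits(lines):
    """Map each virtual host to its earliest hit from the marked agent."""
    first = {}
    for line in lines:
        m = LINE.match(line)
        if not m or AGENT_MARK not in m.group("agent"):
            continue  # not a matching line, or a different agent
        when = datetime.strptime(m.group("when"), "%d/%b/%Y:%H:%M:%S %z")
        vhost = m.group("vhost")
        if vhost not in first or when < first[vhost]:
            first[vhost] = when
    return first


# Invented sample lines, for illustration only.
sample = [
    'site-a.example 63.251.4.43 [13/Apr/2001:10:00:00 -0500] "/" 200 512 '
    '"-" "Mozilla/2.0 (compatible; T-H-U-N-D-E-R-S-T-O-N-E)"',
    'site-a.example 63.251.4.43 [12/Apr/2001:09:00:00 -0500] "/" 200 512 '
    '"-" "Mozilla/2.0 (compatible; T-H-U-N-D-E-R-S-T-O-N-E)"',
    'site-b.example 10.0.0.1 [12/Apr/2001:09:30:00 -0500] "/" 200 512 '
    '"-" "SomeOtherBot/1.0"',
]

for vhost, when in sorted(first_hits(sample).items()):
    print(vhost, when.isoformat())
```

If the first hits cluster a fixed number of days after submitting to one particular engine, that would point to where they are getting their URLs from; the sketch by itself cannot show that.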


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved