
Search Engine Spider and User Agent Identification Forum

    
AllTheWeb
David

10+ Year Member



 
Msg#: 28 posted 2:57 am on Feb 23, 2000 (gmt 0)

Hi, is there any discussion on how AllTheWeb works, now that they have the largest index?

I wonder how they review submissions: by spider or by human review?

If by spider, does it have an IP?

Thanks for sharing!

 

Brett_Tabke

WebmasterWorld Administrator, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 28 posted 8:54 am on Mar 3, 2000 (gmt 0)

They have a full roaming spider just like most of the other SEs. Theirs is a pretty thorough spider that picks up almost everything. Submissions are not reviewed.

David

10+ Year Member



 
Msg#: 28 posted 10:53 am on Mar 3, 2000 (gmt 0)

Wow! I'll need to spend a lot of time tracking the spider.

Do you think gateway pages will work for this search engine?

Brett_Tabke

WebmasterWorld Administrator, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 28 posted 11:53 am on Mar 3, 2000 (gmt 0)

Yes, gateway pages can work, but since you can only submit your root URL, you need to make sure you "direct" the spider to the pages you want it to see, either by some CGI process or by including hidden links on your homepage.

Air_

10+ Year Member



 
Msg#: 28 posted 12:43 am on Mar 4, 2000 (gmt 0)

Here are some IPs for AllTheWeb (FAST):

User Agent: FAST (not the full user agent, but it will do for an inexact match)

209.67.247.15
209.67.244.78
209.67.244.79
209.67.247.153
209.67.247.156
209.67.247.201
209.67.247.207

As Brett said, they gobble up everything, so make sure you have links to all the pages you want them to grab.
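
For anyone who wants to use the list above in a log-processing script, here is a minimal sketch. The function name `looks_like_fast` and the choice to flag a request when either check matches are assumptions for illustration only. The IPs are the ones posted above (unverified and liable to change), and the user-agent test is the inexact, case-sensitive substring match on "FAST" described above.

```python
import ipaddress

# IPs exactly as posted in this thread; unverified and likely to change.
FAST_IPS = frozenset(
    ipaddress.ip_address(ip)
    for ip in (
        "209.67.247.15",
        "209.67.244.78",
        "209.67.244.79",
        "209.67.247.153",
        "209.67.247.156",
        "209.67.247.201",
        "209.67.247.207",
    )
)


def looks_like_fast(remote_ip: str, user_agent: str) -> bool:
    """Return True if the IP is in the posted list or the agent contains 'FAST'."""
    try:
        ip_match = ipaddress.ip_address(remote_ip) in FAST_IPS
    except ValueError:
        # Malformed address in the log line: treat as no IP match.
        ip_match = False
    # Inexact match: any user agent containing the substring "FAST".
    ua_match = "FAST" in user_agent
    return ip_match or ua_match
```

Calling `looks_like_fast("209.67.247.153", "Mozilla/4.0")` matches on the IP alone. Requiring both checks (`and` instead of `or`) would be stricter, but it would miss the spider whenever it crawls from an address not on this list.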

fantomaster

10+ Year Member



 
Msg#: 28 posted 3:44 am on Mar 4, 2000 (gmt 0)

Air_: Possibly detected a typo or two in your list:

209.67.244.78
won't resolve under nslookup
rwhois: Edu.com

209.67.244.79
won't resolve under nslookup
rwhois: Edu.com

209.67.247.15
won't resolve under nslookup
rwhois: Worldstreet

Are you sure you didn't mean to write "247" for the first two?
(Though it beats me where that Worldstreet IP would fit in then.)
Our spiderScouts dept. may be wrong, in which case we would be
glad to stand corrected. If so, could you include some log file
data, please?
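
For anyone who wants to repeat the reverse-lookup check from a script rather than running nslookup by hand, here is a minimal sketch using Python's standard `socket` module. The helper names `reverse_lookup` and `check_ips` and the injectable `resolver` parameter are assumptions for illustration; the rwhois ownership lookup is not covered.

```python
import socket


def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return the hostname an IP resolves to, or None if it won't resolve."""
    try:
        hostname, _aliases, _addresses = resolver(ip)
        return hostname
    except (socket.herror, socket.gaierror):
        # No reverse DNS entry or lookup failure: report as unresolved.
        return None


def check_ips(ips, resolver=socket.gethostbyaddr):
    """Map each IP to its reverse-DNS hostname (None when unresolved)."""
    return {ip: reverse_lookup(ip, resolver) for ip in ips}
```

Running `check_ips(["209.67.244.78", "209.67.244.79", "209.67.247.15"])` performs live DNS queries, so the results depend on the DNS data at the time of the lookup. The `resolver` argument is only there so a stand-in function can be passed when testing without network access.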
