
Deprecated - Altavista, Alltheweb.com Forum

    
ATW Begins Roll Out of PDF Crawl and Access
rubble88




msg:221009
 2:18 am on May 17, 2002 (gmt 0)

The Virtual Acquisition Shelf and News Desk Weblog is reporting that ATW has begun to crawl and provide access to .pdf material.

[resourceshelf.freepint.com...] (5/16/02 Posting)

 

Rumbas




msg:221010
 7:23 pm on May 19, 2002 (gmt 0)

You are absolutely right, rubble88!
We knew it would come one of these days, and what a pleasure that it finally did. I can't find anything on ATW's or FAST's sites, so, from Freepint:

Here's a search for the terms librarian AND database [alltheweb.com] that was constructed using the advanced search page and filtering on the term .pdf in the URL. You can also limit results by entering the syntax url.all:pdf in any search box.

Nice move, FAST. Let's have more of that :)

lazerzubb




msg:221011
 12:26 pm on May 20, 2002 (gmt 0)

They have been doing it for quite a while, if I am not wrong; it's just that they used Scirus for viewing it.
I talked to Stephen Baker from FAST some weeks ago, and he said that they would start to include the spidered .PDF files in the regular search results too.
AllTheWeb will be indexing more formats in Q3 and Q4; for Q2 they are focusing on building a bigger index.

Winooski




msg:221012
 5:13 pm on May 20, 2002 (gmt 0)

I wonder whether FAST is able to spider PDFs, or is it a submissions-only type of thing? I tried locating some "random" PDF files that were in Google's results, and they didn't show in FAST (using the "must have .PDF in the URL" setting).

lazerzubb




msg:221013
 5:16 pm on May 20, 2002 (gmt 0)

FAST can spider .pdf files. They can spider most file types; they just don't include them in their index.
If many people who used Lycos wanted to be able to search Word documents, and Lycos asked FAST to include Word documents in their index, they would probably include them pretty quickly.
It's all about what the customer wants.

Jaze




msg:221014
 4:52 am on May 22, 2002 (gmt 0)

I think I can add another to the list of improvements to FAST's index: it seems that AllTheWeb is now capable of recognising that two URLs point to the same site. For some time we had the same site listed under two different URLs (.co.nz and .com), but this seems to have been resolved. (Maybe it's not new?)

It seems this happened in the last month or two (Mar 2002)?

(edited by: Jaze at 5:10 am (utc) on May 22, 2002)

EliteWeb




msg:221015
 4:54 am on May 22, 2002 (gmt 0)

Nice to see another engine adding PDF support. Does it have the option to view as HTML though? Sometimes I'd rather have that than loading a huge program and crashing my computer (poor thing)

Rumbas




msg:221016
 9:26 am on May 22, 2002 (gmt 0)

Jaze, I've seen similar glitches where ATW showed both a .com and a .com.br when checking with url.host:

It seems to have been fixed to some extent now.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved