Deprecated - Altavista, Alltheweb.com Forum

    
FAST-WebCrawler ... why are you so dumb?
Maybe that's harsh ... but this is pretty weird.
ArtSEPI

10+ Year Member



 
Msg#: 281 posted 5:31 pm on Jul 19, 2001 (gmt 0)

I have a site set up in the following way, which should be pretty standard:

www.domain.com/index.html
www.domain.com/products/index.html
www.domain.com/products/typeofproduct/index.html
www.domain.com/products/typeofproduct/specificproduct/index.html

The site is themed so there is cross-linking across levels and linking down the tree to pages below that are related. I thought this would be great for spiders like FAST but there's a big problem right now. FAST is coming through and spidering my pages that have links like:
<a href="/products/typeofproduct/specificproduct/">Specific Product</a>
But instead of following the link he requests:
/products/specificproduct/
Why, GOD??? Is anyone else seeing this? He's relentless in doing this even though none of the links are relative in terms of using ../blah or ./blah ... all are served up relative to the root directory with their full path (but not [domain.com...]). I guess my beautiful pages won't be listed in FAST's index on the next update ... but what can I do to avoid this happening in the future?
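For comparison, here is a minimal sketch of how a root-relative href like the one above is normally resolved, using Python's standard `urljoin`. The page URL is hypothetical (it just follows the layout described in this post), and the `../` link is only there for contrast: it happens to resolve to the path the crawler is requesting, which does not by itself explain why the crawler asks for it.

```python
from urllib.parse import urljoin

# Hypothetical page URL, following the site layout described in the post.
BASE = "http://www.domain.com/products/typeofproduct/index.html"

def resolve(href, base_url=BASE):
    """Resolve an href against the URL of the page it appears on."""
    return urljoin(base_url, href)

# Root-relative link, as used on the site: the leading slash means
# the path is taken from the site root, ignoring the page's directory.
root_relative = resolve("/products/typeofproduct/specificproduct/")

# Document-relative link (NOT used on the site, shown only for contrast):
# "../" climbs one directory up from /products/typeofproduct/.
doc_relative = resolve("../specificproduct/")

print(root_relative)
print(doc_relative)
```

If a crawler ends up at the second URL while the page only contains the first kind of link, the link resolution on the crawler's side (or something in the markup it is reading) is the place to look.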

 

Brett_Tabke

WebmasterWorld Administrator, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 281 posted 6:26 pm on Jul 19, 2001 (gmt 0)

Hmm. I've not seen that. I have only seen that on a page where I had some other HTML errors.

roscoepico

10+ Year Member



 
Msg#: 281 posted 6:34 pm on Jul 19, 2001 (gmt 0)

[neartexpress.com...]

Is the type of link you're referring to from your site in the profile? If so, I am a little confused as to what the problem is. Can you explain a little further?

ArtSEPI

10+ Year Member



 
Msg#: 281 posted 7:13 pm on Jul 19, 2001 (gmt 0)

Well, FAST seems to be crawling those sorts of links OK (i.e. the artist pages). However, when it comes to the pages below those such as:
[neartexpress.com ]
[which as you can see is linked right off of that page .. no funny stuff]
FAST will not crawl them but instead requests /fine_art/time_well_spent/, which gives a 404 because there is no such page. I have no idea how WebCrawler gets the idea in its head to do that ... but maybe there are HTML errors that I didn't notice, as Brett suggests (thanks for the hint).
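One rough way to follow up on the HTML-errors hint is to run the page through a parser and list the URLs it ends up extracting. The sketch below uses Python's standard `html.parser`; the `LinkCollector` class, the `extract_links` helper, the sample markup and the page URL are all made up for illustration. A crawler's own parser may treat broken markup differently, so this shows only what one parser sees, not what FAST sees.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against the page URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(urljoin(self.base_url, value))

def extract_links(html, base_url):
    """Return the absolute URLs of all links found in the given HTML."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links

# Hypothetical markup and page URL, modelled on the first post in this thread.
sample = '<p><a href="/products/typeofproduct/specificproduct/">Specific Product</a></p>'
links = extract_links(sample, "http://www.domain.com/products/typeofproduct/index.html")
print(links)
```

Feeding the real page source into `extract_links` and comparing the result with the paths showing up as 404s in the server log would at least show whether the odd URL can be produced from the markup at all.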

ArtSEPI

10+ Year Member



 
Msg#: 281 posted 9:09 pm on Jul 19, 2001 (gmt 0)

FAST has been crawling today and hitting a few pages. Now it seems to be hitting some of the pages I discussed before OK! Hopefully all will be well :)

(BTW, no offense meant in the title, FAST. It's probably my fault!)

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved