The site is themed, so pages cross-link across levels and also link down the tree to related pages below. I thought this would be great for spiders like FAST, but there's a big problem right now.

FAST is coming through and spidering my pages that have links like:

<a href="/products/typeofproduct/specificproduct/">Specific Product</a>

But instead of following the link, he requests:

/products/specificproduct/

Why GOD??? Is anyone else seeing this? He's relentless in doing this even though none of the links are relative in the sense of using ../blah or ./blah ... all are served up relative to the root directory with their full path (but not [domain.com...])

I guess my beautiful pages won't be listed in FAST's index on the next update ... but what can I do to avoid this happening in the future?
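For anyone who wants to sanity-check the links themselves, here is a rough sketch of how a root-relative href ought to resolve under standard URL resolution, using Python's urllib.parse.urljoin. The host www.example.com and the list of linking pages are made up for illustration; only the two paths come from the post above. The last line is just a comparison showing that a "../" style link would produce the bad URL. It is not a claim about what FAST actually does internally.

```python
from urllib.parse import urljoin

# Hypothetical host and linking pages; the real domain is not given in the post.
LINKING_PAGES = [
    "http://www.example.com/",
    "http://www.example.com/products/",
    "http://www.example.com/products/typeofproduct/",
]

# The root-relative href exactly as it appears in the page source.
HREF = "/products/typeofproduct/specificproduct/"


def resolve_all(href, pages):
    """Resolve one href against each page it might appear on."""
    return {page: urljoin(page, href) for page in pages}


resolved = resolve_all(HREF, LINKING_PAGES)
for page, target in resolved.items():
    # A root-relative href should give the same target from every page.
    print(page, "->", target)

# Comparison only: a document-relative "../" link from the category page
# resolves to the URL that shows up in the logs.
wrong = urljoin("http://www.example.com/products/typeofproduct/", "../specificproduct/")
print("document-relative form ->", wrong)
```

If every page resolves to the same full path here, the hrefs are fine as written, and the next things to check would be the surrounding HTML (unclosed tags or quotes near the link, or a stray base element).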
Well, FAST seems to be crawling those sorts of links OK (i.e. the artist pages). However, when it comes to the pages below those, such as:

[neartexpress.com ]

(which, as you can see, is linked right off of that page, no funny stuff), FAST will not crawl them. Instead it requests /fine_art/time_well_spent/, which gives a 404 because there is no such page.

I have no idea how WebCrawler gets that idea into its head, but maybe there are HTML errors that I didn't notice, as Brett suggests (thanks for the hint).