I don't think PFI is the major factor here.
It's more related to FAST's algo, which tries to establish a theme for a site, and to site collapsing.
You can set site collapsing on or off from the customize options at ATW.
Also: FAST often indexes first level pages of sites first, with lower level pages following.
<<Also: FAST often indexes first level pages of sites first, with lower level pages following.>>
Not true if you use PFI. They only include the pages you pay for - even if they crawl your entire site. I cannot help but think this flaws the relevancy of their results.
Paid pages get included fast and respidered each 24-48 hours. That's the deal.
Paid pages go into a separate database.
This does not mean that the rest of the pages of a site do not get into the main database.
Not that I haven't thought about that though. It's a danger inherent with introducing PFI. Naturally an SE offering PFI hopes to get as many paying customers as possible.
I think only the folks from FAST can really tell us what's the deal here.
I agree with <<Paid pages get included fast and respidered each 24-48 hours. That's the deal.>>
The sites that I have submitted for PFI only have the index page or pages paid for included. I am not complaining about this.
What I am saying is this: when I search Google for keyword phrases that I do very well on (and that are of course very relevant to the topic), the index pages are seldom brought up in the top results. Deeper pages within the sites are (this includes all sites, not just mine).
The same SERPs are not as relevant in FAST/INK/AV. I am "NOT" saying that the results are bad because my sites show or don't show (this affects many sites in the SERPs).
Doesn't it bother anyone that so many sites are not included in the PFI SEs, and that their SERPs are less relevant for the users of these SEs as a result?
I don't expect FAST/INK/AV SERPs to mirror Google's in any way, shape or form - but the results cannot be as relevant when you take one page from a site instead of 60-400 or more pages depending on the site (it could be tens of thousands for some sites).
Maybe it's just me (since no one seems to be responding to this thread), but it still bothers me :)
>The sites that I have submitted for PFI only have the index page or pages paid for included. I am not complaining about this.
Oh, but you should ;)
At least if it's a good, relevant site. I certainly would be alarmed if paying for one page of a site really meant the rest stood no chance of getting indexed for free.
Anyhow, I do not believe this is the case for now. Some input on how FAST intends to handle this would be much appreciated.
We've had several discussions on this. I agree absolutely that Fast has a strong tendency to return the index page of a site, even if a deeper page should be more relevant to the query.
The first question should be: is it really true that only the index page is indexed?
Often, lots of pages are indexed, but the index page nevertheless is the only top ranked page.
Now that of course is a question of ranking algo.
In my experience, FAST shows deeper pages first in a serp, when the page receives the best inbound links relevant to that query.
On a large site I have lots and lots of top results for third level pages (all free, of course). I believe that's because those pages have very good links pointing directly to them.
If the index page is the only one indexed, I'd try to strengthen linkage to inner pages.
Also: a clear pattern I've seen is that FAST indexes top level pages first. Only then do deep pages eventually get included.
Thanks for your reply. The terms for PFI in INK and FAST/LYCOS clearly state that "only" the pages paid for will be included in the index.
I am going to drop the PFI when they expire and never include/recommend this route again ;)
One point that hasn't been mentioned about the difference between Fast and Google is the descriptions used in the SERPs.
One of the principal failings of Google is the nonsense it can produce in the description by pulling extracts from a page's text. For more experienced searchers, this may be a benefit since it allows you to see the context of your search results within the page.
However for less experienced searchers (aka Yahoo users), FAST's use of the meta description tag will often provide a more readable version. While this isn't perhaps as useful, it is definitely more newbie-friendly.
Maybe this is the more appropriate place to ask this question as I see the topic is comparing FAST vs Google:
posted in another thread:
Can anyone explain the results that were shown on page 2 of the above thread.
rzfree's post on June 14 - 1:45PM regarding the search comparison for
"How to make pizza" on Google and FAST?
Why are people arguing on this board that FAST has more reliable results?
In the post from rzfree I cannot see that it does.
Thanks for any input...
Seems that the FAST algo needs tweaking or their database needs more pages, to increase the relevancy of their results.
Could it be that they're trying to randomize their results against standard optimization techniques to weed out spammers, at the expense of relevance?
In any case, I think a page can be optimized for them, only a bit differently, and it's only a matter of time before more pages are, if they get the Yahoo! contract.
One thing I noticed with FAST is that there are a lot more sites (at least in my niche) that are cloaking and not being penalized for it.
I really wonder about this idea that Fast relies on themes more than other engines. On my most important keyword phrase I rank #22. My site is themed, with high keyword density on the index page, and the same keyword phrase at the top of almost every other page in the site. In contrast, the #13 site only has the keyword phrase once, in the keyword meta tag, and not once in the content - not even on their other main pages that are linked from their home page. I have twenty-some inbound links, they have thirty-some. Makes no sense to me.
jrap, have you been able to de-cloak them?
I would find it very hard to believe that people cloak Fast more than they cloak other engines.
Either you cloak or you don't imo.
Linda, the example given was a phrase, something that gets handled differently in different engines.
As you see when performing the search for
How to make Pizza [alltheweb.com]
ATW rewrites the query to "how to" make pizza. This feature is dubbed search intelligence. It tries to discern certain patterns in complex queries.
While it sometimes brings good results, it also has its pitfalls.
|the addition of quotes around common phrases that are detected from the AllTheWeb phrase dictionary. Words in your query not relevant to the search will also be removed. |
Interestingly in this case turning off the search intelligence does not change the results.
So what's happening?
Without extensive testing, I suppose ATW breaks the phrase down into three parts: "how to", "make pizza" and "pizza". It then returns pages which rank well for any one of those parts.
Putting the phrase in quotation marks works much better in this case:
"how to make pizza" [alltheweb.com]
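To make the rewrite described above a bit more concrete, here is a rough Python sketch of phrase detection against a phrase dictionary. It is only a guess at the general idea: the dictionary contents, the function name rewrite_query and the two-word limit are all assumptions of mine, not anything FAST has published.

```python
# Hypothetical phrase dictionary. The real AllTheWeb dictionary is not public.
PHRASE_DICTIONARY = {"how to", "san francisco"}


def rewrite_query(query, phrases=PHRASE_DICTIONARY):
    """Wrap known two-word phrases in quotes and leave other words alone."""
    words = query.lower().split()
    rewritten = []
    i = 0
    while i < len(words):
        pair = " ".join(words[i:i + 2])
        if i + 1 < len(words) and pair in phrases:
            # Known phrase: group both words inside quotation marks.
            rewritten.append('"' + pair + '"')
            i += 2
        else:
            rewritten.append(words[i])
            i += 1
    return " ".join(rewritten)


print(rewrite_query("How to make Pizza"))
```

With this toy dictionary the query comes out as "how to" make pizza, which matches what ATW displays. How the engine then weights the quoted phrase against the remaining words is the part nobody outside FAST can see.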
Two more things in general:
- if FAST should get the Yahoo deal, a lot more collective intelligence will go into figuring out FAST's algo. We see that already starting in the threads popping up the last three days.
- FAST is not Google. A tendency all over this board, especially among people fresh to search engine promotion, is to equate linkpop with PR, bad ranking with PR0 penalty etc.
Quite obviously Google, FAST, Altavista, even INK share some basic concepts for ranking pages.
But there are differences as well. Which is a good thing for the web.
>>Interestingly in this case turning off the search intelligence does not change the results<<
heini - I've been finding that the search intelligence gives more problems than benefits. On some sites I've been monitoring, for example, "San Francisco" is used as a modifier for some of the target searches. With search intelligence on (and "San Francisco" grouped together), many of the sites don't rank well. With search intelligence switched off, they're up in the top 10.
What's odd about this is that "San Francisco" is always used as a phrase on the sites... so I'm assuming that in addition to grouping words into phrases, search intelligence may be then applying competitive rankings to these phrases by themselves. Google results are also skewed, I've observed, by the presence of a common phrase, and I sometimes try to use this aspect of the algo when I'm well linked and I'm optimizing common terms for "peripheral" phrases.
But when the common term isn't really the target, but is rather a modifier, as in the case of a location name like "San Francisco" or an introduction phrase like "how to," the Fast algo can't really tell that the phrase isn't central... and with the how to make pizza search you see the results.
I think both engines can get fooled by this sort of thing... just in different ways. It's possible that Google has a list of "stop phrases," for example, analogous to stop words, that it deemphasizes. Just guessing on all of this, of course....
PS to the above... Re-checking some of the searches I've been monitoring, it looks like Fast is playing with the degree that search intelligence affects the results. It would be interesting to see how the pizza search changes over time.
Just did this search on Fast:
At least they got the dmoz index page in...hehe
Sigh... didn't it just maybe cross your mind to search for [alltheweb.com...] instead, and then press on "more hits from" dmoz.org ? No ? :)
Thanks for the tip Boaz.
No agenda, I just searched.
I'm glad to see they have 20% of the dmoz categories spidered, that's pretty comprehensive.
DMOZ : 1,088 sites - 49,676 editors - -49,601 categories
Fast Results : Displaying results 1-10 of 10,767 web pages found
I guess I consider ODP results a sort of "benchmark" for relevancy, nothing related to an "agenda".
Hmmm - not sure what you were trying to find. If you want to find all pages ATW has indexed from dmoz.org, you should use this search:
Displaying results 1-10 of 505,781 web pages found
I wasn't aware of the syntax for that search, I looked over the advanced search page and didn't see it.
|I guess I consider ODP results a sort of "benchmark" for relevancy, nothing related to an "agenda". |
ODP is used for categorizing results at Fast. It's what you see in "FAST TOPICS". They should have most of ODP crawled, as heini points out.