The New York Times is running an article about Clusty, a start-up search engine that "clusters" similar websites together and then uses those groupings to display relevant information to the searcher.
We all know Google uses keywords, but with the size of today's internet, keywords are not the best method.
"As databases get larger, trying to pull the proverbial needle out of the haystack gets tougher and tougher...
I would imagine a good number of people search for very general terms. For our example: "Skydiving Overseas".
A Google search would look for that keyword phrase. However, this is detrimental for the searcher because it ignores sites in the same category that use a different phrase:
"Skydiving in Mexico"
Experienced webmasters have developed an improvised method of covering these keywords via SEO:
<Title> International Skydiving Abroad and Overseas in Mexico </Title>
But VERY few websites do this, certainly fewer than 5 percent.
A clustering website would (in theory) be able to display all these sites perfectly, and I bet Google knows it.
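To make the skydiving example concrete, here is a minimal Python sketch. It is not how Google or Clusty actually work; the page titles, the cluster labels, and both function names are made up for illustration, and it assumes some earlier step has already grouped the pages into topics.

```python
# Hypothetical mini-index: page title -> topic cluster assigned by an earlier step.
PAGES = {
    "Skydiving Overseas: Drop Zones Abroad": "skydiving-travel",
    "Skydiving in Mexico": "skydiving-travel",
    "Cheap Flights Overseas": "air-travel",
}


def phrase_search(query, pages):
    """Return titles that contain the exact query phrase (case-insensitive)."""
    q = query.lower()
    return [title for title in pages if q in title.lower()]


def cluster_search(query, pages):
    """Return every title that shares a cluster with an exact-phrase hit."""
    hit_clusters = {pages[title] for title in phrase_search(query, pages)}
    return [title for title, cluster in pages.items() if cluster in hit_clusters]


print(phrase_search("Skydiving Overseas", PAGES))
print(cluster_search("Skydiving Overseas", PAGES))
```

The phrase search only finds the page that literally says "Skydiving Overseas", while the cluster search also returns "Skydiving in Mexico" because it sits in the same (assumed) topic group.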
We know that Google's frontend is at an all-time low in activity (not saying it isn't active, just less active), and we know that Google's backend is working on SOMETHING like crazy (some suggest a complete rebuild of the Google database).
Might I now propose a theory:
1. Many internet search experts say "clustering" is a better way to organize today's massive internet.
2. Many new clustering search engines are coming out, but they currently do not have nearly the power of Google in terms of actually crawling the net for sites (this was admitted by Clusty in the article).
3. Google is about to face its biggest challenge yet from Microsoft.
4. Search engine users are fickle; whoever provides the better results wins (this is how Google won the first war).
5. Google's backend is very busy while Google's frontend is very static.
Google is moving from a keyword-based system to a cluster-based system, hoping to beat the smaller companies by using its advantage in the "crawling gap" to provide a cluster-based system FAR larger than these new companies could ever hope to produce.
After years of trying to rework the algo to banish "crap-pages" and promote "great pages", each time being thwarted by SEOers (black hats mainly), Google has started on something brand new: new algo, new theory, and perhaps even a new delivery method (GBrowser?).
This new method would be able to provide better results (an example of this was provided above) and stop (at least temporarily) "crap-pages" by being less reliant on keywords.
Obviously I do not know specifically how they plan on doing this, but they have big brains... I'm sure they have a way.
This is pure speculation, but it is certainly a possibility.
[edited by: Livenomadic at 3:03 pm (utc) on Oct. 3, 2004]
Is this not what Semantics was going to achieve?
I wouldn't have said clustering and LSI were the same thing. Clustering is grouping similar websites together to create almost a "category" from which Google will then select the most appropriate website for that particular search term.
LSI is directly related to the page/site on its own and the content of it, not who it links to or who links to it.
That is my very basic interpretation of clustering and LSI, but is it wrong?
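The "group similar sites into a category, then pick the most appropriate one" reading of clustering described above can be sketched roughly like this in Python. This is only an illustration under loud assumptions: word overlap (Jaccard similarity) stands in for whatever similarity measure a real engine would use, it is not LSI, it ignores links entirely, and the sample pages, the threshold, and all function names are hypothetical.

```python
def tokens(text):
    """Lowercase a text and split it into a set of words."""
    return set(text.lower().split())


def jaccard(a, b):
    """Overlap between two word sets: shared words / all words."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_pages(pages, threshold=0.2):
    """Greedy single-pass grouping: a page joins the first cluster whose
    first member is similar enough, otherwise it starts a new cluster."""
    clusters = []
    for page in pages:
        for cluster in clusters:
            if jaccard(tokens(page), tokens(cluster[0])) >= threshold:
                cluster.append(page)
                break
        else:
            clusters.append([page])
    return clusters


def best_in_cluster(query, clusters):
    """Pick the cluster closest to the query, then the closest page in it."""
    q = tokens(query)

    def score(page):
        return jaccard(q, tokens(page))

    cluster = max(clusters, key=lambda c: max(score(p) for p in c))
    return max(cluster, key=score)


# Made-up page texts, purely for illustration.
SAMPLE_PAGES = [
    "skydiving overseas drop zones abroad",
    "skydiving in mexico drop zones",
    "cheap flights overseas deals",
]

clusters = cluster_pages(SAMPLE_PAGES)
print(clusters)
print(best_in_cluster("skydiving mexico", clusters))
```

With these sample pages the two skydiving pages end up in one group and the flights page in another, and the query is answered from within the skydiving group rather than by matching a single page in isolation.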
So it's not about a single page/site; it's about all pages for that topic.