|My search engine idea vs Google's|
In search, relevance is everything. Duh! Given that Google’s market share has steadily increased over the past couple of years (in July, 60% of all searches were done through Google), it’s easy to assume that, for now, they offer the most relevant search solution.
Currently, Google’s ultimate goal is to find the most relevant answer for every possible search query. You can imagine the vast amounts of information they sift through to accomplish this task. We’re talking about trillions of possible queries, each of which must produce relevant results within seconds.
In fact, there are probably hundreds of factors that go into how Google determines relevancy for any given keyword. However, even though their search algorithm is very complex, it still only provides a one-size-fits-all level of relevancy.
Think about it: whether you are an 85-year-old conservative grandmother or an 11-year-old punk rock skater kid, when you type a keyword into Google the same set of results will appear (barring differences between data centers, but let’s assume that the exact same data center is used). The apparent flaw in this method of structuring results is that what’s relevant to an 85-year-old grandmother might not be relevant to an 11-year-old kid.
To correct this shortcoming we must step back from the current model of generating search results with computer algorithms and focus more on human elements. Humans can be categorized at varying levels of complexity, from broad, easily identifiable variables like sex, religion and income bracket to less identifiable personality variables like whether someone is extroverted or introverted, whether they like cats, or whether they are afraid of scary movies.
Although there are millions of potential variables, identifying core variables and grouping them together allows us to model probable search results for any given persona, hence the term Persona Rank (I know… yet another potential “PR” search-related acronym. Sorry, I couldn’t help it).
One foreseeable downside is that users simply want to go to a site, do a search and get results. They don’t necessarily want to waste time filling out a personality profile before a search utility works for them. The solution we found is to monitor personal search patterns in an unobtrusive manner, so that we can gather the data without having to ask each user to divulge in-depth personal information.
While this sounds simple in theory, it actually involves a rather complex set of data. Each persona contains a number of predefined characteristics, and each characteristic contains a varying level of intensity depending on the individual user’s preferences. In addition, any user may have multiple personas.
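A minimal sketch of how that data might be laid out, assuming intensities are stored as numbers between 0.0 and 1.0. The class and field names (`Persona`, `UserProfile`, `traits`) are hypothetical and only illustrate the structure described above: characteristics with an intensity, and several personas per user.

```python
from dataclasses import dataclass, field


@dataclass
class Persona:
    """One persona: a named set of characteristics, each with an intensity."""
    name: str
    # characteristic -> intensity, assumed here to lie in [0.0, 1.0]
    traits: dict[str, float] = field(default_factory=dict)


@dataclass
class UserProfile:
    """A user may carry several personas at once."""
    user_id: str
    personas: list[Persona] = field(default_factory=list)

    def intensity(self, characteristic: str) -> float:
        """Strongest intensity of a characteristic across all of the user's personas."""
        return max(
            (p.traits.get(characteristic, 0.0) for p in self.personas),
            default=0.0,
        )


skater = Persona("skater kid", {"punk rock": 0.9, "likes cats": 0.2})
gamer = Persona("gamer", {"likes cats": 0.6})
user = UserProfile("user-1", [skater, gamer])
print(user.intensity("likes cats"))
```

Taking the maximum across personas is just one possible way to combine them; an average or a context-dependent choice would fit the description equally well.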
This is only the beginning of the complexity, however. Once we attach personas to each user, we must then address the issue of ranking search results for each persona. For this we use what I call Reputation Scoring.
Instead of relying heavily on PageRank, as Google does, we rely on the Reputation Score of our users to determine which pages should rank well for each persona. For example, if an 18-year-old football jock persona types in the keyword “movie review”, he would see a completely different set of search results than a 45-year-old mother of 6 would.
By moving search relevance away from the current page-based scoring system, we can present constantly evolving personalized search results for each individual user.
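The post does not say how a Reputation Score is computed or applied, so the following is only a sketch under an assumed model: users endorse pages (for example by clicking through), each endorsement is tagged with the persona the user was searching under, and a page's score for a persona is the sum of the reputations of the users who endorsed it. The function name `rank_for_persona` and the data shapes are hypothetical.

```python
from collections import defaultdict


def rank_for_persona(endorsements, reputation, persona):
    """Rank URLs for one persona by summed reputation of endorsing users.

    endorsements: iterable of (user_id, persona, url) tuples (assumed shape)
    reputation:   dict mapping user_id -> Reputation Score (assumed numeric)
    """
    scores = defaultdict(float)
    for user_id, endorsed_persona, url in endorsements:
        if endorsed_persona == persona:
            # Unknown users contribute nothing.
            scores[url] += reputation.get(user_id, 0.0)
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda url: (-scores[url], url))


reputation = {"u1": 0.9, "u2": 0.4, "u3": 0.8}
endorsements = [
    ("u1", "football jock", "a.example/action-reviews"),
    ("u2", "football jock", "b.example/family-reviews"),
    ("u3", "mother of 6", "b.example/family-reviews"),
]
print(rank_for_persona(endorsements, reputation, "football jock"))
print(rank_for_persona(endorsements, reputation, "mother of 6"))
```

With this toy data the same keyword yields a different ordering per persona, which is the behaviour the example above describes.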
So, what do you think? Do I have a chance in the world of beating Google?
In any case, wish me luck. I’m going to need it.
I sure would hate to be stereotyped and have search results catered to me just because I am of a certain age or ethnicity or have a specific interest. I'm a 33-year-old white male who did a search on "stock". Are you going to return results about investing? No, I'm too poor. Lock, Stock and Two Smoking Barrels? No, I don't live in England. Livestock? I don't live in Montana. Woodstock? No, I'm too young...
But what if I wanted one of those? Say I want to learn about branding cows. Well, I won't, because your search engine thinks I'm looking for hot stock tips...
And what's going to happen is that, in order to get the results I'm truly looking for, I'm going to have to use a long-tail search term, which is a departure from your profiling model, since I will be looking for specific results instead of inaccurate ones.
>The Apparent flaw in this method of structuring results is that what’s relevant to an 85 year old grandmother might not be relevant to an 11 year old kid.
The danger is that you replace this flaw with a different one. Namely, your assumptions of what an 85-year-old grandmother might want to see. The grandmother may be searching for a Christmas present for an 11-year-old... and the poor 11-year-old ends up getting a CD of 'Bing Crosby's Greatest Hits'.
Making assumptions is very dangerous. I like Eminem but also Zen art, and I'm 45. Rather than making assumptions, the trick may be NOT to have to make assumptions. This could be achieved by forcing the searcher to be more precise. Long-tail searches narrow down the options. An interface that quickly forces them to give more information could be more effective. Google's simplicity has been its strength but also its weakness. Searchers tend to be too vague in their use of keywords, thus they get vague results. A more complex search interface may turn people away, but those that stick with it will probably get better results. As Joe Public becomes better at searching, he appears to be more willing to spend a bit of effort selecting keywords and setting up search criteria. Rather than making assumptions based on age, sex, etc., giving a few optional drop-downs and thus giving the searcher more control could be better than taking those options away and making decisions for them.
The trick of giving them alternative searches after the first search is a good one. For instance, people are generally dumb and use vague keywords for their first search. They will search for 'widgets' when in fact they know they want 'blue widgets', but for some bizarre reason they do not specify this, thinking a search engine will realise. At the top of the search results, giving alternative searches based on broad match, the user's search history or perhaps the user's IP location could be effective. In other words, offering search options is probably better than making initial assumptions which could be way off beam.
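As a rough illustration of the search-history variant, here is a small sketch that offers past queries as refinements when they contain every word of the vague query plus at least one more. The function name `suggest_refinements` and the matching rule are assumptions made for the example, not a description of how any real engine does it.

```python
def suggest_refinements(query, history, limit=3):
    """Suggest more specific past queries for a vague query.

    A past query qualifies if it contains every word of `query`
    and at least one additional word (a strict superset of words).
    """
    words = set(query.lower().split())
    suggestions = []
    for past in history:
        normalized = past.lower()
        # `<` on sets is a strict-subset test, so the bare query itself is skipped.
        if words < set(normalized.split()) and normalized not in suggestions:
            suggestions.append(normalized)
    return suggestions[:limit]


history = ["blue widgets", "cheap flights", "Blue Widgets", "widgets", "red widgets uk"]
print(suggest_refinements("widgets", history))
```

The searcher still picks the refinement, so the engine offers options rather than deciding on their behalf.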
Google succeeded through word of mouth, because it gave people a better result than, er, AltaVista (?).
People will not change their Google habit because yours is different - it has to be better.
You'll get some curiosity visitors, but who wants to fill in a questionnaire about their age, favorite color and such before searching? No one over the age of 18 for absolutely sure, and few over 14.
Gimmicks are fun - but Google is here for the long haul. Are you?
If you want Google's crown, then "Build A Better Mousetrap - And The World Will Beat A Path To Your Door" ~ Ralph Waldo Emerson
Please don't give Google any more wacky ideas - their cup runneth over.
Unfortunately, I believe that even if your algorithm is better than Google's, new entrants are now locked out of the search market.
This is because many webmasters will quickly add any new, unrecognised crawler that starts to crawl their site to the "go away" list in their robots.txt file. And if you ignore robots.txt, they will block your IP addresses.
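For anyone unfamiliar with the mechanism, here is a small sketch using Python's standard `urllib.robotparser` to show what such a "go away" entry does. The crawler name `NewBot` and the example.com URL are made up for the illustration.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that shuts out one specific crawler and allows everyone else.
# "NewBot" is a hypothetical user agent used only for this example.
robots_txt = """\
User-agent: NewBot
Disallow: /

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

url = "https://example.com/some-page"
newbot_allowed = parser.can_fetch("NewBot", url)
googlebot_allowed = parser.can_fetch("Googlebot", url)
print(newbot_allowed, googlebot_allowed)
```

A well-behaved crawler checks this before fetching; one that ignores it is what gets blocked by IP address.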
Therefore, although this makes me very sad, I believe that the only potential candidates for impacting Google's current near-monopoly of search must come from the pool of current search players.
>This is because many webmasters will quickly add a new unrecognised crawler...
Few even have a robots.txt.
Or, maybe, they could base results on your previous searches. This way the search results would tend to, over time, match your specific interests. :-)
The KEY to good search is HUMAN EVALUATION of websites & pages.
And more than that, TRUSTED human evaluation.
The problem today is that Google robots can be fooled by unscrupulous, greedy a*oles.
So, you either get (a) spidering robots as smart as humans (20 years?), or (b) you use trusted humans to evaluate sites and pages.
Since Google is ULTRA-RICH, I suggest they hire 20,000 trusted reviewers, and have each website reviewed by 2 people (to avoid kickbacks & fraud). I would expect that a trained human could give a "quickie" evaluation of a site (only to make sure it's not a scam) in about a minute.
So assume that a human could evaluate 50 sites per hour, times 7 hours per day = 350 per day, times 20,000 reviewers = 7,000,000 per day. Divide by 2 due to two-person checking = 3.5 million sites checked per day.
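The same back-of-the-envelope estimate as a few lines of Python, so the inputs can be changed easily. The function and parameter names are hypothetical; the default figures are the ones given above.

```python
def sites_checked_per_day(sites_per_hour=50, hours_per_day=7,
                          reviewers=20_000, reviewers_per_site=2):
    """Estimate how many distinct sites get reviewed per day."""
    # Total individual reviews performed by the whole team in one day.
    reviews_per_day = sites_per_hour * hours_per_day * reviewers
    # Each site needs `reviewers_per_site` independent reviews.
    return reviews_per_day // reviewers_per_site


print(sites_checked_per_day())  # 3500000
```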
That's not too bad an effort to remove the SCUM creeps.
I also doubt enough people will be willing to submit the required data about themselves. Just think of that related discussion about AOL publishing that data-set of searches.
And MHes hits the nail on the head: Amazon's algos have completely given up on identifying my profile; they now tend to present me with products related to my last order. As a matter of fact, by now I would even regard it as an insult if a person (or even an algo) claimed to "understand" or categorize my personality. (I've been trying to do so for ages now ;)
|Therefore, although this makes me very sad, I believe that the only potential candidates for impacting Google's current near-monopoly of search must come from the pool of current search players. |
Disagree. Hosting could work like the SETI framework, algo improvement like Linux or any other open-source project, and additional human reviews or filters by analogy to DMOZ or a wiki.
Even if your techniques yield "better" results, hardly anyone will hear about them. To google is to search ... Yahoo is a life engine ... you get the idea.
I'm a little surprised that nobody has mentioned that this sounds like behavioral targeting, which isn't new, is already "out in the wild", and yes, "may" be the coming thing.
Amazon is already using this technology for ad display.
Other companies are going in that direction.