|The Future of Search Technology|
John M. Lervik, CEO & Co-Founder, Fast Search & Transfer, Oslo, Norway
I waited a while for this. An interesting presentation that covered a lot of ground, from lemmatising to different ways of handling searches, like splitting a search into its elements:
a search for Brett Tabke Biography
(container and head respectively):
Lots of biographies, but not that many Brett Tabkes about
This is a 2.3 meg file (dial up download warning :))
Wouldn't mind knowing his AIM number :)
Thanks, should be interesting.
Do you think search engine technology will evolve to prevent spamming... bit of a never-ending job?
Or will technology get cuter and filter the rubbish - not just spam?
Who is already leading?
A very interesting presentation indeed.
Lervik made this presentation only one month before he sold off the websearch unit to OV. Judging by the way he looks into the future, building on ideas already pursued with ATW, I can't imagine the Fast people were all happy to abandon the websearch arena.
The explanation for that decision may be in one of the slides:
Lervik estimates the revenue/cost situation between portals/destinations, paid listings providers, and search providers:
For 2002 he says:
$1.4 billion in revenue from advertisers to paid listings providers, mostly PPC (OV, Google),
split 35/65 between providers (OV, Google, INK, Fast) and destinations (Y!, MSN, Google, Lycos etc.).
Only 5% goes to search providers via the portals/destinations.
Costs are split 45% portals/destinations, 30% paid listings providers, 20% search providers.
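To put rough dollar amounts on those percentages, here is a quick back-of-the-envelope sketch. It assumes every revenue percentage is taken from the full $1.4 billion, which is my reading of the slide and not something Lervik spells out. The names are made up for illustration. Note that the cost shares as listed add up to 95%, so they are left as quoted.

```python
# Back-of-the-envelope figures for 2002, in millions of US dollars.
# Assumption: each revenue percentage applies to the full $1.4 billion.
TOTAL_REVENUE_MUSD = 1400

REVENUE_SHARE_PCT = {
    "providers": 35,                          # OV, Google, INK, Fast
    "destinations": 65,                       # Y!, MSN, Google, Lycos etc.
    "search providers via destinations": 5,   # assumed to be 5% of the total
}

# Cost shares exactly as quoted in the post (they sum to 95, not 100).
COST_SHARE_PCT = {
    "portals/destinations": 45,
    "paid listings providers": 30,
    "search providers": 20,
}


def revenue_in_musd(total=TOTAL_REVENUE_MUSD, shares=REVENUE_SHARE_PCT):
    """Convert percentage shares into millions of dollars (integer maths)."""
    return {name: total * pct // 100 for name, pct in shares.items()}


if __name__ == "__main__":
    for name, amount in revenue_in_musd().items():
        print(f"{name}: ${amount}M")
```

On that reading, search providers see about $70 million of revenue while carrying 20% of the costs, which fits the "small and low margin business" remark.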
Basically those figures lead to the conclusion given by Lervik in his CEO statement looking back at 2002 and the selling off of websearch:
|In the Internet or Web Search sector, we were one of the top players in a small and low margin business. |
As to the outlook on the future of websearch, Lervik has some interesting ideas, which basically lead in three directions:
- better understanding of queries, mostly through linguistic analysis
- better understanding of content of documents, mostly by topical analysis
- better presentation of results, enabled by more preprocessing, resulting in more information related to the results
The last point is pretty interesting, as he talks about extracting meaningful terms from documents for refinement, which leads to a serp that presents the user with additional info about the context of the results. Examples of meaningful terms include product names, personal names, geographical data.
It's also interesting to note how prominently anchor text figures in the presentation, where it says:
|Anchor text is the best way to answer general queries in an adequate way |
Looks like I was right in seeing a lot of anchor text reliance working in ATW's algo lately :)
All in all a fascinating insight into a search engine's view on websearch.
If any of this has any bearing on the future of ATW/AV - only time will tell.
Anyway - thanks, Mr Lervik and thanks to the Fast people for building a great search engine. You all have done a fantastic job from 1999 to 2003. Unlike many of the other search engines, you have never ceased to develop your search technology further, staying at the forefront of websearch all the time.
I had an idea for how a search engine may be able to produce better results. From my understanding, ATW sends all clicks through a script. What if they looked at the IP address for each click, and looked out for the next time the user with that same IP address either searched for something using ATW or clicked on a link from an ATW SERP? If the user uses the engine five seconds after seeing that first page, chances are the page was of little or no use at all. Of course, this doesn't take everything into account. The user may have seen the page on another occasion, there may be more than one user at the IP address, the user may be using other search engines, etc. It just seems that measuring the interval between clicks might be a useful part of the mix.
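To make the idea concrete, here is a minimal sketch of the interval check. I have no idea how ATW's click script actually logs things, so the `Event` record, the field names and the five-second threshold are all assumptions for illustration. It simply flags a clicked result when the same IP searches or clicks again within the threshold.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Event:
    """One hypothetical log entry: a search, or a click on a SERP link."""
    ip: str
    timestamp: float          # seconds since some arbitrary start
    kind: str                 # "search" or "click"
    url: Optional[str] = None  # only set for clicks


def quick_return_clicks(
    events: List[Event], threshold_seconds: float = 5.0
) -> List[Tuple[Optional[str], float]]:
    """Return (url, interval) for clicks followed by another event from
    the same IP within threshold_seconds."""
    last_click: Dict[str, Event] = {}
    flagged = []
    for ev in sorted(events, key=lambda e: e.timestamp):
        # Was there a pending click from this IP? If so, measure the gap.
        prev = last_click.pop(ev.ip, None)
        if prev is not None:
            interval = ev.timestamp - prev.timestamp
            if interval <= threshold_seconds:
                flagged.append((prev.url, interval))
        # Only clicks start a new interval; searches just end one.
        if ev.kind == "click":
            last_click[ev.ip] = ev
    return flagged
```

This keys on the IP alone, so it has exactly the weaknesses mentioned above: shared IPs, pages the user already knows, and people hopping to other engines would all skew the numbers.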
See hotbot/directhit. A concept which didn't really cut it.
ATW does CTR measurement to a certain extent. As does AV. As does Google. The question is what to do with the data?
I don't think it's useful to make CTR a direct part of the ranking.
But it could be a tool to help the engines understand user queries better. Comparing the query with the clicked listing might provide insights into the intention of a query.
Another useful set of data retrieved by CTR measurement would be how deep users go on serps.
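As a rough illustration of the depth measurement, the sketch below turns the rank positions of clicked results into the share of clicks per SERP page. The function name and the ten-results-per-page default are my assumptions, not anything the engines have published.

```python
from collections import Counter
from typing import Dict, Iterable


def serp_depth_share(
    click_ranks: Iterable[int], results_per_page: int = 10
) -> Dict[int, float]:
    """Map SERP page number -> fraction of clicks landing on that page.

    click_ranks are 1-based positions of the clicked results.
    """
    pages = Counter((rank - 1) // results_per_page + 1 for rank in click_ranks)
    total = sum(pages.values())
    return {page: count / total for page, count in sorted(pages.items())}


if __name__ == "__main__":
    # Five hypothetical clicks: three on page 1, one each on pages 2 and 3.
    print(serp_depth_share([1, 3, 10, 11, 25]))
```

A steep drop after page 1 for a given query would suggest the top results did the job; a flat spread might hint that users had to dig.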