Forum Moderators: bakedjake
The Times, December 23, 2006
Founder of Wikipedia plans search engine to rival Google
James Doran, Tampa, Florida
- Amazon.com is linked with project
- Launch scheduled for early next year
Jimmy Wales, the founder of Wikipedia, the online encyclopaedia, is set to launch an internet search engine with Amazon.com that he hopes will become a rival to Google and Yahoo!...
"Essentially, if you consider one of the basic tasks of a search engine, it is to make a decision: 'this page is good, this page sucks'," Mr Wales said. "Computers are notoriously bad at making such judgments, so algorithmic search has to go about it in a roundabout way.
"But we have a really great method for doing that ourselves," he added. "We just look at the page. It usually only takes a second to figure out if the page is good, so the key here is building a community of trust that can do that."
...Catching up with Google, Yahoo!, Microsoft's MSN or even smaller operators such as Ask.com will be a difficult challenge, Mr Wales conceded.
[business.timesonline.co.uk...]
[edited by: tedster at 12:08 pm (utc) on Dec. 23, 2006]
[edit reason] fair use of copyrighted material [/edit]
It would be very Web 2.0 of them. They could even pay; they have the cash.
Regardless, Jimmy Wales needs to worry about the real problem, and that is: is the web page relevant to your search query, and does it load extremely fast?
You need to build supercomputers next to hydroelectric dams with software written by hundreds of PhDs to compete in that area, I'm afraid. Web 2.0 communities aren't going to do the trick.
That way, Mr. Wales would need a community of 1,666,667 volunteers working for his for-profit search engine. That doesn't sound very realistic to me.
Let's see: put up a page full of content, submit it, get approved and into the index, then change the content on the page. That's what the spammers will do.
What, are they going to check every page every day? Yeah, OK, I have some land in Florida to sell you and a cheap bridge in New York...
Any takers?
Mr Wales said. "Computers are notoriously bad at making such judgments, so algorithmic search has to go about it in a roundabout way. "But we have a really great method for doing that ourselves," he added. "We just look at the page. It usually only takes a second to figure out if the page is good, so the key here is building a community of trust that can do that."
One wonders where he is going to get this community that he can trust.
ODP never managed it, and Wikipedia has not managed it.
The need for volunteers will always create communities whose members are either self-interested, promoting their own sites at the expense of competitors, or power-hungry, advancing themselves in the community.
In addition, sites could be reviewed as a pyramid: once a spam page is found on a given level, indexing stops for a period, until the problems are fixed.
Let's not forget that it's commercial search that pays the bills. That is, its income and profits come from advertisers, so ultimately it's commercial or business search that makes any 'search platform' viable.
What happens come the day that the CGM 'producers' - the free labor market - awaken to the idea that their noble efforts have actually been rendered in service of someone else's profits and elaborate lifestyle? Once nobility as motive is eradicated due to avarice of those seeking to exploit someone else's good nature and good will what's left? Ego gratification? Once it is realized that ego doesn't pay the bills for long then what?
Next up: Communal Media, where the producers of value=content own the company=content, inasmuch as 'the company' and 'the value of the company' IS the content.
Writers' commune? Publishers' commune? Communal media will be the outcome, where collectivization of people with common publishing interests and community ownership of the media make perfect sense, particularly in the age of disintermediation. (In this regard, the move of AdSense towards supporting multiple accounts within a community platform is a step in that direction. A few more tweaks in the model and the "fuel" for the movement will be in the pipeline.)
Publishing platforms are now a dime a dozen so who needs a venture capitalist to suck the profits from one's creativity?
Give it time. Efforts to extract all the golden eggs from the goose invariably kill the goose. CGM is the golden goose. The goose will evolve to survive, and instead of the goose dying it will be the attempts to suck value from the goose that will begin to die.
Geese of the world, hear me! Unite and let the profit suckers suck eggs from their own hind quarters!
[edited by: Webwork at 5:01 pm (utc) on Dec. 23, 2006]
The hard part is dealing with deliberate manipulation. Wikipedia has seen its share of that, but most of the conflict has been in a small percentage of the topics. When you are talking about a general search engine, there are millions of search terms that have some value, and many millions of pages that might target those terms. The size and dedication of the community needed to do that would be hard to reach.
Isn't it true that anyone is free to employ or deploy all the content of Wikipedia so long as there is attribution according to 'the rules'?
So, what's to stop anyone else from doing the same thing as the founder plans to do? If that's the case then isn't there a bit of a 'success due to competitive advantage' problem?
You mean the competitive advantage is that all those editors will just tag along once their efforts are commercialized and everything will just run as smoothly as ever? Seems like a sweeping assumption. What if a whole host of editors - seeing the commercial handwriting on the wall - veer off to form their own collective and mutual benefit society?
What am I missing?
[edited by: Webwork at 5:08 pm (utc) on Dec. 23, 2006]
I think we can all agree that SEs will always be worked over by certain individuals to their advantage. The key for SEs is to limit that corruption of listings. Google can't do it 100% right, so why should we expect this new "Amazon" engine to do it? Any and all competition to Google would be good. Good competition is good for users and advertisers alike (possibly not good for the SEs though).
Cheers,
Dave.
If, let's say, over 30% of the page's content has been changed, a human would have the opportunity to re-check that page to make sure it hasn't been changed for the worse.
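A rough sketch of how such a "changed by more than 30%" check might look. The 30% threshold comes from the post above; measuring change as a word-level diff with Python's difflib is purely an assumption for illustration, and the function names are hypothetical.

```python
from difflib import SequenceMatcher


def changed_fraction(old_text: str, new_text: str) -> float:
    """Return a 0..1 estimate of how much of the page text changed.

    Assumption: "changed" is measured on whitespace-separated words,
    as 1 minus difflib's similarity ratio. Other measures are possible.
    """
    old_words = old_text.split()
    new_words = new_text.split()
    matcher = SequenceMatcher(None, old_words, new_words, autojunk=False)
    return 1.0 - matcher.ratio()


def needs_rereview(old_text: str, new_text: str, threshold: float = 0.30) -> bool:
    """Flag the page for a human re-check when the change exceeds the threshold."""
    return changed_fraction(old_text, new_text) > threshold
```

A page with one word swapped out of ten would stay under the threshold, while a page whose text was replaced wholesale would be flagged.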
So newspapers, magazines, online newsletters, blogs, jobs sites, classified ads sites, personals sites, etc. would all need ongoing reviews.
FarmBoy
[webmasterworld.com...]
But ranking a page's quality is only one part of a search engine's job, and not the most important part. I'll take relevance to my specific search phrase over quality any day.
[edited by: jomaxx at 6:32 pm (utc) on Dec. 23, 2006]
I was with DMOZ when it still had a mere 200 editors (and was at the time still known as newhoo.com) - I have seen its erratic evolution throughout the years - and can assure you that anything based on such a premise is doomed to fail, by definition - because the immense size of the web is beyond any kind of human handling.
In view of past experience (remember also the Infoseek zealots?), it is beyond belief that anybody in his right mind could envisage human intervention in search on a large scale.
heisje
You need to build supercomputers next to hydroelectric dams with software written by hundreds of PhDs to compete in that area
Or so Google would like us to think. IMO, Google hasn't used those PhDs very effectively. Nor have they used much of their talent pool very effectively - most people at Google are over-qualified for the work they are doing.
They need to have them working on semantic analysis, which is the only way to move search forward from here. Instead, they are trying to wring every last drop out of the tired concept of link analysis, and have the masses so cowed into thinking that keywords are the best we are ever going to have that it's starting to affect language, as we start to lose conjunctions - and, unfortunately, meaning.
Let's see: put up a page full of content, submit it, get approved and into the index, then change the content on the page.
You simply re-visit, and queue for re-review if there have been significant changes in the page. Do this enough times, with a clear pattern of abuse, and the site gets banned.
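To make that concrete, here is a minimal sketch of a re-review queue with a ban after repeated abuse. Everything in it is an assumption for illustration: the class and method names are hypothetical, and "enough times" is arbitrarily set to three strikes.

```python
from collections import deque


class ReviewQueue:
    """Toy re-review queue: changed pages wait for a human verdict,
    and a site is banned once it collects too many abuse verdicts."""

    def __init__(self, max_strikes: int = 3):
        self.max_strikes = max_strikes  # assumed cutoff, not from the post
        self.pending = deque()          # (site, url) pairs awaiting review
        self.strikes = {}               # site -> number of abuse verdicts
        self.banned = set()

    def report_change(self, site: str, url: str) -> None:
        """Called when a revisit finds a significant change on a page."""
        if site not in self.banned:
            self.pending.append((site, url))

    def record_verdict(self, site: str, is_abuse: bool) -> None:
        """Called after a human has looked at a changed page."""
        if not is_abuse:
            return
        self.strikes[site] = self.strikes.get(site, 0) + 1
        if self.strikes[site] >= self.max_strikes:
            self.banned.add(site)
            # No point reviewing the remaining pages of a banned site.
            self.pending = deque(p for p in self.pending if p[0] != site)
```

Legitimate changes cost a site nothing here; only repeated abuse verdicts add up to a ban.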
But I'm wondering why one would do this in the first place. If one HAD good content in the first place, why on earth would you replace it with bad content?
The key for SEs is to limit that corruption of listings. Google can't do it 100% right,
Or doesn't want to.
-----
There are three huge problems with search as it exists today:
(1) Relevancy of results is really very poor.
(2) There is little or no understanding by the search engine of the semantics of either the search or of web pages. It's amazing how well search does work, considering the search engine doesn't know either what the user is searching for or what the web pages are about. Keywords are a nice parlor trick, but it's time to move on to the real deal.
(3) Search engines need to evaluate trust, legitimacy, viewpoint, motive, etc. etc. and match those with the needs of the searcher. We've progressed very little along this line, with the sole move forward being link analysis.
The first two of these I think can eventually be handled completely by computer, most likely with human "training" involved. The third almost certainly requires much more human involvement and probably the involvement of the public at large. Fan or not, you have to admit that Wikipedia is the biggest and most successful project to date along these lines.
I wish somebody would seriously take on the first two challenges, but I applaud Mr. Wales for taking on the third. It's certainly the one he's most qualified to tackle, and I wish him the success he has enjoyed with Wikipedia.
I like the idea of human-rated pages. But let's do some math:
* say there are 10,000,000,000 web pages
* for a page to be rated reliably, at least 5 ratings are required
* a page needs to be rated at least once a year
* one person can rate 100 pages per day, 300 days per year.
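Working those numbers through as a quick sketch (the figures are the ones listed above; the function name and parameter names are just made up for the example):

```python
def raters_needed(pages: int, ratings_per_page: int, reviews_per_year: int,
                  pages_per_day: int, working_days: int) -> int:
    """Number of full-time raters needed to cover the index for one year."""
    ratings_needed = pages * ratings_per_page * reviews_per_year
    ratings_per_rater = pages_per_day * working_days
    # Integer ceiling division: a fraction of a rater still means one more person.
    return -(-ratings_needed // ratings_per_rater)


# 10,000,000,000 pages x 5 ratings = 50,000,000,000 ratings per year;
# one rater manages 100 x 300 = 30,000 ratings per year.
print(raters_needed(10_000_000_000, 5, 1, 100, 300))  # 1666667
```

Plugging in a smaller number of items to rate, or a lower number of ratings per item, shrinks the headcount proportionally.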
He doesn't have to rate pages - he can rate sites. And then maybe 2000 people could do the trick.
But: there's money in the game, so there will be many wolves out there pretending to be goats - and making sure their sites don't "suck".