I guess it's too early to cast a real judgement as this is obviously a work in progress. It will be interesting to see if they can build the community that this project needs. I for one wouldn't be keen to invest time in helping Mr. Wales get rich with advertising.
A quick comparison to Google:
* both engines cannot count. A wikia search that returned 3 results said "Results 1-5 of approximately 5 for [..]". At the bottom there is a button "Results 4 to 13" which doesn't return any extra results.
* wikia does not put wikipedia articles at the top of most searches like Google does.
Another thing is the use of the # character in the URL to indicate the query string. They're thereby ignoring the widely accepted standard of using # to indicate page fragments and the question mark (?) to indicate query strings.
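To illustrate the distinction, here is a minimal Python sketch using the standard library's `urllib.parse`. Both URLs are hypothetical examples (the host `search.example.com` and the parameter `q` are assumptions), not Wikia's actual URL layout.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URLs for illustration only; not Wikia's real URL scheme.
conventional = "http://search.example.com/search?q=nutch"
hash_style = "http://search.example.com/search#nutch"

for url in (conventional, hash_style):
    parts = urlsplit(url)
    # parts.query holds what follows '?'; parts.fragment holds what follows '#'.
    print(url, "| query:", parse_qs(parts.query), "| fragment:", repr(parts.fragment))
```

A standard URL parser sees no query at all in the second form. Browsers also do not send the fragment to the server in the HTTP request, so a hash-style search has to be handled by script in the page.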
Disclaimer: I am a "competitor", though I do not view them as "competition".
I also find it amusing that there is no big public credit given to Nutch on their site - despite its shortcomings Nutch has certainly got far more search features than any WWW scale search software developed by Wikia itself, which to me appears to be non-existent. Naturally giving public credit to them would probably take some gloss off what Wikia search is. Or maybe they just did not think of it as the right thing to do - either way, when you base your main work on someone else's (freely taken), it is fair to give them credit.
Ok, there are a fair few examples where some company takes source code and improves it in such a way that a new great product is created - for example Half-Life (game) was created using Quake 2 engine, however it was very highly modified so much that the final work was awesome. Right now it is not clear what modifications (if any) were made to Nutch, but I suppose we will find out soon.
To sum up my mixed feelings: I think there are some great new, well implemented search engine projects (I don't mean us here - Powerset, say, is very interesting), yet they won't get a tenth of the publicity that Wikia search had received even before they had shown anything. Life is so unfair, eh?
It's obviously very early days. There are no advanced options, so it's not even clear what the advanced operators are. Is there a site: function, define: or link:?
|We are aware that the quality of the search results is low. |
And they are. This project has a long way to go before it's worthy of our attention. Because who knows whether it will ever get to the stage of rivalling any of the other major engines in quality or traffic? And if it doesn't, time spent trying to rank on it will be time wasted.
Yikes. I did a few keyword searches for my niche. The top spammer/scraper sites showed up #1 and #2 for all the searches I did. The top 3 "legit" sites in my industry were not on the first page.
This is where engines like Yahoo and Google have a major advantage. They know the history of the web for many years and can see patterns of abuse that a newcomer will need to learn. It may take years for them to catch up, if ever.
It is broken for me: I tried to add a profile picture. When I did a search for a common keyword I got no results, so it again depends on articles submitted by others.
I gave up as it took longer than a few seconds to load the results. And as someone else pointed out, this is a profit venture for Mr Wales. Does he seriously expect people to help him for free?
|I guess it's too early to cast a real judgement as this is obviously a work in progress. |
It is never too early to cast a real judgement, especially in this instance.
I performed 10 searches, the ones that I would perform on Google, Yahoo! and Live periodically just to see how close the three are with their algos. Out of those 10 searches, not one of the sites I normally see in those top 10 results is there. In fact, what I do see is quite a few results in Chinese and Russian. For one search, 7 of the first 10 were in "other characters" besides English.
Nothing to see here. I guess we'll all keep waiting for the "next" search engine to arrive on the scene, it surely won't be this one from the looks of it.
Oh I feel sorry for these guys -- right now it is just a terrible place to get anything done at all. I admire people with ambitions but sometimes it is hard to distinguish a fool from a visionary.
The results were terrible for everything that I searched on--not just worse than Google, but far worse even than Yahoo or MSN. I'm a fan of Wikipedia, but creating a spidered search engine is a whole different kettle of fish than creating the infrastructure for a user-edited encyclopedia.
|I also find it amusing that there is no big public credit given to Nutch on their site |
If they are using nutch, then they won't get far as many block this UA. Better come up with their own UA.
|If they are using nutch, then they won't get far as many block this UA. Better come up with their own UA. |
I think this is the least of their worries as Nutch allows change of UA.
The issue really is not to crawl billions of pages, but to actually make sense of them to show Top 10 very good results from such a big index. This is the hard bit and Nutch itself does not solve it - one would have assumed that this is exactly where Wikia can help by throwing money they raised from investors into exactly this kind of work, but so far it seems that Wikia just took Nutch and added some GUI on top of it.
A fancy new GUI does not win the market - A9 (from Amazon, which incidentally is listed as an investor in Wikia) used Google's code and database (under license), yet even though they had no problem with relevancy, they had a problem of differentiation - if people get no better results than on Google then they won't switch, no matter how fancy the GUI is.
Obviously this will be interesting to watch since it will give many folks an insight into just how hard it is to create a great search engine in today's environment.
If you look to MSN, clearly they still have pretty big problems and they launched nearly 2 years ago.
The nice thing about this particular launch is that win or lose, it will keep the existing search engines on their toes.
It looks like search engine amateur hour has now entered the spring season.
It uses GRUB running out of swlabs.org to get their index:
18.104.22.168 “Grub/2.0 (Grub.org crawler; [grub.org...] email@example.com)”
Now at least you can block it if you want.
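For anyone who wants to try, a minimal sketch of a robots.txt rule and a way to check it locally with Python's standard `urllib.robotparser`. It assumes the crawler honours robots.txt and matches the token `grub`, neither of which is confirmed in this thread; the example.com URL is a placeholder.

```python
from urllib.robotparser import RobotFileParser

# Assumption: the crawler obeys robots.txt and identifies itself with the token "grub".
ROBOTS_TXT = """\
User-agent: grub
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

page = "http://www.example.com/page.html"  # placeholder URL
grub_allowed = parser.can_fetch("Grub/2.0 (Grub.org crawler)", page)
other_allowed = parser.can_fetch("Mozilla/5.0", page)
print("grub allowed:", grub_allowed, "| others allowed:", other_allowed)
```

Keep in mind that robots.txt is only advisory. A crawler that ignores it would have to be blocked at the server by user agent or IP instead.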
After the Wii, which I think is the only games console making a profit on the console sale, I'll never write off anything. I've just done a few searches on wiki search and the results are truly rubbish. But the right hand side, showing users associated with the search, is a bit intriguing. Going back to the Wii analogy, they have just one unique feature, the hand control. If wiki come up with a similar surprise feature, who knows.
One search I did showed both my retail site in the US and the site for a photo processing lab in Uganda in the top 10. If I had to pick two things that were less related to one another, I don't think I could.
- Cache pages crawlable, indexable
- Bugs in counting the results
- Most stupid URLs ever
- Fully packed with spam and MFA *prior* to launch. They didn't even have to aim to be there
- Hit the same result-extend buttons to add the same sets many times
- Open source SE? ( Nutch )... is that like an MMORPG where SPAMmers and MFAs battle webmasters and Wiki users?
- Doesn't rank my sites so I don't like it *grin*
- p*rn results for travel phrase. Whoa... how did those f@antasy R-@pe listings get there?
- No understanding of languages whatsoever
- More p0rm... oh geez! I have my filter set on strict for a reason!
- Oh! These nifty little numbers... NOW I think I get the relation...
SO... are Wikia's ...*cough*... wide open so that its Nutch can be seen in all detail on PURPOSE? I mean the 'breakdown of scoring' pages, "boost" factor... letting everyone know about the intimate settings of their version of the Nutch Web Search Engine [lucene.apache.org].
- Meanwhile on Wikipedia: "Thank you for donating" ( phew, so they did gather the funds, what a relief! )
- SLOW Slow slow slo oyaaaaaawhn..... *zzzzzzzzz*
*dreaming that somehow... somehow after
sorting out fraudulent sites, spam, MFA and prom
sites Wikia WILL get a fair share of the SE market
even with its buggy little version of an open source SE*
[edited by: Miamacs at 10:43 pm (utc) on Jan. 7, 2008]
Wikia Search is pants.
|* wikia does not put wikipedia articles at the top of most searches like Google does. |
The first thing I noticed too and it is a really good feeling to use a search engine where wikipedia entries don't push other articles to the third position or even lower.
But that is about the only positive I can say. One niche specific keyword returns 1550 entries on Google and 1280 on Yahoo, but only 4 on Wikia. Those 4 are all from my websites, so actually I should be happy with this result ;) but I am not. According to this figure the reach of the crawler seems to be only 0.3% of the reach of the big boys.
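The arithmetic behind that percentage, as a quick Python check. It uses only the result counts quoted above and treats a single keyword as a rough proxy for crawler reach, which is of course a big assumption.

```python
# Result counts for one niche keyword, as quoted in the post above.
google, yahoo, wikia = 1550, 1280, 4

# Wikia's result count as a percentage of each of the larger engines.
share_vs_google = wikia / google * 100
share_vs_yahoo = wikia / yahoo * 100

print(f"{share_vs_google:.2f}% of Google, {share_vs_yahoo:.2f}% of Yahoo")
```

Both figures come out at roughly 0.3%.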
A keyword for another niche returns webpages in all types of languages: The first ten were: English, Russian, French, Korean, English, English, Russian, Czech, English, English.
It seems that they want to return a result in the native language of every volunteer working on the wikia project :) but that is not exactly how it should work. Especially not because the best websites for this topic are in German (not mentioned at all) and English language and certainly not in the other languages mentioned.
So much to do, or probably just another DOA project.
It is an Alpha index apparently. So it is very far from a real index.
Even so, the whole link based approach to search engine crawling seems to be a 1998 problem rather than a 2008 one. I think that a link based strategy is, given the state of the modern web, broken.
It may be interesting when it gets a real index but I think that it requires a lot of time to develop.
|I think that a link based strategy is, given the state of the modern web, broken. |
There is no viable, algorithmically scalable alternative to it - it may have been possible to avoid using it in 1998, but in 2008 there is way too much data to avoid discriminating between results based on links when the number of matches is significant, which in most cases it is.
Tried voting five stars for one of my sites (which did well in couple of searches; some odd sites in the mix), only to learn that feature not working yet. sob :(
I trust Wikia search folks will scan through this thread; then they can draw up a to-do list and fix the troubles in a jiffy...
Why even release something like that? Even in Alpha form. They are doing more harm than good. You don't throw something like that to a pack of wolves like us.
I just did five more searches for popular phrases I track. Man, someone sure figured out how to sub-domain spam your index. Nine out of the ten results are from the same root domain.
I say leave search up to Google for now. Take your alpha offline and go back to the drawing board. When you think you are ready, go back to the drawing board again. :)
From the Wikia Search about page:
|Wikia is working to develop and popularize a freely licensed (open source) search engine. |
Popularize it by not mentioning its name anywhere?
Joking aside, I really hope this thing works well. The idea behind it is warm and fuzzy.
I was toying with the idea of using Amazon Web Services' (Alexa's) search results to build a similar Wiki style search engine. At very least I'd have started off with much better search results. Alexa doesn't give great search results, but at least they're usable.
I've put the idea on hold for now because I realized it's such a cliff hanger. Wikia will need a huge number of voting users to actually make a difference, but how are you going to get enough users to use a search engine that sucks? How many votes are they going to need to rank 12+ billion pages in the say 10 billion most common search phrases? Even if you can get a massive following of voting users, will they ever be able to compete with Google's years of algorithm development along with their search and "Google Analytics" data?
At the end of the day, for there to be another Google style (search only, no existing web portal) rise to fame for another search engine, it's going to have to be as revolutionary as Google was back when it started. How is a search engine that states that "the quality of the search results is low" going to be anything near revolutionary? Wikia can't be aiming for anything short of revolutionary search results. They need that critical mass. Will users really start a mass exodus from the big 3 just because Wikia search is warm and fuzzy? It's not like we're paying to use Google.
I did a few searches as well, they have a long way to go.
Two very small points, so going small
When is the donate button [wikimediafoundation.org] to the search engine coming, eh?
I hope they don't really read this thread anytime soon, or they might give up in a few days' time.
The initial implementation may indeed be amateurish, but the underlying concept is intriguing. Can a credible ranking mechanism really be powered primarily by kumbaya?
Interesting to see Wikipedia and Google squaring off again. Accomplishments aside, Wikipedia seems more true to their slogan ("Be Bold") than Google is to "Don't Be Evil".
Take that, Knol!
|Tried voting five stars for one of my sites |
I think that comment sums it up nicely.
Multiply that by zillions of other webmasters all voting for their own sites (understandable) and organised spam teams doing the same, and it's only a matter of time before the SERPs are as useful as a chocolate teapot.
I won't write it off just yet. I'm all for new projects and new ideas, but you can't reinvent the wheel. Should wiki search gain any traction whatsoever in the market, spam will be a major problem for it. Currently it doesn't have enough data to offer any search service, so I have no idea how long it will take to collect the base data before it even gets to sorting out the good from the bad.
Also, I just don't see anything different or anything that makes me think "wiki it" - it's just another search engine, only with less financial backing and market share than MSN or possibly even Ask Jeeves?
Well according to the interviews, it is aiming for a 5% market share of the search business. It is very much a work in progress but how it will work out is anyone's guess. The big difference from other alternative search ventures is that it is funded.
|The big difference from other alternative search ventures is that it is funded. |
I think the biggest difference is not that (a fair few search engine projects were funded very well), but the fact that this venture actually uses stuff that is developed by others (mainly for free, i.e. Nutch).
In a way this is like Red Hat of Linux, only Red Hat has added value - insurance (tech support).
And it wants people to work for free too.
|I think the biggest difference is not that (a fair few search engine projects were funded very well), but the fact that this venture actually uses stuff that is developed by others (mainly for free, i.e. Nutch). |
I am still trying to work out if social networking can be applied to search. It looks like it is taking the idea of an edited directory and applying it to search. However when searching people do not necessarily want to know what people think of a site or a topic. They want to get to the relevant site(s) as quickly as possible. That's the fundamental test for the quality of any search engine.
|In a way this is like Red Hat of Linux, only Red Hat has added value - insurance (tech support). |