Why is that? It seems such an obvious thing to do that I was curious to know why it hasn't happened.
A few barriers do present themselves. I can think of the following, though I suspect there are others.
1) A search engine needs a *lot* of CPUs and tons of bandwidth. Big problem, this. It could be at least partially solved by using a SETI@home-style system: your machine, when idle, wanders off and spiders a few pages. Perhaps a decentralised system for storing the spider's results could be devised?
2) An open source search engine would be wide open to being gamed by SEOs. That's true, but then the closed source ones are too. One of the great things about open source is the network effect of loads of people collaborating. Make every user a collaborator. Hand-edit heavily gamed SERPs. Shuffle search results so there is no such thing as a number 1 ranking. Purely algorithmic page ranking is doomed to die in a continuous, unwinnable SEO arms race.
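To make the "shuffle search results" idea in point 2 a bit more concrete, here is a rough Python sketch. It assumes each page already has a positive relevance score from somewhere, and it orders the pages randomly but with a bias towards higher scores, so a strong page usually lands near the top without owning the number 1 slot. The function name `shuffled_ranking`, the example URLs and the scores are all made up for illustration, and the weighted ordering used here (the Efraimidis-Spirakis key trick) is just one possible choice, not something any particular engine is known to do.

```python
import random


def shuffled_ranking(scored_pages, rng=None):
    """Return URLs in a random order biased towards higher scores.

    scored_pages: list of (url, score) pairs, every score > 0.
    Each page draws u uniformly from (0, 1] and gets the key
    u ** (1 / score). Sorting by key, largest first, yields a
    weighted random permutation: a page with score s comes first
    with probability s / (sum of all scores).
    """
    rng = rng or random.Random()
    keyed = []
    for url, score in scored_pages:
        if score <= 0:
            raise ValueError("scores must be positive")
        u = 1.0 - rng.random()  # random() is in [0, 1), so u is in (0, 1]
        keyed.append((u ** (1.0 / score), url))
    keyed.sort(reverse=True)
    return [url for _, url in keyed]


if __name__ == "__main__":
    # Hypothetical scores for three made-up pages.
    pages = [("a.example", 5.0), ("b.example", 3.0), ("c.example", 1.0)]
    for _ in range(3):
        print(shuffled_ranking(pages))
```

Run it a few times and the order changes on each query: "a.example" should come first a bit more than half the time, but nobody can claim a permanent top spot.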
Any more?
[Disclaimer: for the more cynical members on here, no, I have no plans to create a community SE. It is just interesting that open source hasn't entered such an obvious space.]
I don't believe they're that far from releasing beta version 0.6. There are even a few public search engines using Nutch that have already gone live.
More details on the Nutch Wiki:
[nutch.org...]
Creating a crawler and search engine isn't a trivial undertaking. The Nutch engine, even in a fairly early state, has been tested on a single machine at rates of up to 20 transactions per second.
Holmes