Forum Moderators: bakedjake
You're pitching to a bunch of tech-savvy people by asking here. However, many "Joe Publics" like my family, for example, rely on my judgement on which search engine to use. I say this, they use "this". I say that, they use "that". Because I work in the industry, they trust my opinion.
Despite magazine articles, news stories or commercials - that's more than likely what they'll always use.
The question, perhaps, is better put: how does the average user decide what they want in a search engine?
For that, I've got no idea, but it's a good idea to follow the leader, so to speak, and put polish on the veneer laid before you instead of reinventing the wheel.
There is an entire industry based around the "faith" of search engine optimization, as tons of people have their own beliefs about how things work.
My search will be the first with a published spec on exactly how things work, with the philosophy that an open architecture with open minds and public scrutiny can strive towards relevant results at a much better pace than anything proprietary.
Sure, spammers will be able to see how to optimize, but I think in the long haul, with the openness of the system, it will be easier to develop a process that doesn't lend itself to the spammability of some of the proprietary systems.
Sorry, "SEOs" out there. I'm here to put you out of business, or at least put some science and academia back into search rather than blind faith :)
Nutch is one of these. So it seems that you're doing a "me too".
Given that your idea sounds noble, what's in it for you?
Though, really it sounded like you wanted to know what people want in a search engine - "people" don't care what's under the hood - only that it gives relevant results.
For me, and other "SEO" type people, we care what's under the hood because we love technology, and a lot of "us" are the type that took apart the electrical gadgets around the house when we were 3, 4, or 5 just to see how the parts fit together.
Though really, if you are going to do something novel and have an open architecture, I'd start with some fuzzy logic, an adaptive control system based on fuzzy logic, a neural network to analyze the relationships & concepts that make language more than just parsed tokens, and the WordNet database (as it's free) to give you the basics of your semantic technology.
For the popularity element of the algorithm (because there has to be one) do some localized vector computation based on different types of linkage, contextual elements, etc - that way, it's not pagerank, it's not Teoma's model, but completely dependent on the vector at hand, instead of a massive normalization scheme that can take too much processing power for a start up.
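To make that a bit more concrete, here's a rough sketch of what I mean by a localized score. It's only an illustration under my own assumptions: the link types, the weights, and the function name `local_popularity` are all made up for the example, and a real engine would need far more than this. The point is just that the score is computed over the pages matching the query at hand, using only the links among them, rather than over the whole web graph.

```python
# Hypothetical weights per link type; these values are assumptions for
# illustration only and would need tuning against real data.
LINK_WEIGHTS = {"editorial": 1.0, "navigational": 0.3, "footer": 0.1}


def local_popularity(candidates, links, weights=LINK_WEIGHTS):
    """Score each candidate page using only links among the candidates.

    candidates: iterable of page ids matched by the current query
    links: iterable of (source, target, link_type) tuples
    Returns a dict mapping page id to a score normalized over the
    candidate set only (no global normalization pass).
    """
    candidate_set = set(candidates)
    scores = {page: 0.0 for page in candidate_set}

    for source, target, link_type in links:
        # Skip self-links and links that leave the local neighbourhood.
        if source == target:
            continue
        if source in candidate_set and target in candidate_set:
            # Unknown link types contribute nothing.
            scores[target] += weights.get(link_type, 0.0)

    # Normalize within the candidate set so scores sum to 1 when any
    # link weight was found; otherwise leave every score at zero.
    total = sum(scores.values())
    if total > 0:
        scores = {page: score / total for page, score in scores.items()}
    return scores


if __name__ == "__main__":
    pages = ["a", "b", "c"]
    page_links = [
        ("a", "b", "editorial"),
        ("c", "b", "footer"),
        ("b", "a", "navigational"),
        ("x", "a", "editorial"),  # source is outside the candidate set
    ]
    print(local_popularity(pages, page_links))
```

Because the work is bounded by the size of the result set, the cost stays small per query, which is the attraction for a start-up without the hardware for a global computation.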
:) Those are my ideas on your backend - what are yours?
So I guess it is a "me too" :)
We're abstracting out the Google API on top of Nutch, integrating Carrot2, and working on supporting more features (and of course giving the code back to the community).
And believe it or not, people like good results. More than "SEO" people give them credit for.
Sure, they may not care how it works, as far as clicking on "explain" and understanding what the heck they see. However, they DO care that it gives good results. Being a good search engine is about catering to both markets: without people finding good results you don't have any traffic, and without the webpages to index and publishers working on quality content you don't have any data.
But I favour portal sites over pure databases any day.
If you have no good results, people won't use it. If an engine is too slow, people won't use it.
And if they can't work out how to use it - it's doomed.
Sometimes all this techno babble just confuses people.
----------------------
You've got Search Engine lovers. Portal Lovers. The Rest?
I think they just can't make up their minds about which one to use. They are probably blinded by science and pretty colours!
I think the most important thing to remember is - life is too short to debate over who's the best, as none of us will ever own these great gods of reference.
Debating can be fun - but is ultimately self-defeating, as a means of accomplishing anything.
"We don't win once in a while, we don't do the right thing once in a while, we do the right thing all of the time.
Winning is a habit - unfortunately, so is losing"