Forum Moderators: phranque
I mean the bad sites. How could you ever devise an algo that could clean up the SERPs? Not all the sites are using duplicate content. It doesn't take too much imagination to figure out ways to churn out lots of pages without lifting someone else's text. I don't even think that it's possible to filter out the bad sites without somehow getting user input.
One thing I'd like to see used more is people's bookmarks. New services that let you share them with your friends could be mined as recommendations. Bloggers that share links can be another source, as would be fora like Google Answers and WW.
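To make the bookmark idea a bit more concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the function name rank_sites_by_bookmarks, the (user, url) input format, and the sample data are all made up, and no real bookmarking service is being described. It counts how many distinct users bookmarked each host, so one person saving a hundred pages from the same site still only counts once.

```python
from collections import defaultdict
from urllib.parse import urlparse


def rank_sites_by_bookmarks(bookmarks):
    """Rank hosts by the number of distinct users who bookmarked them.

    bookmarks: iterable of (user_id, url) pairs (hypothetical input format).
    Returns a list of (host, distinct_user_count), highest count first,
    ties broken alphabetically by host.
    """
    users_per_host = defaultdict(set)
    for user_id, url in bookmarks:
        host = urlparse(url).netloc.lower()
        if host:  # skip malformed URLs with no host part
            users_per_host[host].add(user_id)
    counts = ((host, len(users)) for host, users in users_per_host.items())
    return sorted(counts, key=lambda pair: (-pair[1], pair[0]))


# Made-up sample data: alice bookmarks two pages on the same site.
sample = [
    ("alice", "http://example.com/a"),
    ("alice", "http://example.com/b"),
    ("bob", "http://example.com/a"),
    ("bob", "http://example.org/"),
    ("carol", "http://Example.org/x"),
    ("carol", "http://example.net/"),
]
ranking = rank_sites_by_bookmarks(sample)
```

Counting distinct users rather than raw bookmarks is only a first line of defense. Anyone who can register lots of fake accounts could still inflate a site, so a real system would need some way to weigh how much each user is trusted.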
I'm sure someone will think of a deceptively simple way to do this or use another form of metadata. It might be Google, or it might be some new upstart. These days I think there would be room for another player.
How could you ever devise an algo that could clean up the SERPS?
Convince the world that having almost all access to web based resources filtered through a tiny handful of companies seeking to find ways to maximize their incomes in whatever way possible is not a good idea.
Get major funding for an open source search engine, ideally through a mix of business and government funding. Make the algos as bulletproof as possible. Obviously making your algo open source is a challenge, since the spammers have access to it, but that would also be its strength: none of this guessing about what's wrong or how to fix it. You'd know, and it could get fixed.
There are small projects like this, but none have the funding to create the huge server farms needed to actually actively spider the web.
I see this as the next major goal for the open source movement now that Linux is pretty much here to stay.
I don't know. All my sites get hit by Gigabot. I think redundant storage and returning SERPS is why Google has all the servers. I built little spiders with Perl and LWP that were lightning fast.
An open source search engine is probably a great idea if there was some way, maybe with a browser plugin, to gather information on what sites users stay on for a long time, but scripts could probably be written to fool that too.
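As a rough illustration of the dwell-time idea, and of why scripts are a problem, here is a small Python sketch. The function name site_dwell_scores, the (user, host, seconds) record format, and the min_users threshold are all assumptions made up for this example, not anything an actual plugin reports. It takes each user's median time on a site, then the median across users, so one script reporting huge times from a single account barely moves the score.

```python
from collections import defaultdict
from statistics import median


def site_dwell_scores(visits, min_users=2):
    """Score each host by the median of per-user median dwell times.

    visits: iterable of (user_id, host, seconds) records (hypothetical format).
    Hosts seen by fewer than min_users distinct users get no score.
    """
    times_by_host = defaultdict(lambda: defaultdict(list))
    for user_id, host, seconds in visits:
        if seconds >= 0:  # ignore obviously bogus negative durations
            times_by_host[host][user_id].append(seconds)

    scores = {}
    for host, users in times_by_host.items():
        if len(users) >= min_users:
            # Each user contributes one number, however many visits they log.
            scores[host] = median(median(times) for times in users.values())
    return scores


# Made-up sample: "bot" reports hour-long visits to a site real users leave fast.
visits = [
    ("u1", "good.example", 120),
    ("u2", "good.example", 180),
    ("u3", "good.example", 150),
    ("bot", "spam.example", 3600),
    ("bot", "spam.example", 3600),
    ("u1", "spam.example", 5),
    ("u2", "spam.example", 7),
]
scores = site_dwell_scores(visits)
```

This only blunts a script running under one identity. A script that fakes many user IDs would still fool it, which is exactly the weakness mentioned above.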
But given that the new MSN, Yahoo, and Google serve up pretty similar results, I'd say it's really not all that hard to write a decent search engine algo. Not easy, but not impossible.
Having your only real access to the web come through for-profit corporations, I don't know, there is something really troubling about that. There should at least be one real, viable alternative to help force things towards a bit of honesty.