@Reno, the problem is: who has the cash to crawl petabytes of data a day?
Writing an algo is easy; creating an index of the web is f*cking hard. I'm not sure I see an easy solution, unless someone creates a community-driven, open-source, distributed crawling project. Either that, or the EU competition commission forces them to open up their index!
Facebook is the best challenger IMO. They must be building up an incredible data set from all the Open Graph / Like button implementations.