Microsoft tries to one-up Google PageRank


httpwebwitch - 6:18 am on Jul 26, 2008 (gmt 0)


I finished reading through it - all of it. The BrowseRank algorithm is a thing of beauty, and their methods are brilliant. This may not rock the world, but it may finally give Microsoft a pretty decent search engine.

As for the spying, I suspected Google of doing this with their toolbar a couple of years ago, but I never found evidence. My reasoning was highly conspiratorial, in seven points:

1) it's possible
2) collectively, they are very smart
3) a smart person would figure this out
4) it would make their SERPs more relevant
5) they would benefit from it
6) they have the means to do it
7) no one would know

If my suspicions are correct, Microsoft has the IE browser doing their spying and sending session behaviour data back to their data centers, which gives them vastly more reach than the limited number of people running the Googlebar (and significantly higher adoption than Alexa, Stumble, and other toolbars).

So where'd they get the data?

Same source, page 5:
We used a user behavior dataset, collected from the World Wide Web by a commercial search engine in the experiments. All possible privacy information was rigorously filtered out and the data was sampled and cleaned to remove bias as much as possible. There are in total over 3-billion records, and among them there are 950-million unique URLs.

Page 6:
we also obtained a large dataset from the same search engine, containing 8000 queries and their associated webpages.
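
For what it's worth, here is a rough sketch of how records like these could be boiled down into the two things a browsing-based ranking would presumably care about: how often people go from one page to another, and how long they stay. The record layout (user, URL, timestamp), the 30-minute session cutoff, and the `summarize` helper are all my own assumptions for illustration; the excerpts above don't spell out the real schema or processing.

```python
from collections import defaultdict

# Hypothetical record layout: (user_id, url, unix_timestamp).
# The quoted excerpts do not describe the real schema.
SESSION_GAP = 30 * 60  # assumed: 30 minutes of silence ends a session


def summarize(records):
    """Return (transition counts, observed dwell times) from click records."""
    by_user = defaultdict(list)
    for user, url, ts in records:
        by_user[user].append((ts, url))

    transitions = defaultdict(int)  # (from_url, to_url) -> count
    dwell = defaultdict(list)       # url -> seconds until the next request

    for visits in by_user.values():
        visits.sort()  # order each user's requests by time
        for (t0, a), (t1, b) in zip(visits, visits[1:]):
            gap = t1 - t0
            if gap > SESSION_GAP:
                continue  # treated as a new session: no transition recorded
            transitions[(a, b)] += 1
            dwell[a].append(gap)
    return dict(transitions), dict(dwell)


# Made-up sample data, purely for illustration.
records = [
    ("u1", "a.example/", 0),
    ("u1", "b.example/", 40),
    ("u1", "a.example/", 100),
    ("u2", "a.example/", 10),
    ("u2", "b.example/", 30),
    ("u2", "c.example/", 5000),
]
transitions, dwell = summarize(records)
```

Note that the "dwell time" here is just the gap between two consecutive requests, which is about all a plain request log can tell you.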

The data they use seems to consist of session requests, sort of like server log files. But if they are using IE to spy on people, they can get more than merely a log of HTTP requests. Once you start snooping on and recording people's browsing sessions, why stop there? Surely you'd glean interesting data from other browser behaviour, such as:

1) time spent with the browser window or tab focused
2) keystrokes per page
3) on-page interaction, like interaction with Flash or Media players
4) mouseovers, mouseouts, focuses and blurs
5) pages people put in their Favourites or Bookmarks
6) words people enter into forms
7) pages that are open simultaneously in tabs
8) sites that people tend to keep open in a tab all day
9) pages that do a lot of AJAX async requests
10) pages hiding behind authentication
11) names of people you know
12) your address, phone number, shoe size, bank account balance, sexual fantasy preferences...

need I go on?


Thread source: http://www.webmasterworld.com/msn_microsoft_search/3707418.htm
Brought to you by WebmasterWorld: http://www.webmasterworld.com