Think about it... the Google Toolbar, Google Analytics and click monitoring on the SERPs give Google an incredible picture of where people are going, what pages they stay on, what sites they frequently return to and where they go when they leave.
We know that Google is pushing the toolbar onto consumers. They're paying Dell a billion dollars to install it onto 100 million consumer PCs. Imagine what the behavior patterns of 100 million Internet users could tell Google about a particular site's value.
What scares me is that this will push the blackhats from link spamming over to the busy spyware world. Imagine if I could pay some shady company to have the web browsers of 100,000 PCs randomly click on my #10 ranked link and stay on my site until Google decides that I should be #1. Who cares if these users buy anything on my site? I just want Google to THINK that they're using it. Will Google start bundling anti-spyware with the toolbar to stop this?
Am I on to something, or has this been going on for years?
[edited by: tedster at 8:38 pm (utc) on April 6, 2006]
"Leverage implicit and explicit user feedback to improve popular and nav queries"
What do you think they mean by "implicit user feedback" other than their clickstream analysis?
If you deserve a zillion links, you'll have a bunch of traffic following them. If the links are all on your (and your friends') sites, or hidden, the traffic won't be there and you'll stand out like a sore thumb waiting to get whacked by the algo.
Instead of basing everything on traffic, think of it like a pesky little sister ratting you out to mom....
Second, it requires a ton of computing power to keep track of all this and store it in a database. If the people who have the Google Toolbar surf for an average of 35 minutes a day, just imagine how many billions of pieces of data that would be per day. It could be even more depending upon what type of info they seek to store. Also, just look at Alexa. A toolbar by itself cannot provide useful enough info nor effectively help organize a gigantic index. The toolbar is only installed by certain types of users and thus presents a number of statistical problems.
I just think that when you get down to it, some of this can be used to effectively prune the index but I think this is only a small piece of many layers google uses to accurately filter and interpret information.
1) Dell computers ship with network cards
2) Network cards each have a unique MAC address
3) Dell, if they wanted, could keep track of MAC addresses per computer
4) Dell can keep records of customer name, address and so on
5) Google gives Dell a billion bucks
6) Dell not only installs the Google Toolbar but gives Google customer info, including the unique MAC address
7) Google now not only knows what sites a user visits, but could possibly know their name, how much they spent on their computer, where they live, what software options they have installed, and so on.
So Fred in Idaho just bought a $3,500 Dell but didn't opt in for anti-virus software. Google also sees that Fred has chosen a dial-up ISP. So Google starts by showing Fred ads for broadband in his area, some good anti-virus software, and some cheap vacations near Idaho.
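Purely as illustration of the join the list above speculates about (browsing records keyed by MAC address matched against a customer table), here's a toy Python sketch; every field, value and name is invented, and nothing here reflects what Dell or Google actually do:

customer_records = {
    # MAC address -> customer profile Dell could, in principle, keep
    "00:1A:2B:3C:4D:5E": {"name": "Fred", "state": "ID",
                          "price_paid": 3500, "isp_type": "dial-up"},
}

toolbar_hits = [
    # (MAC address, URL) pairs a toolbar could, in principle, report
    ("00:1A:2B:3C:4D:5E", "http://example.com/broadband-idaho"),
]

def joined_profile(mac, hits, customers):
    """Attach a customer's identity to their browsing history."""
    return {
        "customer": customers.get(mac),
        "urls": [url for m, url in hits if m == mac],
    }

print(joined_profile("00:1A:2B:3C:4D:5E", toolbar_hits, customer_records))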
I think Google needs to find a really good way to combat this issue before allowing it to have any major impact on SERPs.
Nielsen Ratings note [tv.yahoo.com]
"There are an estimated 110.2 million television households in the USA. A single ratings point represents 1%, or 1,102,000 households for the 2005-06 season. Share is the percentage of television sets in use tuned to a specific program."
Billions of $$$ worth of advertising are spent based on that word "share." This Dell deal will allow Google to speak to Madison Avenue suits on the same terms that TV execs currently do.
Additionally, a site's "authority/hub" status could be determined with far greater accuracy than is currently done. In other words, MySpace is a social authority/hub site simply because a few billion users say it is. (It sure isn't because of its content.)
Whatever it is, it underlines the basic open questions:
- What exactly does the toolbar do to the browser?
- Has anyone sniffed in detail what information is sent by the toolbar while surfing?
- If bandwidth and capacity do matter, one of the most valuable pieces of information to send would definitely be the fact that someone adds a website to his or her favourites.
I think it will eventually play a role in things, but I don't think a company that can't figure out how to handle a 301 has an algorithm based on user behavior patterns.
Since 99.999% of Google users have Google cookies allowed, they can use the tracking JavaScript they run off and on to see how users react to various SERP combinations, but only in the context of searches and returns to a search, which signals a failed SERP result.
Obviously Google wants to weed out the results that people don't use, that's just common sense, and cookie-based click tracking is how they do that.
The toolbar lets them track user behavior much more actively. Why anyone installs that toolbar is absolutely beyond me, to be honest, especially SEOs. Same for allowing Google cookies when you test SERPs.
It's not hard to track this stuff; the data is very simple: cookie ID + URL + search. It's fairly trivial, I would guess. Session tracking isn't exactly the hardest thing in the world to do, especially with the server resources Google has.
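That kind of record is simple enough to sketch. A minimal, hypothetical Python version, assuming a log of cookie ID + query + clicked URL (the ClickEvent and failed_results names are mine, not anything Google has described; a return to the SERP is inferred simply from a later click on the same query by the same cookie):

from dataclasses import dataclass
from collections import defaultdict

@dataclass
class ClickEvent:
    cookie_id: str    # per-browser cookie identifying the session
    query: str        # the search the user ran
    url: str          # the result they clicked
    timestamp: float  # seconds since epoch

def failed_results(events):
    """Flag clicks that were followed by another click on the same query
    from the same cookie, i.e. the user came back to the SERP because
    the result didn't satisfy them."""
    by_session = defaultdict(list)
    for e in sorted(events, key=lambda e: e.timestamp):
        by_session[(e.cookie_id, e.query)].append(e)
    failures = []
    for clicks in by_session.values():
        failures.extend(clicks[:-1])  # every click except the last was "returned from"
    return failures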
The question is not why would Google do this, it's why would they not do it? Yahoo and MSN do it; Yahoo always tracks with hard-coded redirects, no exception. I can't remember how MSN does it, but they all do it. Google just does it in a more subtle way; they even do custom browser-based tracking. As with most stuff in search, Google does it better than their competitors.
why would they not do it?
I am positive they use traffic patterns as part of their ranking algo, but since you asked...
While at the Meet the Engineers in New Orleans I asked several questions about this and was told that Google couldn't use click data directly because of a previous patent, but they were using the data in some other fashion (which he wouldn't explain) and they were working on more ways to use the data.
I don't have the time to look it up, but does anyone know who holds the previous patent?
Clearly Google would not have included the tracking component in the patent application if they hadn't felt that they had found a way to work around the patent the engineer referred to.
Good information, this has to be one of the very first times I've gotten a real answer to what was intended as a rhetorical question here, thanks!
Google gives Dell a billion bucks...Google now not only knows what sites a user visits
Google is definitely collecting data--and with the latest toolbar update they're probably collecting even more data (considering that when you turn on the PR option they give you an additional popup about privacy).
Having the Toolbar placed in 100 million consumer PCs would allow Google to determine site popularity
Didn't we use to call this "temporal clicking," as was used by Direct Hit and HotBot?
An iterative loop based on the users. I call it brilliant, as a check on the foundation they have already engineered. It's a heck of a number crunch, but they have the power to do it. Eventually, it may even be done close to real time.
IMHO, this will not supplant PageRank or linkpop, but will overlay it with the vote of the searching public, who by all means should have important input - not primarily other sites and webmasters, as was the case with previous algos.
Once again, it still comes down to relevance and content. Google will use the searching public to weed out less than stellar sites. (Language ambiguity and inept searches will need to be part of the "fudge" factor. I imagine a huge amount of sites caught in the crossfire on this.)
And once again, commercial sites will have to provide enough content for the information seekers to keep them on the site, to avoid triggering "irrelevance" penalties from those not interested in purchasing.
AND ...once again... Brett's concise two words show wisdom and insight.
"Florida update" :)
1. How many people went to that site
2. How long did people spend on that site (or page)

Without #2, #1 is meaningless.
And I would add at least #3 - How many went back to that site within xx days.
#3 and #2, in my mind, are the most important measures of a website's relevancy.
#1 is a measure of how well it is presently doing in the serps (or how well they are marketing across the internet), which, in many cases, is not a measure of quality/relevance.
If google focused weight on #2 and #3, I can't think of a downside, except for webmasters who spam and scrape - these guys would have to think about actually putting some effort into their sites, beyond SEO.
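For what it's worth, a hedged sketch of how #1, #2 and #3 might be combined into one score. The weights, the five-minute dwell-time cap and the relevance_score name are illustrative assumptions, not anything Google has published:

def relevance_score(visits, avg_seconds_on_site, return_visits,
                    w_time=0.6, w_return=0.4, time_cap=300.0):
    """Weight dwell time (#2) and return rate (#3); raw visits (#1)
    only normalise the return rate."""
    if visits == 0:
        return 0.0
    dwell = min(avg_seconds_on_site / time_cap, 1.0)   # #2, capped at 5 minutes
    return_rate = min(return_visits / visits, 1.0)     # #3 relative to #1
    return w_time * dwell + w_return * return_rate

Under those assumptions, a scraped site people bounce off (say 15 seconds average, almost no return visits) scores around 0.03, while a site people stay on and come back to scores near 1.0, regardless of how much raw traffic either one pulls.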
With this in mind, Google's move towards increasing use of traffic patterns and data supplied by consumers seems inevitable.
Vote buying is rampant in many countries. Influence is bought by lobbyists .. has nothing at all to do with democracy ...
Link buying is rampant in cyberspace. It's all about money and getting more so.
However, vote buying is impossible in a dictatorship. Hmm ..
1. How many people went to that site
2. How long did people spend on that site (or page)
Is there any relationship between ranking in the top 10 on matching search word & phrases and time on site?
People using search engines tend to spend less time on sites as they search around, compared to people going to a site via a direct link like a bookmark, email or newsletter, or a link from a related site.
So I would think the pages ranked highly would get shorter visits. Am I off base here?
What I'm leading up to is that pages at the top in the serps may actually lose if the serps are based on time on the site.
This will make it virtually impossible for the little guy to succeed. The top 10 results will be based on the well established sites.
Wrong.
A question nearer and dearer to me currently is: Does traffic influence Googlebot visitation frequency and, if so, to what degree?
e.g. A surfer searches for blue widgets.
Clicks on the first result. Doesn't find the info they are looking for. Goes back to Google and clicks on the second result. Finds the information they were looking for and doesn't continue looking through the rest of the results.
Thus Google knows that result 2 was a better match than result 1. If this trend continues, they may switch the results.
It is auxiliary to the normal algo, in that the normal algo would provide a starting point and then Google would "tweak" results based on the user interaction.
This system will help remove spammy sites and increase serp quality overall.
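A minimal Python sketch of that "tried result 1, came back, settled on result 2" adjustment. The counters, the swap threshold and all the names are invented purely for illustration; nothing here claims to be Google's actual algorithm:

from collections import defaultdict

# prefer[(query, better_url, worse_url)] counts sessions where the user
# abandoned worse_url (clicked it, came back) and then settled on better_url.
prefer = defaultdict(int)

def record_session(query, clicked_in_order, satisfied_url):
    """clicked_in_order: URLs in the order the user tried them;
    satisfied_url: the one where the search ended."""
    for abandoned in clicked_in_order:
        if abandoned != satisfied_url:
            prefer[(query, satisfied_url, abandoned)] += 1

def maybe_reorder(query, ranked_urls, threshold=1000):
    """Swap adjacent results when enough users preferred the lower one."""
    urls = list(ranked_urls)
    for i in range(len(urls) - 1):
        if prefer[(query, urls[i + 1], urls[i])] >= threshold:
            urls[i], urls[i + 1] = urls[i + 1], urls[i]
    return urls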