| This 107 message thread spans 4 pages |
|Google algo moves away from links, towards traffic patterns|
Does anyone else think that Google's actions over the last few years indicate a gradual change in the importance of traffic patterns over inbound links?
Think about it... the Google Toolbar, Google Analytics and click monitoring on the SERPs give Google an incredible picture of where people are going, what pages they stay on, what sites they frequently return to and where they go when they leave.
We know that Google is pushing the toolbar onto consumers. They're paying Dell a billion dollars to install it onto 100 million consumer PCs. Imagine what the behavior patterns of 100 million Internet users could tell Google about a particular site's value.
What scares me is that this will push the blackhats from link spamming over to the busy spyware world. Imagine if I could pay some shady company to have the web browsers of 100,000 PCs randomly click on my #10 ranked link and stay on my site until Google decides that I should be #1. Who cares if these users buy anything on my site? I just want Google to THINK that they're using it. Will Google start bundling anti-spyware with the toolbar to stop this?
Am I on to something, or has this been going on for years?
[edited by: tedster at 8:38 pm (utc) on April 6, 2006]
If we take a look at those slipped slide notes from the Google presentation, we find:
"Leverage implicit and explicit user feedback to improve popular and nav queries"
What do you think they mean by "implicit user feedback" other than their clickstream analysis?
Instead of basing ranking on traffic, imagine them using it in a different manner. They'll keep as primary elements what has brought them so far, so linking and other relevancy factors will remain. Now they can start at the sites with the most links and work down, saying "OK, you've got 600,000 links and no visitors. Something smells rotten..."
If you deserved a zillion links, you'll have a bunch of traffic following them. If the links are all on your (& friends) sites, or hidden, the traffic won't be there and you'll stand out like a sore thumb waiting to get whacked by the algo.
Instead of basing everything on traffic, think of it like a pesky little sister ratting you out to mom....
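The "600,000 links and no visitors" check described above can be sketched in a few lines. This is only an illustration of the idea, not anything Google is known to run: the function name, the thresholds and the sample sites are all made up for the example.

```python
def flag_link_traffic_mismatch(sites, min_links=10_000, min_visits_per_link=0.01):
    """Return names of sites with many inbound links but almost no traffic.

    sites: iterable of (name, inbound_links, monthly_visits) tuples.
    Both thresholds are hypothetical values chosen for illustration.
    """
    flagged = []
    for name, links, visits in sites:
        # Only heavily linked sites are checked; small sites are left alone.
        if links >= min_links and visits / links < min_visits_per_link:
            flagged.append(name)
    return flagged


# Invented sample data: (site, inbound links, monthly visits)
sites = [
    ("popular-site.example", 600_000, 2_000_000),
    ("link-farm.example", 600_000, 40),
    ("small-blog.example", 120, 300),
]
print(flag_link_traffic_mismatch(sites))  # ['link-farm.example']
```

Traffic is used here only as a sanity check on the link count, which matches the "little sister" role: it does not rank anything by itself.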
I'm sure they already use some of that data to some extent, but I would argue that this only helps in extreme cases. In many industries, it's easy to get links which don't convert to traffic very easily. Think about how easy it is to get on a news feed, RSS, or some other form of syndication.
Second, it requires a ton of computing power to keep track of this and store it in a database. If the people who have the Google toolbar surf for an average of 35 minutes a day, just imagine how many billions of pieces of data that'd be per day. It could be even more depending upon what type of info they seek to store. Also just look at Alexa. A toolbar in itself cannot provide useful enough info or effectively help organize a gigantic index. The toolbar is only installed by certain types of users and thus presents a number of statistical problems.
I just think that when you get down to it, some of this can be used to effectively prune the index but I think this is only a small piece of many layers google uses to accurately filter and interpret information.
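A rough back-of-the-envelope version of that "billions of pieces of data" estimate, for anyone who wants to plug in their own numbers. The 35 minutes a day comes from the post above and the 100 million PCs is the Dell figure quoted earlier in the thread, used here as an upper bound; the one-page-per-minute rate is purely an assumption.

```python
def daily_page_views(users, minutes_per_day, pages_per_minute):
    """Estimate how many page views a toolbar panel would report per day."""
    return int(users * minutes_per_day * pages_per_minute)


TOOLBAR_USERS = 100_000_000  # Dell deal figure from the thread, treated as an upper bound
MINUTES_PER_DAY = 35         # average surfing time suggested in the post
PAGES_PER_MINUTE = 1.0       # assumption: one page view per minute

print(daily_page_views(TOOLBAR_USERS, MINUTES_PER_DAY, PAGES_PER_MINUTE))  # 3500000000
```

Under those assumptions that is 3.5 billion records a day before storing anything beyond the URL.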
Not sure why this hasn't been mentioned yet, probably because this is a webmaster forum and not a techie or conspiracy theory forum, but here goes,
1) Dell computers ship with network cards
2) Network cards each have a unique MAC address
3) Dell, if they wanted, could keep track of MAC addresses per computer
4) Dell can keep records of customer name, address and so on
5) Google gives Dell billion bucks
6) Dell not only installs Google toolbar but gives Google customer info including unique MAC address
7) Google now not only knows what sites a user visits, but could possibly know their name, how much they spent on their computer, where they live, what software options they have installed, and so on.
So Fred in Idaho just bought a $3,500 Dell but didn't opt-in for anti-virus software. They also see that Fred has chosen a dial-up ISP. So Google starts by showing Fred ads for broadband in his area, some good anti-virus software, and some cheap vacations near Idaho.
I think there is a huge number of Google software deals going on. Google is popping up as the default choice in dozens upon dozens of software packages. Stuff we aren't reading about in the main stream.
Old schoolers, raise your hand if you remember when SNAP.com (then owned by NBCi.com) used user behavior to affect SERPs. It was highly abused: people had bots that would click on their listing and never come back, just to get them ranked higher.
I think Google needs to find a really good way to combat this issue before allowing it to have any major impact ON SERPS.
> DirectHit anybody?
Was an 8th grade science fair project compared to Google's PhD-level work in this area...
Guys you just gave me a huge new idea!
Thank you =)
This is not the idea I was just flashed with, but if that'll be the case I'll try to do a mild DOS attack on my own sites after I figure out what kind of info is being sent back to google servers.
Having the Toolbar placed in 100 million consumer PCs would allow Google to determine site popularity in much the same way that the Nielsen Ratings are done. No? And with at least the same level of accuracy.
Nielsen Ratings note [tv.yahoo.com]
"There are an estimated 110.2 million television households in the USA. A single ratings point represents 1%, or 1,102,000 households for the 2005-06 season. Share is the percentage of television sets in use tuned to a specific program."
Billions of $$$ worth of advertising are spent based on that word "share." This Dell deal will allow Google to speak to Madison Avenue suits on the same terms that TV execs currently do.
Additionally, a site's "authority/hub" status could be determined with far greater accuracy than is currently done. In other words, myspace is a social authority/hub site simply because a few billion users say it is. (It sure isn't because of its content)
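The Nielsen arithmetic quoted above works out as follows. The household figure is from the quote; the function names and the sample numbers in the share example are illustrative only.

```python
TV_HOUSEHOLDS = 110_200_000  # estimated US television households, 2005-06 season


def households_per_rating_point(total_households):
    """One ratings point is 1% of all television households."""
    return total_households // 100


def share(tuned_to_program, sets_in_use):
    """Share: percentage of sets currently in use that are tuned to a program."""
    return 100.0 * tuned_to_program / sets_in_use


print(households_per_rating_point(TV_HOUSEHOLDS))  # 1102000
print(share(5_000_000, 50_000_000))                # 10.0 (made-up audience numbers)
```

The same two ratios could be computed over a toolbar panel instead of TV households, which is the comparison the post is making.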
Maybe OT: I reinstalled the toolbar in mid-February in order to observe expected changes around this Big Daddy thing. Ever since then, the little favicon graphics have vanished one by one from my favourites bar, and adding a new page doesn't create a new favicon. It's IE6. Could be a mere coincidence or a virus, but maybe someone else has noticed something similar.
Whatever it is, it underlines the basic open questions:
- What exactly does the toolbar do to the browser?
- Has anyone sniffed in detail what information is sent by the toolbar while surfing?
- If bandwidth and capacity do matter, one of the most valuable pieces of information to send would definitely be the fact that someone adds a website to his or her favourites.
Although I think they have probably tested things and possibly look at data to see how they can implement it, I think it is really difficult to believe that they are truly implementing this as a major factor in the algo. This is much easier to manipulate than any link-based algorithm.
I think it will eventually play a role in things, but I don't think a company that can't figure out how to handle a 301 has an algorithm based on user behavior patterns.
Obviously google wants to get out the results that people don't use, that's just common sense, and cookie based click tracking is how they do that.
The toolbar lets them track user behavior much more actively. Why anyone installs that toolbar is absolutely beyond me to be honest, especially SEOs. Same for allowing Google cookies when you test SERPs.
It's not hard to track this stuff, the data is very simple, cookie id + url + search, it's fairly trivial I would guess, session tracking isn't exactly the hardest thing in the world to do, especially with the server resources google has.
The question is not why would google do this, it's why would they not do it? Yahoo and MSN do it, yahoo always tracks with hard coded redirects, no exception, can't remember how msn does it, but they all do it. Google just does it in a more subtle way, they even do custom browser based tracking. As with most stuff in search, google does it better than their competitors.
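To show how little data "cookie id + url + search" really is, here is a minimal sketch of such a record and a per-query click tally. The record layout and all names are hypothetical; nobody outside the engines knows what they actually store.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ClickRecord:
    """One logged SERP click: who (cookie), what they searched, where they went."""
    cookie_id: str
    query: str
    url: str


def clicks_by_query(records):
    """Count clicks per URL for each search query."""
    counts = defaultdict(lambda: defaultdict(int))
    for record in records:
        counts[record.query][record.url] += 1
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {query: dict(urls) for query, urls in counts.items()}


# Invented sample log
log = [
    ClickRecord("c1", "blue widgets", "http://a.example/"),
    ClickRecord("c2", "blue widgets", "http://b.example/"),
    ClickRecord("c3", "blue widgets", "http://b.example/"),
]
print(clicks_by_query(log))
```

Three short strings per click is all the tally needs; the hard part is scale and filtering out fake clicks, not the data model.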
|why would they not do it? |
I am positive they use traffic patterns as part of their ranking algo, but since you asked...
While at the Meet the Engineers in New Orleans I asked several questions about this and was told that Google couldn't use click data directly because of a previous patent, but they were using the data in some other fashion (which he wouldn't explain) and they were working on more ways to use the data.
I don't have the time to look it up, but does anyone know who holds the previous patent?
I would guess that a very careful re-reading of the jagger patent application, which talked about tracking user behavior, will tell you just how they are looking at working around the other patent. And finding the other patent would probably shed a little more light on the jagger stuff.
Clearly google would not have included the tracking component in the patent application if they hadn't felt that they had found a way to work around the patent the engineer referred to.
Good information, this has to be one of the very first times I've gotten a real answer to what was intended as a rhetorical question here, thanks!
lammert you may be onto something.
I've wondered for several years why every now and again I get a 'traffic spike' for a day or so. Never connected it with the possibility of being released from the sandbox for a while.
|Google gives Dell billion bucks...Google now not only knows what sites a user visits |
Uh, I know this is all about Google, but hasn't anyone thought of the amount of data Microsoft has--after all, they happen to have an operating system that's installed on a heck of a lot of PCs. MSN doesn't need a toolbar to collect the data.
Google is definitely collecting data--and with the latest toolbar update they're probably collecting even more data (considering that when you turn on the PR option they give you an additional popup about privacy).
|Having the Toolbar placed in 100 million consumer PCs would allow Google to determine site popularity |
I think there are other ways for them to collect the data, including just watching which sites are visited more via their SERPs. And why has no one mentioned the infamous "google cookie"?
Why would they collect the data without the intent to use it?
With all those datacenters, surely they use some for testing, and probably rotate weighted factors using feedback from the toolbar to test them.
Didn't we used to call this "temporal clicking" as was used by direct hit and hotbot?
An iterative loop based on the users. I call it brilliant, as a check on the foundation they have already engineered. It's a heck of a number crunch, but they have the power to do it. Eventually, it may even be done close to real time.
IMHO, this will not supplant PageRank, or linkpop, but will overlay with the vote of the searching public - who by all means should have important input. Not primarily other sites and webmasters as was the case with previous algos.
Once again, it still comes down to relevance and content. Google will use the searching public to weed out less than stellar sites. (Language ambiguity and inept searches will need to be part of the "fudge" factor. I imagine a huge amount of sites caught in the crossfire on this.)
And once again, commercial sites will have to provide content enough for the information seekers to keep them on the site to avoid triggering "irrelevance" penalties from those not interested in purchasing.
AND ...once again... Brett's concise two words show wisdom and insight.
"Florida update" :)
|1. How many people went to that site |
2. How long did people spend on that site (or page)
Without #2, #1 is meaningless.
And I would add at least #3 - How many went back to that site within xx days.
#3 and #2, in my mind, are the most important measures of a website's relevancy.
#1 is a measure of how well it is presently doing in the serps (or how well they are marketing across the internet), which, in many cases, is not a measure of quality/relevance.
If google focused weight on #2 and #3, I can't think of a downside, except for webmasters who spam and scrape - these guys would have to think about actually putting some effort into their sites, beyond SEO.
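For what it's worth, the three measures (#1 visitors, #2 time on site, #3 returns within xx days) are easy to compute once you have a visit log. A minimal sketch, assuming a log of (visitor, day, seconds) tuples; the format, the function name and the 30-day default are all assumptions for illustration.

```python
from collections import defaultdict


def site_metrics(visits, return_window_days=30):
    """Compute visitors (#1), average visit length (#2) and return rate (#3).

    visits: list of (visitor_id, day_number, seconds_on_site) tuples.
    A visitor counts as returning if two of their consecutive visits fall on
    different days no more than return_window_days apart.
    """
    days_by_visitor = defaultdict(list)
    total_seconds = 0
    for visitor, day, seconds in visits:
        days_by_visitor[visitor].append(day)
        total_seconds += seconds

    returning = 0
    for days in days_by_visitor.values():
        days.sort()
        gaps = (later - earlier for earlier, later in zip(days, days[1:]))
        if any(0 < gap <= return_window_days for gap in gaps):
            returning += 1

    visitors = len(days_by_visitor)
    return {
        "visitors": visitors,
        "avg_seconds": total_seconds / len(visits) if visits else 0.0,
        "return_rate": returning / visitors if visitors else 0.0,
    }


# Invented sample log: visitor "a" returns after 4 days, "c" only after 59 days
sample = [("a", 1, 120), ("a", 5, 60), ("b", 2, 30), ("c", 1, 10), ("c", 60, 20)]
print(site_metrics(sample))
```

With the sample log this gives 3 visitors, an average visit of 48 seconds, and a return rate of one in three.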
Do links being emailed via gmail get noticed in any way? People sharing a link could help determine quality, too.
So - if the algo uses data from adsense ads, does this mean that it is no longer a good idea to wait a while with putting ads on new websites, but to get them on and collecting visit data as soon as possible?
One thing that's been hard to ignore lately is the rise and rise of social bookmarking. These sites rely entirely on user-provided data, and seem on the face of it easy to game. Yet the sheer volume of data from different people means that they produce surprisingly relevant results.
With this in mind, Google's move towards increasing use of traffic patterns and data supplied by consumers seems inevitable.
If this is true then sites with forums will get a boost, as users tend to spend a bit of time perusing them. For blackhats, I suppose it would be possible to fake or perhaps steal a large amount of content from other forums to fill one up.
It would be entirely possible for spyware to sneak onto a machine unnoticed and for it to send out traffic that appears to be surfing the web when in fact the machine looks and behaves totally normally to the user. For spyware makers there is an advantage in that it does not have to interrupt the user to do its job. In fact, the more unnoticeable it is, the better it is.
I see a future with botnets for lease. Have your site surfed from up to 100,000 real users' machines, at random times, with your site nicely intermixed with the users' natural surfing patterns.
I had another idea to exploit this.
Small ISPs could be persuaded (with money) to let you observe their traffic with a machine that you provide. Over time you could build a large number of real google cookies (or whatever identifier they use) matched with IP addresses and then use this info to spoof surfing activity to your own sites. Once again the spoofing is mixed with the normal surfers' activities, making it hard to detect.
There would be no impact on users' machines and so nothing illegal would be occurring. The only cost to the ISP is a trivial amount of extra traffic.
"links = votes
like in a democratic country. there is no better pattern. at least no one found it out, yet."
Vote buying is rampant in many countries. Influence is bought by lobbyists .. has nothing at all to do with democracy ...
Link buying is rampant in cyberspace. It's all about money and getting more so.
However, vote buying is impossible in a dictatorship. Hmm ..
|1. How many people went to that site |
2. How long did people spend on that site (or page)
Is there any relationship between ranking in the top 10 on matching search word & phrases and time on site?
People using search engines tend to spend less time on sites as they search around, compared to people going to a site with a direct link like a bookmark, email or newsletter, or a link from a related site.
So I would think the pages ranked highly would get shorter visits. Am I way off base here?
What I'm leading up to is that pages at the top in the serps may actually lose if the serps are based on time on the site.
|This will make it virtually impossible for the little guy to succeed. The top 10 results will be based on the well established sites. |
- Google can (and has, in my opinion) throw lower-ranked listings up towards the top periodically to test whether they are attractive to visitors. This gives little guys a shot at proving they have something relevant that Google users want. (Increases the importance of crafting your SERP listing well -- just like AdWords advertisers get big rewards for crafting their ads so they appeal to users).
- The keyword space is huge. Well-established sites tend to "pig-out" on relatively few keywords that are high-traffic. Pick any hyper-competitive term, and a little guy can still chip away at dozens of 4-word related terms with relative ease. Google always has and always will rely on the "long tail" for their success -- that means they remain sensitive to giving the little guy a shot at some traffic.
- The well-established sites often are structurally unable to go after the long tail -- or even the part of the tail that's just slightly less thick. The lions get the choice steak, but the vultures never go hungry -- and they never have to get into fights with lions.
A question nearer and dearer to me currently is: Does traffic influence Googlebot visitation frequency and, if so, to what degree?
I agree with a lot of what is being said about how Google uses Adsense data. Kinda makes me want to go back to using some of the other advertising options, like I did before Adsense came along.
Might not be as much money, but maybe my sites will be around longer :-)
Traffic patterns lead to a learning system. It would give google better control over what sites are satisfying the needs of the surfers.
eg. Surfer searches for blue widgets.
Clicks on the first result. Doesn't find the info they are looking for. Goes back to Google and clicks on the second result. Finds the information they were looking for and doesn't continue looking through the rest of the results.
Thus Google knows that result 2 was a better match than result 1. If this trend continues they may switch the results.
It is auxiliary to the normal algo, in that the normal algo would provide a starting point and then google would "tweak" results based on user interaction.
This system will help remove spammy sites and increase serp quality overall.
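The blue widgets example can be sketched as a simple "last click wins" tweak on top of the normal ranking. This is only one naive way to read the idea, with invented names and data; a real system would need far more care, above all against fake clicks.

```python
from collections import Counter


def rerank_by_last_click(original_order, sessions):
    """Reorder results by how often each was the LAST result clicked in a session.

    original_order: result URLs as ranked by the normal algo.
    sessions: one list of clicked URLs per search session, in click order.
    The last click is treated as the result that satisfied the surfer.
    """
    satisfied = Counter(session[-1] for session in sessions if session)
    # sorted() is stable, so results with equal counts keep the algo's order.
    return sorted(original_order, key=lambda url: -satisfied[url])


original = ["site1", "site2", "site3"]
sessions = [
    ["site1", "site2"],  # tried result 1, went back, settled on result 2
    ["site1", "site2"],
    ["site2"],
    ["site1"],
]
print(rerank_by_last_click(original, sessions))  # ['site2', 'site1', 'site3']
```

Because the sort only reorders where the click data disagrees with the normal algo, results that nobody clicked stay in their original relative order.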
If giving a shot at traffic to the little guy was true --
I know a niche site, very focused, and totally original nice content. Their visitors are currently all type-in or from MSN and yahoo. No adsense on the site yet.
For some 3 weeks in March, they were in the SERPs. It's a new site, so rankings were nothing great but they were there. Around 300 visitors a day from Google in that period.
Then the pages vanished from Google, and only the homepage remained.
For me, for a niche and new site, the traffic from Google when their pages appeared in SERPS should have at least kept them somewhere in the SERPs, at least deep down in the index.
That doesn't seem to be happening.
One of my own new sites, again with excellent content (it's an experiment, but I wrote 200 articles for it), has been supplemental forever. Once it escaped supp hell and traffic shot up. Then back to supp, and no traffic.
That shot at traffic seems to happen - but with no impact for the future.