| This 105 message thread spans 4 pages: 105 (  2 3 4 ) > > || |
|A Simple Theory about Panda|
Here is a simple theory about Panda:
-- Panda evaluates the "quality" of a website mainly by analyzing user-behavior.
-- By user-behavior, I DO NOT MEAN BOUNCE RATE. Instead, I'm referring to more reliable indicators of quality, such as
----- User bookmarks a page as a favorite.
----- User saves a copy of the page on their hard drive.
----- User prints out a copy of the page.
----- User returns to the same page later.
-- Google mainly uses the Chrome browser to collect this data on user behavior. Tens of millions of people now use Chrome as their main browser. This is enough to allow Google to collect statistically meaningful data. And Chrome enables Google to collect data for ANY WEBSITE.
-- In order to evaluate a site statistically, Panda needs a minimum number of user-behavior data points. Thus, the data must be collected over a period of time. As new data is collected, the oldest data can be discarded, but enough must be kept to enable a meaningful evaluation.
-- At the present time, for some sites Panda could still be using data that was collected as far back as last year, because it is still needed for a statistically-meaningful evaluation. This could explain why people who made big changes to Panda-affected sites still haven't seen any major ranking improvements.
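The rolling-window idea in the theory above can be sketched in a few lines of Python. This is purely an illustration of the reasoning, not a description of how Panda works: the class name, the `min_points` threshold, and the idea of a single "positive signal" flag per visit are all assumptions made up for the example.

```python
from collections import deque


class RollingQualityEstimate:
    """Hypothetical rolling window of user-behavior data points for one site."""

    def __init__(self, min_points=1000):
        # Assumed minimum number of data points needed for a meaningful score.
        self.min_points = min_points
        # (day, positive) pairs, oldest first.
        self.points = deque()

    def add(self, day, positive):
        """Record one data point, e.g. positive=True if the page was bookmarked."""
        self.points.append((day, positive))
        # Discard the oldest data only while enough points remain.
        while len(self.points) > self.min_points:
            self.points.popleft()

    def oldest_day(self):
        """Day of the oldest data point still influencing the score."""
        return self.points[0][0] if self.points else None

    def score(self):
        """Share of positive data points, or None if there is too little data."""
        if len(self.points) < self.min_points:
            return None
        return sum(1 for _, positive in self.points if positive) / len(self.points)
```

In this sketch a low-traffic site keeps old data points in the window for a long time, because new ones arrive too slowly to push them out. That is the mechanism the theory relies on to explain why changes to a site might not show up in rankings for months.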
I'm not sure how this would explain the type of sites that got a boost in many cases: scraper rubbish. How would they ever get a boost under that system?
Maybe it's just a result of the top sites being removed - and then sites that were too "long tail" would have escaped the evaluation net, so they get a side benefit of moving up, rather than getting an intentional boost.
Is anyone watching some of these high ranking junk pages over time - do we know if they retain their ranking or disappear after a relatively short period?
...so according to this theory, if I block Chrome visitors I can make a site "blind" to Google, and G. has to rely on "pre-Panda" signals?
User behavior is a very small factor. User behavior is something that can be faked, so it will never carry much weight. Yes, Google can see certain trends and signals, but each website topic is different, so user behavior will vary greatly depending on the subject.
Google has a lot to go on when judging a site's quality. They always like to look for certain trends that quality websites follow. It is always best practice to model your own sites after high quality, highly reputable websites. For example, here are some signals I would like to see if I were building a search engine and I was evaluating a site's quality:
- Is there a contact us page or a phone number to reach this company?
- Was this site just slapped together overnight and then neglected?
- Is the information on this site outdated?
- Is there an FAQ / help section?
- Is the site easy to navigate? Is the content organized or is it hard to read (small text, intrusive ads etc)?
These are just a very few. Think about if you were building a search engine, what you would like to see about a website that would make you feel like it would be useful to visitors.
Yes, Chrome has a large enough stake in the browser market, but how many people really use the site block extension? I remember it took me a long time just to find the extension, so I imagine a very small percentage of Chrome users actually use it, and I would bet that mostly webmasters / very internet savvy people are using it.
|user behavior is something that can be faked so it will never carry much weight |
Perhaps to a degree - on sites that have fewer daily visitors. But if you have enough data points over a period of time, it will be very difficult to fake it and still keep a natural visitors' graph.
I am not saying Google doesn't use behavior factors... I am just pointing out they play a very small role in ranking calculations.
I suspect the Chrome browser now plays a very big role in the rankings we are now seeing, I don't know much about Android but I wonder if they are using phones to collect data as well. On such a large scale I expect there are some very clear signals that come from quality websites that would be very hard to game.
Brinked, user factors have always played a small role, yes. But they don't now. They play a much larger role today than they did a few months ago. THAT is why Panda is so different and THAT is why there is so much collateral damage.
Aristotle I think history will prove you to be absolutely correct about this. It is exactly what I have been saying and writing about for weeks, and explains so much. As someone already mentioned for Santapaws, it even explains how crappy sites and scraper sites got a boost. The competition was removed because they had enough traffic to generate the data. The scraper sites and crap sites that only got a few hundred visitors a day weren't producing enough data to make much statistical significance. But all of the people who are celebrating their new-found traffic for their scraper sites are going to be in for a bruising the next time Panda runs with the new data showing how much people hate their sites.
This is all just a hypothesis based on strong evidence and educated guesses. Is it likely to be 100% accurate? No way. But I think history will show it to be the closest we're going to get to explaining all of this.
|But all of the people who are celebrating their new-found traffic for their scraper sites are going to be in for a bruising the next time Panda runs with the new data showing how much people hate their sites. |
What's to stop them just making a new site, and repeating every 6 months?
Londrum, because all of the high quality, high traffic sites that have worked to fix their quality issues won't be easily knocked off the top of the hill again by low-quality and scraper sites. A lot of otherwise "good" sites were totally blindsided by this.
But I don't think Google anticipated how widespread the collateral damage, the rise of scraper sites and otherwise crappy websites was going to be - nor how long it would take to gather the data necessary to run this again and fix the problem. I can't pretend to know what's going on in the Googleplex, but if they KNEW what was going to happen and flipped the switch anyway, they deserve to lose their jobs IMO.
|Brinked, user factors have always played a small role, yes. But they don't now. They play a much larger role today than they did a few months ago. THAT is why Panda is so different and THAT is why there is so much collateral damage. |
Can you please provide some sort of evidence that supports this?
To my knowledge, the only recent mention of a user behavior factor comes from Google's Chrome extension, which allows users to block sites from appearing in their SERPs. Cutts stated that even this is only used in high confidence situations.
User behaviour is much more than just using the Chrome extension to block sites. With Chrome they have all sorts of new signals - saving the page to disk, adding it to favourites, typed-in URLs, time on page, time on site.
And it is not just Google Chrome where Google can collect the data. There is CTR from organic SERPs in any browser (weighted by the relative SERP position of the listing), quick clickbacks to SERPs, site preview on SERPs (and then clicking or not clicking through to the listing), etc.
And even if we discount Chrome and also discount Google Analytics (let's say we trust G. when they say they do not use it for signals), there is Google Toolbar, Google Translate, Google Maps, whatever other Google widgets/APIs sites embed in their pages, and this is just the tip of the iceberg. All these *could* count.
I do believe Google used some of user behaviour signals even before Panda, but I think they have dialed up these signals now.
I am well aware of all the possible signals they can use as ranking factors in regards to user behavior. I am asking for any reference that supports this theory that Google has added user behavior factors into its latest algo updates. Since every site is different, behavior from visitors will be different. I have never heard of Google releasing any statements about this. This is purely speculation, and we can all speculate until we are blue in the face, but that does not make it true.
There has to be a reason why you are theorizing user behavior as being a bigger factor than before; I am simply asking for your reasoning behind this theory. Right now it sounds like many are grasping at straws. It's almost as if we are saying "well I do not see anything wrong with my content so it must be because of how my visitors are behaving on my site".
One more thing I should have mentioned: Because the websites that were hit by Panda have lost a lot of traffic, Google now collects new user-behavior data on them more slowly. This means that it will take longer for the new data to replace the old data and overcome its effect.
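The arithmetic behind that last point is simple enough to sketch. The numbers below are invented for illustration, and the function name and the assumption that each day yields a fixed number of usable data points are hypothetical; nothing here reflects known Google figures.

```python
import math


def days_to_refresh(window_size, data_points_per_day):
    """Days needed to fully replace a window of old data points with new ones.

    Both arguments are assumed, illustrative quantities.
    """
    return math.ceil(window_size / data_points_per_day)


# Before the traffic loss: an assumed 500 usable data points per day.
before = days_to_refresh(window_size=10_000, data_points_per_day=500)

# After losing 60% of traffic: 200 usable data points per day.
after = days_to_refresh(window_size=10_000, data_points_per_day=200)

print(before, after)  # 20 50
```

Under these assumptions, a site that loses 60% of its traffic takes two and a half times as long to flush out the old data, which is the delay being described.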
I can give you my reasons. First, user data is a great place to look for signs of quality. But even more, starting last fall Matt Cutts made several comments (including a video) where he advised webmasters not to chase the algorithm but to chase their users instead.
Panda was in development for more than a year - by last fall Google already knew where they were headed.
[edited by: tedster at 8:06 pm (utc) on May 30, 2011]
Panda 2.0 and 2.1 had a major user and possible social component at content's expense. Even sites that had gained from Panda lost heavily, 50%-70%, after those updates.
The winners? A few major sites that sucked traffic from 100+ others. Did they deserve to gain all that traffic? Absolutely not. But the metrics Google used made that happen, and content didn't hold as much weight.
Edit: So it all depends how Google uses it and what weight it gives. We already see certain political sites outranking everyone, including NYT and Wash Post, simply because they have a vocal and loyal membership.
|The winners? A few major sites that sucked traffic from 100+ others. Did they deserve to gain all that traffic? Absolutely not. But the metrics Google used made that happen, and content didn't hold as much weight. |
Are you privy to information the rest of us are not regarding which sites gained and which lost? Or is this statement based on anecdotal evidence?
Nope, I checked my niche on Compete and Quantcast and then read the complaints of the small fish. I posted about this a week or so ago.
Bottom line: it was/is based on 'site', not on individual pages.
[edited by: walkman at 8:33 pm (utc) on May 30, 2011]
I didn't see panda really affect my niche. But I'm in a pretty crappy niche. No real decent user behavior. Does that mean anything?
So you're just talking about one tiny slice of the internet then?
"So you're just talking about one tiny slice of the internet then?"
Yes, I checked about a dozen sites. I don't have the time to check, say millions, since it was done manually.
|I can give you my reasons. First, user data is a great place to look for signs of quality. But even more, starting last fall Matt Cutts made several comments (including a video) where he advised webmasters not to chase the algorithm but to chase their users instead. |
Google has always said to build your site for your users, not for Google. This is nothing new; this is what Google has been saying for years.
User data is a great way to judge a website's usefulness; I just truly do not believe it is a major factor right now. My brother-in-law's site, which I help him maintain, was hit by Panda. Its bounce rate is 12%, with an average time on site of over 8 minutes and an average of 6 pages per visitor. Those are crazy good numbers that would suggest this site is very useful and of very high quality, and yet it was hit by Panda. He is an MMA celebrity and now doesn't even rank number 1 for his own name. When people google his name, I am certain they are looking for his personal site and not his bio, which was scraped from his site.
User behavior as well as social media do play a part, and these signals will likely hold more weight in the future, but I certainly do not see them playing a major role at this point in time.
Reading this and other "it's about user feedback" theories reminds me of watching Tomorrow's World (a future-gazing British television series that ran for years). The trouble with that programme was they always over-estimated how much things would change. They were describing the idealistic outcome, not the realistic one. They'd gloss over the gaps, disregard the awkward reality and go for the grand idea. Of course, in the real world, things are more often than not the same, or just a little bit different in the future. Why? Because change is complicated!
In his opening gambit, Aristotle describes several behaviours that many people (including myself) rarely do; I can't remember the last time I saved a webpage to my hard drive or printed it out! As for bookmarks, yes I do indulge (but only for information searches), whereas my wife (who uses the internet all day, every day) has probably had the same ten for the last five years!
Throw into the mix that Chrome has a small and skewed user base and any statistician worth his salt would be compelled to largely ignore the "data" provided.
Beyond back-button bounce and search again, this idea just isn't credible.
brinked - your brother-in-law's site is a classic example of how wrong Panda is. There is absolutely no one in their right mind who thinks that site should not rank #1 for his name.
It is possible they use Chrome, and statistically it might even be accurate. I remember being taught in market research about TV ratings: fewer than about 0.01% of viewers have a box that sends feedback on who is watching what. But it is the standard deviation that mattered, and the sample was therefore statistically accurate enough.
Therefore I am sure that Chrome has a large enough share to gather such information.
Even so, I do not believe they are using this data to rank sites. I see no evidence of it, at least from my research. Or if they are, it makes such a minor difference it doesn't really matter.
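The TV-ratings point above can be checked with the standard margin-of-error formula for a sampled proportion. This is a minimal sketch assuming a simple random sample, a 95% confidence level (z = 1.96) and the worst-case proportion of 0.5; the function name and the sample sizes are chosen for illustration only and say nothing about what data Google actually has.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate margin of error for a proportion from a simple random sample.

    Uses the normal approximation z * sqrt(p * (1 - p) / n). Population size
    does not appear, which holds when the sample is a small fraction of it.
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


for n in (1_000, 10_000, 1_000_000):
    print(f"n={n:>9,}: +/- {margin_of_error(n) * 100:.2f} percentage points")
```

A sample of 10,000 already gives roughly plus or minus 1 percentage point, whatever fraction of the whole audience it represents. The formula says nothing about bias, though: a large sample that is not representative of all users stays skewed no matter how big it gets.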
We must assume that google is using user behavior. This is something we will never truly understand as this is something that is very complex to study.
There really is no trick to prepare for this. If Google uses this as an indicator, then the only thing a webmaster can do is build a high quality site that users love. The best thing you can do is allow your visitors to offer feedback. You will be surprised by all the different feedback you receive that you never even considered. When we build a site, we build what we think is important and relevant to us, but you can't build a site for yourself; you need to build it for your audience, unless you're OK with being the only traffic to your site.
same in my Eagle comic - everyone had a jet car !
|reminds me of watching Tomorrow's World (a future-gazing British television series that ran for years). |
But... Chrome has a 16-18% market share and, unlike the TV sampling boxes, the users are changing all the time, so there is no control over the demographics as there is with audience sampling.
With 82% not using Chrome it means, if indeed the data from Chrome is used at all, any Chrome data will be a much smaller element of the calculation.
As Google can only know data from their own networks, there is no way they can know, for certain, a site's effect on, or popularity with, users. Unless they collect data from comms companies ( LOL )
Given that, you are down to site linking (internal and external), how the bots read the page, feedback from AdSense/AdWords, Google Analytics, data from APIs like Twitter, hosted blogs, etc.
If it wasn't you wouldn't see scraper sites and "boiler-plate" sites so high as I wouldn't say they were popular with anyone.
|I can't remember the last time I saved a webpage to my harddrive or printed it out?! |
I have StatCounter installed on one of the sites I am looking after. It is not a big site (approx. 3000 pages) and it gets around 1000 visits/day. I do look at how visitors navigate the site and where they came from, and I have seen quite a few page views with c:\Desktop\... etc., which shows me the user has saved the page to disk, but the StatCounter code still kicked in from the saved page.
We are not average searchers, nor do we navigate a site in the same way as Joe Bloggs the Average. And there are thousands of Joe Bloggs to each one of us. I think that assuming the average user searches the web and navigates sites the way we do can lead down the wrong path.
As for Chrome, 16% is a very good sample if the distribution cuts across all types of users proportionally. We should not assume what kind of users install Chrome (here I am not sure if it was meant more or less technically savvy), but my father-in-law, who is a "techie virgin", installed it because the button was there and he was told it is faster, so he just clicked on it, and there will be many more like him.
I never said you were wrong. I took it as you having read something I missed. I was referring more to the user behavior factors and I think you are talking more about social media factors.
They do relate to each other, and Google has said they do use social media signals, and I do not question that. What I am referring to are on-site user behavior signals such as time spent on the site, saving the site to your hard drive, bookmarking it, etc.
There are a lot of good arguments made here. Sometimes we get a little complicated in our theories when it's sometimes best to just keep it simple.
Much appreciated. I'm not sure if I read or saw something you missed or not. The reasons I have for thinking what I think are there. Maybe you missed something; maybe I just interpreted them differently than you do. Either way, I hope Google gets their act together soon because it's getting just plain ridiculous out there.
I don't think it's about just "social media" factors though. I do think that they are focusing more than ever on other user feedback signals, like time spent on site, click backs, bookmarks and site blocking, among many, many other things.
When has a major Google algo update been about a ton of different issues that were totally unrelated? One might have focused on paid links. Another might have focused on giving more importance to Brands, or less importance to EMDs. One might focus on back-link relevance and another on keyword stuffing. Everyone assumed that this one was focusing on content farms, but Google didn't say that (or at least not that I've read or heard) and we all wonder why they didn't kill off eHow.
Google has already told us that this update incorporates user feedback signals. It's right there on their blog: [googlewebmastercentral.blogspot.com...]
Yes, they refer to just one specific type of user feedback. But look at the history of major algo changes and you'll see that they tend to focus on one issue, or group of related issues, at a time:
Which leads me to believe that if they have updated the way they use one user-feedback signal, chances are that other user feedback signals played a role in this update.
I know you didn't say I was "wrong" but it would actually be fine with me if you said "I think you're wrong" or "I disagree". Maybe I AM wrong. I have been before. Many times. Just ask my wife. But this is what I think and I've said why I think it. That's about all I can say on the subject until something else happens.