| 4:03 pm on Jan 8, 2012 (gmt 0)|
I expect that eventually Panda will become more dynamic. Since it's now a third essential leg of the ranking algorithm (relevance + connectedness + quality), having it update manually only every few weeks, or worse, doesn't sound like a Google-style long-term solution to me.
The only areas of the algo that update slowly, as far as I can see, are things like automated taxonomy generation and n-gram identification. Those areas do not need rolling updates because they are global characteristics of the entire web and those don't change in a fast sweeping fashion.
So what we're seeing in some cases are reports that new pages on a Pandalyzed website can sometimes rank well, even though established pages stay demoted. I'm not sure if that means a rolling update, or just a less oppressive application of the Panda score.
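The three-leg model described above could be sketched, purely hypothetically, as a weighted sum where the "quality" (Panda) leg is refreshed only by periodic batch runs instead of at query time. All weights, field names, and scores here are invented for illustration; this is a toy model of the idea, not Google's actual scoring.

```python
# Toy sketch of the "three legs" idea: relevance + connectedness + quality,
# where the quality score is site-level and stale between Panda runs.
# All numbers and names are hypothetical.

def rank_score(page, panda_scores, w_rel=0.5, w_conn=0.3, w_qual=0.2):
    """Combine a per-query relevance score, a link-graph score, and a
    periodically refreshed site-level quality score."""
    quality = panda_scores.get(page["site"], 1.0)  # frozen between Panda runs
    return (w_rel * page["relevance"]
            + w_conn * page["connectedness"]
            + w_qual * quality)

# A Panda "update" would be a batch job that rewrites panda_scores for
# every site at once, then leaves them untouched until the next run.
panda_scores = {"example.com": 0.4}  # a demoted ("Pandalized") site
page = {"site": "example.com", "relevance": 0.9, "connectedness": 0.8}
print(round(rank_score(page, panda_scores), 3))  # → 0.77
```

Under this toy model, a page can be highly relevant and well linked yet still sit below an otherwise weaker competitor, because the stale quality multiplier drags the whole site down until the next batch run.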
| 4:16 pm on Jan 8, 2012 (gmt 0)|
And for me at least there are two tests: what happens after the next Panda update (do these new pages get demoted?), and what happens after creating awareness (links) to these pages - do they move up the SERP ladder?
I'll tell you what... Google was looking for "quality"... at least this webmaster is putting a lot more time and energy into developing "quality" pages.
| 6:50 pm on Jan 8, 2012 (gmt 0)|
If you look at what Panda is doing - analyzing many more factors, each to a 'deeper' level of analysis - you start to realize what a staggering technical feat this is, even for Google. Even with an estimated few-hundred-thousand to one million servers and the largest fiber network, they can only do this analysis by taking a moment-in-time 'snapshot' of their database and performing these calculations.
That's my understanding of why it's run periodically - that, and so it can re-calibrate itself, that 'machine learning' process. To my mind, it's only a matter of time - optimizing the software, optimizing the network to run it - until it's run alongside, or integrated into the 'regular' algo.
| 10:09 pm on Jan 8, 2012 (gmt 0)|
What I sense is that each iteration to date enables a range of thresholds related to new content additions and releasable keywords. Some keywords still remain "Pandalized" as a consequence of a site needing to go higher in its quality score. If a site hasn't risen in the underlying Panda quality score, then these so-called dynamic elements that rank newly produced pages will not rank.
It would appear that this threshold and interim enablement go beyond those individual new pages and extend site-wide.
I'm not sure if this existed before Nov 19, but I'm guessing it's a recent feature.
It will be interesting to see if the next iteration builds around this. It will also be interesting to see if such pages prop up an overall improving quality score, or get overridden and drop. But this is probably individual to each site.
My guess is it won't be this week, maybe later next week that another iteration will come through - but that's pure hunch based on previous patterns.
| 12:43 am on Jan 9, 2012 (gmt 0)|
Looking at the crawl rate over the past few days... something tells me another update is going to happen very soon - wouldn't surprise me if it's sometime this week.
| 12:55 am on Jan 9, 2012 (gmt 0)|
|...a third essential leg of the ranking algorithm (relevance+connectedness+quality)... |
Excellent description. What I've been wondering as I've watched these changes and followed comments on them is whether all factors of the "quality" component, which we call Panda, are being revised with each iteration.
It may well be that they are not... ie, that Google is isolating some factors within "quality" and testing them separately, to keep a better handle on the changes.
| 1:28 am on Jan 9, 2012 (gmt 0)|
|What I sense is that each iteration to date enables a range of thresholds related to new content additions and releasable keywords. |
That's worth consideration, Whitey. Remember when Amit Singhal commented about a later Panda iteration that it went "further into the long tail"? That seems to indicate a strong keyword-driver. "Releasable" keywords and "new keywords" could well play into the picture.
It might be worth it for anyone who is still struggling with Panda, even a little bit, to analyze their actual keyword traffic for patterns over time.
| 2:20 am on Jan 9, 2012 (gmt 0)|
|that Google is isolating some factors within "quality" and testing them separately |
It could be. We've long shared the view that Panda would evolve from its infant state.
So potentially, what we might be seeing is certain elements being dropped from the main iteration - the so-called Panda updates - since they will update more or less dynamically, and gradually these Panda updates will become less significant in terms of overall quality score management.
The speed of indexing of these new pages with new content that is added is the best I've ever seen on these sites.
As a further observation, I also noticed that pages on sites I would term "more heavily Pandalized", or in the early stages of recovery work, show very limited life. The response characteristics are mainly: slow to index, and impossible to rank new pages in isolation - but Google is abundantly aware of the changes going on as they start to build.
@Tedster - did you take those comments by Amit Singhal to mean that this element I'm describing was already in play? I might be behind the times, but I've seen nothing mentioned elsewhere.
| 2:32 am on Jan 9, 2012 (gmt 0)|
No, I took it to mean that Panda was choosing which sites to evaluate according to what keywords they were ranking for. If a site wasn't ranking for keywords below a certain threshold of volume, then Panda didn't touch it. After a while, Panda got pushed a bit deeper into the total keyword universe.
But that was half a year ago - and we weren't noticing anything like rolling updates back then. The idea of certain keywords being "released" from Panda consideration is a new idea to me.
[edited by: tedster at 4:13 am (utc) on Jan 9, 2012]
| 2:47 am on Jan 9, 2012 (gmt 0)|
By "connectedness" do you mean social / linking /brand awareness etc?
| 2:48 am on Jan 9, 2012 (gmt 0)|
@tedster or it was targeting only specific niches
| 4:20 am on Jan 9, 2012 (gmt 0)|
I thought about that, Donna - especially since Google does generate website taxonomies. It did seem they might have targeted certain taxonomies and not others.
However, there probably isn't a lot of practical difference - big traffic sites were going to get checked out either way. And since "content farms" were at least part of Google's motivation, they would go after all kinds of information, and not just ecommerce.
In fact, the early Panda comments from Google were often about how trusted a site's information might be - lots of medical examples, for instance. So coupling that with the explicit mention of "long tail" I lean pretty much toward query volume as the selector for Panda evaluations. From what I see, even heavily Pandalyzed sites can still get rankings and traffic for extremely low volume or long-ish queries.
| 5:17 am on Jan 9, 2012 (gmt 0)|
I have another gut feeling that the main target for Panda has been an index of already-flagged websites, not the entire web in general. It's just a theory, but it might make sense.
For example, a website has been doing well but carried a flag for some OOP or a shady backlink profile, and its ranking factor had already been diminished by a small, not very noticeable amount. This website is flagged and sits, let's say, in the yellow zone - for the sake of naming it, call it a trust zone (green -> yellow -> red).
So this index of websites in the yellow-to-red zones is what has been run through Panda continuously, over and over again, and once you get into the zone you are stuck there.
It just looks and feels like it. Notice that in the top 20 for a specific term, 2-3 websites will bounce in and out on Panda iterations, but the top dogs, even if poorly managed, will stay, as they haven't done anything in ages and probably haven't done anything to trigger a flag - even if their content is crap and it's more than obvious it isn't being weighed against the others.
Panda just does not treat/rank every website equally - that's the reality. I was thinking about authority, but that still makes no sense; it has to be some flag that guides the algo. Think about it for a sec: if Panda was run as they claimed, why would only about 11.8% of queries be affected?! And that was the main introductory update. The entire web should have been shuffled upside down, but no - maybe during their internal testing they saw the real mess, so they decided to affect only certain types of websites.
This is just a theory but it boggles my mind.
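The flagged-index theory above boils down to a simple gating step: only sites already outside the "green" trust zone get re-run through Panda, which would explain why each iteration moves only a slice of the index. Here is a purely hypothetical sketch of that gating logic; the zone names, flag counts, and domains are all invented for illustration.

```python
# Hypothetical sketch of the "flagged index" theory: Panda re-evaluates
# only sites already flagged into the yellow or red trust zone.
# Zones, thresholds, and domains are invented, not Google's.

def trust_zone(flags):
    """Map prior penalty flags (e.g. over-optimization, shady backlinks)
    to a trust zone: green -> yellow -> red."""
    if flags >= 3:
        return "red"
    if flags >= 1:
        return "yellow"
    return "green"

def panda_candidates(sites):
    """Only sites outside the green zone get run through Panda again."""
    return [s["domain"] for s in sites if trust_zone(s["flags"]) != "green"]

sites = [
    {"domain": "cleansite.com", "flags": 0},   # green: Panda skips it
    {"domain": "oop-site.com", "flags": 2},    # yellow: over-optimization flag
    {"domain": "shady-bl.com", "flags": 4},    # red: shady backlink profile
]
print(panda_candidates(sites))  # → ['oop-site.com', 'shady-bl.com']
```

The "stuck in the zone" observation maps to the fact that, in this sketch, a site keeps being re-evaluated every iteration until its flags are cleared, while green-zone sites with equally poor content are never looked at.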
| 6:21 am on Jan 9, 2012 (gmt 0)|
Thanks for that idea, Donna - I'll put it in my stew pot and try to decide if it belongs in the recipe long-term ;)
The theory would be that a wobbly trust history could create a prior weakening that makes a site more susceptible to Panda - have I got that right?
| 6:42 am on Jan 9, 2012 (gmt 0)|
@tedster correct, it should be called the "Wobbly Trust Filter" aka WTF penalty :) Ingenious !
| 7:23 am on Jan 9, 2012 (gmt 0)|
|toward query volume as the selector for Panda evaluations. |
Makes sense... and with query volume as the selector, by default, the competitive nature of a niche will come into play.
Sites in low-search-volume niches will probably have a much easier time breaking through keyword thresholds, if such thresholds exist, than large sites in high-volume markets.
| 9:25 am on Jan 9, 2012 (gmt 0)|
|if Panda was run as they claimed, why would only about 11.8% of queries be affected?! And that was the main introductory update. The entire web should have been shuffled upside down, but no |
That's a good point, but the same result would also be explained if it was done by keyword, starting with the highest volume. They never said which searches those were, but judging by the carnage, and also the intent, you would expect them to be the high-volume ones.
Anyway, it's clear just from the fact that they can predict what percentage of queries are affected that Panda isn't being run over the entire web as part of the general algo - which, as you say, would affect all queries at least to some extent.
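The keyword-volume explanation is easy to picture: if Panda only evaluates queries above some volume threshold, then "11.8% of queries affected" is just the head of a skewed distribution, and each iteration that "goes further into the long tail" amounts to lowering the threshold. The sketch below is purely illustrative - the distribution and threshold numbers are invented, not Google's.

```python
# Hypothetical sketch of the query-volume selector: only queries above a
# volume threshold are in scope for Panda, so the "percent of queries
# affected" falls straight out of where the threshold sits.
# The toy distribution below is invented for illustration.

def affected_fraction(query_volumes, threshold):
    """Fraction of queries whose search volume meets the threshold."""
    in_scope = [v for v in query_volumes if v >= threshold]
    return len(in_scope) / len(query_volumes)

# A skewed toy distribution: a few head terms, many long-tail terms.
volumes = [100000, 50000, 20000] + [500] * 22

print(affected_fraction(volumes, threshold=10000))  # → 0.12 (head terms only)
print(affected_fraction(volumes, threshold=100))    # → 1.0 (deep long tail)
```

With a head-heavy distribution like this, a threshold that catches only the top few terms already yields an "affected" figure in the low double digits, while heavily Pandalized sites could still rank for the untouched long tail - which matches what posters here report seeing.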
| 1:48 pm on Jan 9, 2012 (gmt 0)|
I think Donna's theory makes a lot of sense.
| 2:36 pm on Jan 9, 2012 (gmt 0)|
It's been my observation that these 'Panda runs' are actually adjustments to the parameters of the algorithm, a re-index of content, or very minor modifications.
In my experience, the Panda algorithm indexes content shortly after Google's older algorithms have done their job, and applies its results thereafter.
So I believe it's already a dynamic part of the algorithm that runs inline with the rest of Google's indexing.
| 2:38 pm on Jan 9, 2012 (gmt 0)|
|I have another gut feeling that the main target for Panda has been an index of already-flagged websites, not the entire web in general. It's just a theory, but it might make sense. |
I think this theory has legs - it certainly fits with my experience. Old site + forum. The content (the non-forum stuff) had a selection of fairly poorly written articles, but it all historically ranked really well. It was an old static HTML site that I never got round to upgrading, so no new content had been added in years (since maybe 2006). However, that was only 250-300 pages out of 40,000 (the rest were forum pages, and the forum was pretty active).
Site got hit by the first international Panda roll out - organic referrals down 80% (traffic to the static content) but the forum referrals remained stable.
Not complaining about the traffic loss - the site was badly designed and poorly maintained, so it's fair game. But I think it falls within the remit of your theory in that it was variably useful (some good content from expert authors, IBLs from the likes of Harvard Business School and the BBC, and an active forum) but also carried low-quality signals (poor design, no new main content, some badly written articles, no contact info, some dodgy IBLs, etc). I fully accept that it probably raised a few flags due to high rankings for competitive terms, but realistically I would expect the design and the selection of poor content caused the penalty.
I've just moved the content over to Wordpress, redesigned the entire site and ditched the forums (for the time being - they were too busy for me to maintain). I removed the crap articles (which were basically old school keyword rich SEO copy) and have been adding new (IMO reasonably well written) articles daily. 301'd all the redundant content to new versions or root.
I think ditching the forum could have been a bad move, but from a business point of view it makes sense right now. I'm more interested to see what difference moving the content from the blatantly poorly designed HTML site to a professionally designed WP theme will make. I'm also pushing the site's Twitter presence and interlinking the content more effectively via various WP plugins.
The new site went live only 4 days ago - the new URLs still haven't been indexed (other than new posts).
Will post back with results, but I'm reasonably confident that I addressed the majority of the issues that could have potentially caused the site to be hit by Panda. Whether or not a subsequent update will reflect this or not is another matter.
The site was relatively high traffic - 150k-200k visitors per month (now 30k-50k). Hoping to see at least some recovery.
| 11:23 pm on Jan 9, 2012 (gmt 0)|
@Scott, I'm very interested in hearing your results! Been a lot of people doing similar things. I witnessed a site go from badly mangled by panda to recovery back to their pre-panda levels with the conversion to a new domain name... design etc.
@Tedster, I thought the same thing about the constant references to "medical terms" and "medical trust signals"... my skeptical mind considered it a PR move. I.e. fewer people will listen to the importance/impact of the change if we talk about hobbies; but if we talk about "will you trust medical advice...", and we (Google) say we are providing a "better" SERP as a result of the changes, then we (Google) look more important to humanity.
@Donna - I like your theory, how does that relate to a site like mine that has thousands of indexed pages... and post panda (after losing 70% of our traffic) we still have some fairly decent key terms ranking in the number 1 position? Wouldn't the site be stripped of all top of the SERP rankings if it is in the yellow/red zone?
| 11:40 pm on Jan 9, 2012 (gmt 0)|
@Lenny, you are a good example of how this works: you had/have an Over Optimization Penalty, so you get bounced around on each Panda run. Probably once you fix that, Panda will not disturb you on other iterations.
| 2:56 am on Jan 10, 2012 (gmt 0)|
@Donna, I see - so if you are in the zone, like if you have an OOP of some sort that before didn't really hurt you, you are now a Panda candidate. If you have a clean site that is outside the zone, Panda doesn't apply to you.
| 3:59 am on Jan 10, 2012 (gmt 0)|
We will probably never know, but that's one of the ideas.
| 7:50 pm on Jan 10, 2012 (gmt 0)|
Just the idea of an OOP is kind of ridiculous, hypocritical, and SCARY, IMO. They have an entire department dedicated to helping webmasters "optimize" their websites in tons of different ways, but if you do "certain things" a "certain way" (both of which are UNKNOWN and remain UNKNOWN for the most part - not speaking of blatant BH tactics here), Google now apparently gauges your "intent" and passes judgment that affects your site's POTENTIAL long-term. This is all very slippery to me, as it certainly strays away from Google trying to surface the most useful END RESULTS in the SERPs (i.e. the relative BEST content).
Instead, the focus/filter now seems to judge the webmaster's intent first... so we shouldn't try to produce the best webpage for a specific search phrase, because the search volume threshold for that phrase is high, and therefore the market is competitive, and therefore we might be "over"-optimizing in order to EVENTUALLY get there? Riiiight. That is just frightening to me.
What about the fact that your competitors can build garbage anchored links all day long to your websites now? So, if you have the relative BEST content/webpage for your niche, but you've used the search phrases as your page title and h1 tags, and you have larger amounts of incoming anchored links AND some additional internal anchored links on your homepages...you MUST no longer have the RELATIVE best/most useful webpage then? Because you COULD be manipulating search results...but isn't that basic onpage optimization as well? I genuinely don't understand...
I 100% agree that it exists BTW...just complaining lol.
| 9:58 pm on Jan 10, 2012 (gmt 0)|
@jsherloc > Well said... and good points. I've felt it... but, never had the clarity of mind to put it in words like that.
Hopefully for Google's sake and for everyone else; the threshold to get INTO the mess is higher than a reasonable doubt... ie. you really have to be gaming the system to get caught up....
| 9:30 pm on Jan 12, 2012 (gmt 0)|
I've never seen this before on the Pandalized sites I monitor: new content showing within 1 hour across a range of sites, visible in the results preview image and on exact-match queries.
The cache, the site: operator, and the WMT reporting tool all lag badly.
In November and before, these sites could take 2-6 weeks to reindex.