Forum Moderators: Robert Charlton & goodroi

Why Haven't Sites Come Back from Panda? Matt Cutts Tries to Explain

         

walkman

6:49 am on Jun 8, 2011 (gmt 0)



This is a rush(?) transcript from Danny Sullivan's blog, so probably not everything is 100% correct. The italics and bolding are mine.
[searchengineland.com...]
DS: Talking about Panda, says that he’s getting a ton of emails from people who say that scraper sites are now outranking them after Panda.

MC: A guy on my team working on that issue. A change has been approved that should help with that issue. We’re continuing to iterate on Panda. The algorithm change originated in search quality, not the web spam team.
....
DS: Has it changed enough that some people have recovered? Or is it too soon?

MC: The general rule is to push stuff out and then find additional signals to help differentiate on the spectrum. We haven’t done any pushes that would directly pull things back. We have recomputed data that might have impacted some sites. There’s one change that might affect sites and pull things back.

DS: You guys made this post with 22 questions, but it sounds like you’re saying even if you’ve done that, it wouldn’t have helped yet?

MC: It could help as we recompute data. Matt goes on to say that Panda 2.2 has been approved but hasn’t rolled out yet.

DS: Reads an audience question – is site usability being considered as more of a factor?

MC: Panda isn’t directly targeted at usability, but it’s a key part of making a site that people like. Pay attention to it because it’s a good practice, not because Google says so.

Matt mentions 'pull back' but that's nonsense and very disingenuous of him. Pull back to me means letting previously labeled bad content rank. We're talking about improved sites and content, so there's no need to pull back, just reanalyze it.

So it's clear to me that this is a penalty. Maybe if you got links from every newspaper in the Northern Hemisphere you might escape but for the rest it looks like it depends on Google engineers. It took them 3+ months to admit it.

netmeg

1:04 am on Jun 9, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



(there's no litmus test for senior members - otherwise I wouldn't be one)

But yea. I'm pretty tired of opinions being presented as "clearly" facts as well.

tangor

1:29 am on Jun 9, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



All the fixes mean nothing until one knows what to fix! One or two things that MC said early on made sense (thin, content farm) but all the REST which happened to many webmasters everywhere, did not, does not, and will not make sense until we are (not likely) told what new parameters were incorporated with Panda x.x.

Meanwhile, diversify!

[edited by: tangor at 1:54 am (utc) on Jun 9, 2011]

Leosghost

1:29 am on Jun 9, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You are not alone in that feeling netmeg, not by any means.

AlyssaS

2:04 am on Jun 9, 2011 (gmt 0)

10+ Year Member



All the fixes mean nothing until one knows what to fix!


Well one big clue came from Google on their live webmaster chat - see this article for the report:

[searchengineland.com...]

And the interesting bit was this (it's a response to someone asking why it has not been rolled out internationally):

“There were some characteristics that were more applicable to English-language sites,” Cutts said. The original question came from a viewer in Poland, and Cutts explained that “the link structure of websites in Poland is a lot different” from the link structure of sites in other countries.


Panda seems to me to be entirely about on-page stuff. Therefore "link structure" is referring to internal links. Go and look at some of the Pandalised sites - they have a really heavy link structure.

I don't think it's anything as simple as internal links are bad - more like they are applying ratios of some sort and some sites have tipped onto the wrong side of Panda.

I've been banging on about Hubpages v Squidoo since late Feb [webmasterworld.com] - but though these sites are very similar (both have spam, both user generated content, both have a lot of ads) - the one difference is that Squidoo has a light internal link structure and Hubpages is really really aggressive with their internal linking.

But sorting it out isn't simple. Suppose you remove those links - the internal links might have been the only things supporting some pages, possibly a lot of pages. So you might come back a bit, but not really back to where you were before Panda where your internal link structure was supporting everything. You are going to have to get new external links to those pages which have lost internal link support, or resign yourself to not getting traffic back for those pages at all.

For smaller sites - I think someone reported getting their rankings back simply by removing links to archive pages. And I've heard some reports of people regaining rankings by removing excessive tags. It's possible those archive pages and tags didn't do much anyway, so no loss in removing links to them. But for more complicated sites, it might be another matter altogether.

And of course there are other elements to this algo as well, such as ad-to-content ratio, duplicate content and so on. If you get a black mark against too many elements, you won't recover till you've removed them all.

tedster

2:12 am on Jun 9, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



did not, does not, and will not make sense until we are (not likely) told what new parameters were incorporated with Panda x.x.

Here's how I see it.

The Panda algorithm is based on machine-learning. This means it's a predictive algorithm, assembled by an automated process. It's predictive because it works from a "seed set" that was generated by human judgment. The machine learning program is let loose across a huge pile of factors to discover what data might predict "shallow quality", as defined by the seed set. The prediction will not be accurate in the case of every website, but as the process iterates it does become more and more accurate.

When the machine predictions look good and their results pass some human QA, then those factors it identified, however they are weighted and combined, become the algorithm. This stays in place until that entire process can be re-run and generate a new version of the algorithm, incorporating new factors. As I understand it, that's the "running the data" part of Matt's comments.
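As a rough illustration of the seed-set idea described above, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the feature names (`ad_ratio`, `link_ratio`), the four hand-labelled examples and the choice of logistic regression are hypothetical, and nothing here reflects the signals Google actually collects or how Panda is really built.

```python
import math

# Hypothetical seed set: (features, label). Label 1 means a human rater
# judged the site "shallow"; label 0 means it was judged fine.
# The feature names are invented; the real signals are not public.
SEED_SET = [
    ({"ad_ratio": 0.60, "link_ratio": 0.50}, 1),
    ({"ad_ratio": 0.55, "link_ratio": 0.40}, 1),
    ({"ad_ratio": 0.10, "link_ratio": 0.05}, 0),
    ({"ad_ratio": 0.15, "link_ratio": 0.10}, 0),
]
FEATURES = ["ad_ratio", "link_ratio"]


def predict(weights, bias, features):
    """Return the predicted probability that a site is 'shallow'."""
    z = bias + sum(weights[name] * features[name] for name in FEATURES)
    return 1.0 / (1.0 + math.exp(-z))


def train(seed_set, epochs=2000, rate=0.5):
    """Fit a logistic regression to the seed set with plain gradient steps."""
    weights = {name: 0.0 for name in FEATURES}
    bias = 0.0
    for _ in range(epochs):
        for features, label in seed_set:
            error = predict(weights, bias, features) - label
            for name in FEATURES:
                weights[name] -= rate * error * features[name]
            bias -= rate * error
    return weights, bias


# "Running the data": the weights are fixed once training finishes.
weights, bias = train(SEED_SET)

# Any page crawled later is scored against the same fixed weights.
print(predict(weights, bias, {"ad_ratio": 0.70, "link_ratio": 0.60}))
print(predict(weights, bias, {"ad_ratio": 0.05, "link_ratio": 0.02}))
```

In this toy version the learned weights only change when `train` is run again, while `predict` can be applied to a freshly crawled page at any time. Whether the real system behaves that way is exactly what the thread is trying to work out.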

The full list of parameters at any one time is likely to be even more confusing to the general public than the current situation is. And for a select few people, that list would open the door to gaming the algorithm.

So I'd say you're right - we're not going to get the recipe. If we did, we might be astounded at some of the data that Google is maintaining. I'm sure there's a lot more than we've ever guessed, and much that they collect but have never used before.

supercyberbob

2:37 am on Jun 9, 2011 (gmt 0)

10+ Year Member



Who do I have to pay to get this list of parameters?

tangor

2:41 am on Jun 9, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Who do I have to pay to get this list of parameters?


In the not-so-tinfoil-hat world of biz (and advertising) you might have uncovered the REAL direction of Panda... a pay to play kind of serp... (I know, I rant this from time to time... but that's the only thing that makes SENSE!)

Follow the money...

AlyssaS

3:05 am on Jun 9, 2011 (gmt 0)

10+ Year Member



P.S. Small anecdote about a site that got taken out in Panda 2.1 and came back yesterday. It's a small site, one of my earliest, and I hadn't set my spam traps properly, so it got over-run in the comments (each post with about 160 comments of spam). I zapped them all and expected the site to come back when re-crawled, but nothing happened - till yesterday when the re-gen happened.

Why did it need a panda re-gen to happen rather than a simple re-crawl? All I can think of is a) perhaps panda is looking for certain types of phrases it associates with spam, and once you are marked down it needs a panda re-gen to lift the blackmark or b) the links in the comments triggered the down-grade - i.e. the bot looked at the html and saw too many a tags, it tripped a ratio and down I went. And then got restored on the re-run.
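To make guess (b) concrete, a ratio check of that kind could look something like the Python sketch below. It is purely an assumption for illustration: the class and function names are made up, the metric (anchor tags per 100 words of text) is just one plausible ratio, and there is no evidence that Google uses this measure or any particular threshold.

```python
from html.parser import HTMLParser


class LinkRatioParser(HTMLParser):
    """Counts <a> tags and whitespace-separated words of text in a page."""

    def __init__(self):
        super().__init__()
        self.link_count = 0
        self.word_count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.link_count += 1

    def handle_data(self, data):
        # Counts every text node, including anchor text and any script text.
        self.word_count += len(data.split())


def links_per_100_words(html):
    """Return the number of <a> tags per 100 words of text (hypothetical metric)."""
    parser = LinkRatioParser()
    parser.feed(html)
    parser.close()
    if parser.word_count == 0:
        return 0.0
    return 100.0 * parser.link_count / parser.word_count


clean_post = "<p>one two three four</p><a href='x'>five</a>"
spammed_post = clean_post + "<a href='s1'>buy</a><a href='s2'>now</a>" * 80

print(links_per_100_words(clean_post))
print(links_per_100_words(spammed_post))
```

Running it on a post before and after a wave of comment spam shows how quickly the figure climbs once 160 linked comments are attached to a short article.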

tedster

3:54 am on Jun 9, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



nothing happened - till yesterday when the re-gen happened.

But according to Matt, version 2.2 of Panda is not yet live - right? So something else must be in the mix.

How long ago did you dump all the spam - how many crawls of those pages, roughly at any rate?

AlyssaS

4:14 am on Jun 9, 2011 (gmt 0)

10+ Year Member



According to my log I found and zapped the spam on 19th May (the site had dropped in rankings on 9th May, which is consistent with Panda 2.1). It got crawled a few days later. And I kept re-checking the cache and the cache date kept getting updated.

I think there was a re-gen of sorts starting late on 6th Jun, it's consistent with the serps upheaval reported in many places. But of course I might be completely wrong and the delay in the site getting back was something else.

But zapping the spam was all I changed - I didn't add any content or amend the structure in any way.

plondon

10:11 am on Jun 9, 2011 (gmt 0)

10+ Year Member



Thanks guys, I found this thread extremely reassuring. I've only been an Adsense publisher for 9 months, and in that time I experienced a steady climb in traffic and income. Until early May when I lost about 2/3rds of both in a week. Through May I steadily climbed back to about 2/3rd of my April stats: beginning to average about £13-£15 per day...

Then something happened again in June:
Wed 1st: Page views: 488 - Est Earnings: £11.05 - CTR: 6.15%
Thu 2nd: Page views: 417 - Est Earnings: £16.74 - CTR: 8.63%
Fri 3rd: Page views: 389 - Est Earnings: £11.76 - CTR: 7.71%
Sat 4th: Page views: 308 - Est Earnings: £04.20 - CTR: 5.84%
Sun 5th: Page views: 337 - Est Earnings: £11.51 - CTR: 8.01%
Mon 6th: Page views: 493 - Est Earnings: £28.91 - CTR: 8.11%
Tue 7th: Page views: 378 - Est Earnings: £05.19 - CTR: 4.23%
Wed 8th: Page views: 315 - Est Earnings: £01.46 - CTR: 2.22%

Yesterday was my worst day since I began. Beating even the first week of May. But Monday was my best day ever in earnings, although my best traffic day was around 690 in late April.
(Saturdays always drop. But midweek is traditionally my best time.)
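One way to read the figures above is to work out earnings per 1,000 page views for each day, which separates "less traffic" from "less money per visitor". The Python sketch below does only that arithmetic on the numbers as posted; the variable and function names are made up for the example, and it says nothing about the cause of the swings.

```python
# (day, page views, estimated earnings in GBP) as posted above.
DAYS = [
    ("Wed 1st", 488, 11.05),
    ("Thu 2nd", 417, 16.74),
    ("Fri 3rd", 389, 11.76),
    ("Sat 4th", 308, 4.20),
    ("Sun 5th", 337, 11.51),
    ("Mon 6th", 493, 28.91),
    ("Tue 7th", 378, 5.19),
    ("Wed 8th", 315, 1.46),
]


def earnings_per_thousand(views, earnings):
    """Earnings per 1,000 page views."""
    return 1000.0 * earnings / views


per_day = [(day, earnings_per_thousand(views, earned)) for day, views, earned in DAYS]

for day, value in per_day:
    print(f"{day}: {value:.2f} per 1,000 views")

best_day = max(per_day, key=lambda item: item[1])[0]
worst_day = min(per_day, key=lambda item: item[1])[0]
print("best:", best_day, "worst:", worst_day)
```

Page views on the worst day are only about a third lower than on the best day, while earnings per view fall by far more, so on these numbers the drop looks more like a change in earnings per visitor than a pure traffic loss.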

These stats come from about 10 performing sites, with between 10-50 pages on each. All domains about 7 months old. All original content. Mainly US audience.

All I've done since beginning of May is begin adding content more regularly, about every 5 days. And continued link building low-medium PR sites through commenting, and article marketing. Plus I upped my RSS feed distribution. Nothing particularly 'shady' and certainly nothing that would raise any flags. These are small sites, small niches and I'm a one-man show working from home.

Anyone make any sense of this?
Why is the Panda picking on me?

rlange

2:34 pm on Jun 9, 2011 (gmt 0)

10+ Year Member



AlyssaS wrote:
I think there was a re-gen of sorts starting late on 6th Jun, it's consistent with the serps upheaval reported in many places. But of course I might be completely wrong and the delay in the site getting back was something else.

Something happened, definitely. My company's main site saw a significant increase in traffic starting Monday (June 6th).

The odd thing, though, is that no changes were made to the site and the new traffic is all going to a different section of the site than the traffic that was lost.

If I had made changes I'd agree that it was a simple re-run of the algorithm, but since I didn't change anything I can only think of three possibilities: 1) Google changed something on their end, 2) some external signal for our site changed dramatically, or 3) the algorithm can produce [significantly] different results each time it's run.

--
Ryan

HuskyPup

2:50 pm on Jun 9, 2011 (gmt 0)



Panda seems to me to be entirely about on-page stuff.


Insofar as my sites are concerned I am seeing the same literally from the titlebar, meta description and all the way down the page with h1, image alt and title etc.

I'm adding extra text content IF available but otherwise tightening up all my own SEO stuff, and if it means generating a new page because the Panda does not understand that keyword1keyword2 for me is the same as keyword2keyword1, then so be it.

So far this is working well in the UK SERPs.

Planet13

3:21 pm on Jun 9, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



And continued link building low-medium PR sites through commenting, and article marketing.



Anyone make any sense of this?
Why is the Panda picking on me?


The general consensus is that the types of links you are building have seen reduced value, and will continue to see declining value from google as it moves forward.

Also, since these kinds of links are relatively easy to obtain, it is VERY likely that your competitors are all doing the same type of link building.

Planet13

3:32 pm on Jun 9, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



@ tedster

This stays in place until that entire process can be re-run and generate a new version of the algorithm, incorporating new factors.


But according to Matt, version 2.2 of Panda is not yet live - right? So something else must be in the mix.


I apologize if I misinterpreted your comments. I am a little confused by this, so maybe you can clarify a bit.

Are you saying that one's rankings are effectively "locked" until each new version of the algorithm is made (i.e., each time Panda is "run")?

Are you saying that sites can't have their rankings reassessed under Panda by simply being re-crawled / re-calculated under the CURRENT version of the Panda alg?

While I can understand your points about the learning nature of the alg requiring it to be re-run (so that the algorithm - not the rankings - can be adjusted), I would still think that ranking adjustments would take place before it is re-run.

tedster

3:40 pm on Jun 9, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I am saying the FACTORS that the algorithm uses are locked in, not individual rankings. There's plenty of evidence that websites can change their rankings, even during the same version of Panda.

In the case that AlyssaS described, it sounds to me like some non-Panda part of the total algorithm was at work.

Shaddows

3:41 pm on Jun 9, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Think of it in 3 stages.
1) Definition of the criteria that are being considered. This constitutes the Panda version. It is defined by engineers. Example: Keyword density (AFAIK, KW density is NOT a Panda factor)

2) Current "Understanding" of criteria. This is machine learnt, iteratively. There are several iterations between version updates. It is defined by seed sets, and refined by world data. Example "Optimum KWD = 12%, ranking points allocated on a power curve" (12% is NOT a good KWD, and a power curve is NOT a good distribution curve for this data)

3) Application to crawl data. This is updated on the fly like regular data. Example "This site has a KWD of 15%, so gets 10 points"
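The three stages above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the single criterion (keyword density), the 12% optimum, the `max_points` and `falloff` values and the shape of the scoring curve are all placeholders in the spirit of the deliberately unrealistic examples above, not anything known about Panda.

```python
# Stage 1: the criteria considered. Fixed for a given version, set by engineers.
# (Keyword density is used only because the examples above use it.)
CRITERIA = ("keyword_density",)

# Stage 2: the current learned "understanding" of each criterion.
# In this model it is refreshed by a separate learning run, not by crawling.
learned = {
    "keyword_density": {"optimum": 0.12, "max_points": 20.0, "falloff": 2.0},
}


def score_page(crawl_data, criteria, learned_params):
    """Stage 3: apply the current understanding to freshly crawled data."""
    total = 0.0
    for name in criteria:
        params = learned_params[name]
        # Relative distance from the learned optimum, clipped so points never go negative.
        distance = abs(crawl_data[name] - params["optimum"]) / params["optimum"]
        closeness = max(0.0, 1.0 - distance)
        total += params["max_points"] * closeness ** params["falloff"]
    return total


# A re-crawl changes the input to stage 3 without touching stages 1 or 2.
before = score_page({"keyword_density": 0.15}, CRITERIA, learned)
after = score_page({"keyword_density": 0.12}, CRITERIA, learned)
print(before, after)
```

In this model a site's score moves as soon as its crawl data changes, while `CRITERIA` and `learned` stay put until the next version or iteration replaces them.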

Edit for clarification

freejung

4:10 pm on Jun 9, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



3) Application to crawl data. This is updated on the fly like regular data.

That makes sense conceptually, but I'm not sure it explains what we're seeing. Under the normal algo (which is static until manually updated, presumably) if you make changes to your site that would impact your ranking, you see changes in ranking quickly (admittedly you don't normally see the full effect immediately, but you generally see some effect).

Now people are reporting no change in ranking even after making radical changes to their sites, changes that should have had some effect even under the old algo. Why? I don't think we've explained that.

Under the theory you propose, if you do nothing to your site your ranking will only change when the algo is re-gen'd. But your ranking would change, under your theory, if you make significant changes to your site that impact the factors Panda is measuring. If that's the case, why do so many people report that their Pandalized rankings are remarkably stable even after making radical changes?

Maybe we're just not changing the right things.

HuskyPup

4:23 pm on Jun 9, 2011 (gmt 0)



Maybe we're just not changing the right things.


From what I am seeing, yes, that would seem correct. I have been very clinical about my changes: I've analysed in depth every page I've changed and whether I felt it would be better from an informational and SEO perspective. Whether it would work or even be practical for other sites with thousands of pages I have no idea, but it is working for a couple of my small sites right now.

tedster

4:40 pm on Jun 9, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Under the normal algo (which is static until manually updated, presumably)

My understanding is that Google's entire algorithm is now dependent on machine learning. Human quality raters give feedback on various SERPs and that creates seed sets of less than optimal rankings. Then the machine-learning program ranges over the data signals that Google collects to learn how those factors could be applied to create a more optimal SERP. When the new machine-proposed algo passes some quality checks, it becomes live.

why do so many people report that their Pandalized rankings are remarkably stable even after making radical changes?

And that is a mystery - although many people have reported incremental gains in traffic during those interim periods.

The challenge we face is being limited to anecdotal reports, for the most part. It's hard to arrive at anything rigorous that way. I know I certainly don't have a big pile of Pandalyzed sites to work on in a scientific way.