| This 33 message thread spans 2 pages |
|Alexa data on Pandalized sites from Google thread|
| 10:48 pm on Nov 17, 2011 (gmt 0)|
I went through the first five pages (200 posts) of the famous Google forum thread "Think you're affected by the recent algorithm?". Using Alexa data, I found 31 sites that were actually hit by Panda. Most of the full recoveries came in September; others recovered temporarily and then crashed again. It was a very strange tour.
The number in parenthesis is the current Alexa US rank.
#1 (2588) recovered late September
#2 (10,948) recovered mid May, fell again late June
#3 (26,451) down and flat
#4 (38,329) down and flat
#5 (56,683) down and flat
#6 (6,884) roller coaster, and jagged
#7 (10,813) up and down, full recovery September
#8 (13,688) down 2010, down Panda, some recovery September
#9 (61,027) down, summer recovery, down worse
#10 (37,401) way down, way up, down (IN rank)
#11 (77,601) down and bumpy, was falling since mid-2010
#12 (9678) down and stayed down
#13 (18,697) down and flat (big 2010 run-up)
#14 (1882) down and still drooping (down since 2010 peak)
#15 (93,976) down, up, flat, not 100% it's Panda
#16 (4809) down, down, big July recovery, par
#17 (755) down, down, drooping
#18 (22,913) down, down, spiky
#19 (46,275) down, down, down
#20 (4,367) down, up, down, flat
#21 (25,842) down, down, drooping
#22 (54,886) down, drooping
#23 (47,232) down, down, flat
#24 (2,439) minor Panda hit and flat after big fall earlier
#25 (34,449) down, down, flat (with seasonal spikes)
#26 (12,817) down, way up, way-way down, flat
#27 (28,013) down, drooping
#28 (54,128) down, down, drooping
#29 (25,21) down, trend up, full recovery late June
#30 (56,389) down, down, flat
#31 (31,800) down, down, flat to 2010 baseline
So there were only four full recoveries in the bunch so far, but that beats none. It seems to me that the penalized sites fell into a small number of subject groups, but these are self-selected reporters to the Google thread. I'll look for reasons as I have time in the coming days.
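For anyone repeating this exercise, tallying the statuses is trivial once they're recorded. A sketch with made-up labels in the style of the list above (not the actual 31-site data):

```python
from collections import Counter

# Illustrative status labels in the style of the list above (not the real data)
statuses = [
    "full recovery", "temporary recovery", "down and flat", "down and flat",
    "down and flat", "roller coaster", "full recovery", "partial recovery",
    "down worse", "drooping", "down and flat", "drooping",
]

counts = Counter(statuses)
for status, n in counts.most_common():
    # Print each outcome with its share of the sample
    print(f"{status}: {n} ({n / len(statuses):.0%})")
```

Keeping the labels consistent is what makes the "four full recoveries out of 31" kind of summary possible at a glance.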
Writing it up with subject area and site descriptions, maybe some webmaster interviews, would make great link bait for somebody who wants SEO traffic. Yuk!
[edited by: tedster at 7:53 am (utc) on Nov 22, 2011]
| 6:49 pm on Nov 18, 2011 (gmt 0)|
Thanks for the research and the report - I'll bet it was a strange trip!
If I were to summarize how I now understand Panda's target, it is aiming to demote pages/sites that are primarily built to rank rather than primarily built to serve their public. These are pages that avoid being outright spam, but they don't really deliver quality either. They are often the sort of "bare minimum" site that has a lot of search engine savvy but in the end leaves you asking "where's the beef?"
|So there were only four full recoveries in the bunch so far [out of 31], but that beats none. |
That feels about right to me. The biggest obstacle most Pandalized sites face is the assumption that they were "wrongly" demoted. Once they accept the fact that this new algorithm component doesn't like their site, they get serious about understanding it, and only then do they stand a chance of recovering their traffic. And that only happens if they change their understanding of what a website needs to be.
Of the sites that I know recovered, there are several different patterns for the fix:
1. Some sites did a very thorough inventory and removed a lot of marginal content. Often these sites also beefed up their quality pages.
2. Some larger dynamic sites used noindex for a pile of stub pages.
3. Some recovered sites fixed serious technical issues - such as major canonical snafus that caused ten times the number of URLs to be indexed as there were actual pages.
4. In one case a major DMCA campaign to remove scraped versions of their content seemed to do the job.
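As a rough illustration of the canonical snafu in point 3 - many indexed URLs collapsing to one actual page once session and tracking parameters are stripped - here is a minimal sketch; the parameter names and URLs are hypothetical:

```python
from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse

# Query parameters that create duplicate URLs without changing the page
# (hypothetical examples -- a real site would audit its own parameters)
TRACKING_PARAMS = {"sessionid", "sort", "ref", "utm_source"}

def canonicalize(url: str) -> str:
    """Strip tracking params and trailing slashes so duplicates collapse to one URL."""
    p = urlparse(url)
    kept = [(k, v) for k, v in parse_qsl(p.query) if k.lower() not in TRACKING_PARAMS]
    return urlunparse((p.scheme, p.netloc.lower(), p.path.rstrip("/") or "/",
                       "", urlencode(kept), ""))

indexed = [
    "http://example.com/widgets?sessionid=abc",
    "http://example.com/widgets/?sort=price",
    "http://example.com/widgets",
]
pages = {canonicalize(u) for u in indexed}
print(f"{len(indexed)} indexed URLs -> {len(pages)} actual page(s)")
```

Comparing the count of indexed URLs against the count of normalized URLs gives a quick estimate of how badly duplication is inflating the index.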
So far I have not run into anyone who recovered by changing their template/layout but nothing else. With all the talk about ads and "above the fold" content this seemed surprising at first. But thinking a bit more about it, even with the layout change the content probably started out shallow in the first place.
For years I've heard Adsense players talking about making their pages "slippery" rather than "sticky". That sounds like a recipe for shallow content to me, no matter how you structure the top of the page visually.
| 7:39 pm on Nov 18, 2011 (gmt 0)|
Do you think they are still rolling out any more Panda updates now, or just tweaking the current algo?
i.e. how long might it take for a site to "recover" these days?
| 8:05 pm on Nov 18, 2011 (gmt 0)|
Looking at it from the other end - at the sites that did not get hit - all I see in our neck of the woods is zero or shallow content in the top 10, often not related to the query, and always the same sites using a template. In fact, I might as well not bother with content..
Given the above, which may be an effort to skim them off, and a few other signals I am getting, I would anticipate a further tweak very soon.
Time to recover - possibly never at the moment.
Nice data Content_ed, thank you for taking the time.
| 4:15 pm on Nov 20, 2011 (gmt 0)|
Having looked at large numbers of sites that got hit, I can see problems with some of them. Of course, I see the same problems, in the same magnitude, with sites that were Panda neutral or benefited.
I don't want to bore everybody to death with protestations about my two Pandalized sites. I'll only say that the professional SEOs who have looked at them just say, "Wow, they really screwed up here."
I've spent a couple hundred hours this year filing DMCA complaints, which gives the minor satisfaction of seeing Google's search results fill up with "A page has been removed" results when I do test searches these days. But one thing I learned from this experience, perhaps too late, is that infringements are far more targeted than I had ever realized.
I kept coming across large numbers of infringements of single pages - in the thousands - created by single individuals using various free blogging platforms and syndication services to try to drive traffic/PageRank to their affiliate sites. Before I studied it, I always thought that infringements were basically one-page-at-a-time things, driven by lazy kids and individual bloggers who wanted attention.
Now that I realize particular pages drive the bulk of the problem, I can concentrate on cleaning up 100% of the infringements on the less popular pages first, which would reduce the percentage of content Google sees as "duplicate" on my site. I'd say that 5% of my pages account for 95% of the infringements.
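If infringement counts really are that concentrated, a cumulative-share pass shows how few pages need attention first. A sketch with hypothetical per-page counts:

```python
# Hypothetical per-page infringement counts (page -> number of scraped copies found)
infringements = {"/top-article": 9500, "/guide": 3100, "/howto": 800,
                 "/faq": 60, "/about": 12, "/contact": 3}

total = sum(infringements.values())
running = 0
priority = []
# Walk pages from most to least infringed, accumulating the share of copies covered
for page, n in sorted(infringements.items(), key=lambda kv: kv[1], reverse=True):
    running += n
    priority.append(page)
    if running / total >= 0.95:  # stop once 95% of copies are accounted for
        break

print(f"{len(priority)} of {len(infringements)} pages cover {running / total:.0%} of infringements")
```

With a distribution this skewed, filing DMCAs for a handful of pages knocks out nearly all of the duplicate-content footprint.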
I don't know how to judge between an update and a tweak, unless it's by the magnitude of the results. I'm sure they know they have problems, but Google's married to the thing because they desperately wanted to depress traffic to some of the largest scammy sites without manual intervention, and the algorithm has done that.
And it really doesn't help that when they solicited feedback in the thread I got the site list from, the majority of the webmasters posting weren't impacted by Panda at all. So if Google simply put all the sites in a spreadsheet and is monitoring the results, they may think they've solved 90% of the problems already. You'd be surprised how many of the sites that reported problems have seen soaring traffic this year, whether because they are still building out or because they benefited from Panda. A lot of them saw a change in their key phrases and freaked out, even though their Google traffic was rising!
As to recovery times, the fastest "permanent" recovery I saw was a single Panda cycle - about a month and a half in that case. In other cases, I saw traffic shoot up again after a week, sometimes way over the initial level, and then come crashing way down at the next major Panda update. Since I'm just looking at Alexa data with no site history, I don't know if that means they did some SEO work that temporarily helped, or if it's just whack Panda.
I did the digging for a specific reason. I commented on a thread last week that I had never seen a documented Panda 1.0 recovery, and I figured I better find out for myself.
Seeing that sites can recover from Panda, I'm more inclined to experiment than I would be if I thought I was just pulling my own chain.
[edited by: tedster at 7:54 am (utc) on Nov 22, 2011]
| 11:46 pm on Nov 20, 2011 (gmt 0)|
Thanks Content_ed - interesting.
I have one of those sites on your list. I've pretty much applied all four of tedster's fixes that worked on other sites. And, as your list shows... no change. The amount of time that has gone in has been huge (many hundreds of hours).
Scraper issues are almost impossible to keep on top of. I've done some very drastic things to the site, but as yet nothing.
One thing I am struggling with is that all referrals from Google seem to be for very long tail keywords, making it much harder to lower bounce. Pre-panda many pages ranked for much "bigger" keywords - making the content of the page more aligned with what the searcher was looking for.
Thanks again for all research. It all helps.
| 1:25 am on Nov 21, 2011 (gmt 0)|
Has anyone not only considered but actually changed the majority of their pages to "noindex"? Just for the sake of it: noindex everything but your main link structure - say, everything below tier 2 - and see what happens on the next Panda roll. There's nothing to lose anyway if you're in no man's land. Are you running a forum that's not so popular? I feel that qualifies as a fat meal for Panda these days - lots of thin pages to drag you down. Same goes for canonical: if Panda has sat on your face already, what's the point of keeping canonical? I have yet to see one instance where the canonical tag is enforced in the SE's eyes.
| 2:00 am on Nov 21, 2011 (gmt 0)|
|Has anyone not only considered but actually changed the majority of their pages to "noindex"? Just for the sake of it: noindex everything but your main link structure - say, everything below tier 2 |
A very popular tactic from what I hear, but this alone will not bring a site back. It needs to be augmented with rapid rebuilding of content and usability improvements.
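For what it's worth, the "everything below tier 2" rule of thumb boils down to a cutoff on URL path depth. A minimal sketch, assuming depth alone is the criterion (the cutoff, URLs and helper name are all illustrative):

```python
from urllib.parse import urlparse

MAX_INDEXED_DEPTH = 2  # tier 2: keep the home page and top-level sections indexable

def robots_meta(url: str) -> str:
    """Return the robots meta content for a URL based on its path depth."""
    depth = len([seg for seg in urlparse(url).path.split("/") if seg])
    return "noindex,follow" if depth > MAX_INDEXED_DEPTH else "index,follow"

for url in ("http://example.com/",
            "http://example.com/forum/",
            "http://example.com/forum/thread/post-12345"):
    print(url, "->", robots_meta(url))
```

Using `noindex,follow` rather than `noindex,nofollow` keeps the deep pages out of the index while still letting crawlers follow the site's link structure.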
Tedster - I've got to say I'm surprised that you've heard of no sites coming back that have adjusted designs. Are you saying in isolation, or as part of the overall "Panda fix" ?
MC's passing remarks to Watson [webmasterworld.com...] about the ease of returning from Panda curiously seem to elude a lot of sites in the Alexa list. Why the conflict - why don't folks just get it, like Matt reportedly said?
| 2:25 am on Nov 21, 2011 (gmt 0)|
Just to remind people, I have a site that recovered from Panda and at the next update, jumped nearly 400% overnight. I did NOTHING to the site that I can remember. Certainly no new content, didn't see any new links in WebmasterTools, no deletion of thin content because there wasn't any, none of the things people take for granted are needed. The site is only a couple dozen pages and it gets around 2K visitors a day from Google now.
| 3:09 am on Nov 21, 2011 (gmt 0)|
|jumped nearly 400% overnight |
@Content_ed - Was that higher than your pre-Panda traffic levels?
You mentioned that the Alexa sites covered a small number of subjects. I was wondering if "similar content" was playing into this a bit? Do you have any thoughts?
| 3:40 am on Nov 21, 2011 (gmt 0)|
|Tedster - I've got to say I'm surprised that you've heard of no sites coming back that have adjusted designs. Are you saying in isolation, or as part of the overall "Panda fix" ? |
|So far I have not run into anyone who recovered by changing their template/layout but nothing else. |
| 4:38 am on Nov 21, 2011 (gmt 0)|
a. Why are old, dated websites with old content ranking so well?
b. Why are scrapers thriving?
c. Why are regularly updated websites being hit?
To me it all looks like "content STICKINESS". I won't go into details, but that seems to hit the nail right where it's supposed to - the flaw of Panda. Forget about content quality (that's for the users, not Panda).
Just a brief overview:
a. Content is already established
b. Temporary gains nullify original and scraper alike
c. Content cannot stick - not enough original exposure
This, topped with authority ranks (don't get me started there), shady slow-mo caching, unpredictable unannounced updates, and misinformation, has led to this tragedy with such an easy solution.
I am certain that if you go deep into these points you will find the light at the end of the tunnel.
I will try to give more details on the way out of the tunnel!
P.S. there is a SERPs update coming, as usual.
| 5:25 am on Nov 21, 2011 (gmt 0)|
|P.S. there is a SERPs update coming as usual . |
Apparently there already was a minor Panda update on Nov 19 - see [webmasterworld.com...]
| 2:55 pm on Nov 21, 2011 (gmt 0)|
That's above pre-Panda levels, i.e., with no site changes, getting nearly 4X as many Google visitors as in early April (this site was hit by Panda 1.1). I did write it up in a thread here a month or two ago.
It's why I'm a little surprised that nobody has seen a site recover on template redesign only. I would have thought somebody redesigning a template would have recovered at random by now - not because of the redesign, but because a Panda update changed the penalty criteria.
| 9:27 pm on Nov 21, 2011 (gmt 0)|
|I would have thought somebody redesigning a template would have recovered at random by now |
For a variety of reasons I would have thought so too - site design would have to be part of the quality signal. That's what I'm being advised from a number of different sources - however, I have to agree that I've seen no measurable evidence of it in isolation. That's why I was surprised.
It's striking how few reports on the finer details we are getting back, given the number of folks affected over the last nine or so months. The Alexa list almost looks like a survey; it makes me feel that a lot of folks have just given up, or that bigger sites represent much more of a challenge to turn around.
| 10:01 pm on Nov 21, 2011 (gmt 0)|
@Whitey - I can only speak for myself. In the first 6 weeks post-Panda 1 in Feb, we threw everything at it. Immediately removing about 10-15% of content (thin). Addressing technical issues (noindex on user profile pages, tag pages, and canonical where appropriate). And going after scrapers.
There was no recovery so we had to leave it alone as it became uneconomic.
However, in the last 3 weeks I've gone back full-time to the site - doing an even more drastic prune of older content, and attempting to improve quality on the remaining work.
It is very difficult work to do on a larger site (thousands of pages).
| 10:06 pm on Nov 21, 2011 (gmt 0)|
In our neck of the woods, the only thing I notice is that the sites doing well have not made any updates to their design for 12 years, and all have a left-side site-wide menu on a 3-column design.
|I would have thought somebody redesigning a template would have recovered at random by now |
For example: red widgets, blue widgets, pink widgets, etc., rather than, on a red widget page: red widget places to buy, red widget info, red widget specifications, red widget sizes... etc.
May be a coincidence... the actual real content is rubbish, if it exists at all.
| 10:09 pm on Nov 21, 2011 (gmt 0)|
I expanded the list out to the first 50 sites; that probably accounts for around a third of the Pandalized sites in Google's 500-complaint spreadsheet, since they included everybody who reported rather than only those who were impacted.
I take the opposite conclusion on big sites. Of the 17 sites that were in the top 10,000 on Alexa (I rounded a little to allow for their pre-recovery loss), 5 recovered fully - around 30%. Only one or two of the 33 lower-ranked sites recovered fully - call it 6% - so big sites were five times as likely to recover!
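The comparison above is just arithmetic on the reported figures, and easy to double-check:

```python
# Figures from the post: 17 sites in Alexa's top 10,000, 5 full recoveries;
# 33 lower-ranked sites with 1-2 full recoveries (take 2 as the upper bound)
big_sites, big_recovered = 17, 5
small_sites, small_recovered = 33, 2

big_rate = big_recovered / big_sites        # about 29%
small_rate = small_recovered / small_sites  # about 6%
print(f"big sites: {big_rate:.0%}, small sites: {small_rate:.0%}, "
      f"ratio: {big_rate / small_rate:.1f}x")
```

Taking one recovery instead of two for the small sites would roughly double the ratio, so "five times as likely" is on the conservative end of the reported range.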
Of course, that doesn't take into account that professional webmasters were more likely to find and report on the Google thread than hobbyists, maybe if I went out to the first 100 sites the ratio would change.
I do suspect that a lot of smaller site owners, like myself, just gave up and didn't do anything for six months. I didn't really start working at it until my hobby site came back on its own in September.
| 6:16 pm on Nov 23, 2011 (gmt 0)|
Very interesting insights...
| 6:24 pm on Nov 23, 2011 (gmt 0)|
You can't tell for sure if a site was pandalized from Alexa data (or even compete.com data, which is more accurate). I have a site that I know for 100% fact was crushed by Panda with no recovery but if you were to look at our Alexa data you would have no idea.
| 9:47 pm on Nov 23, 2011 (gmt 0)|
No third party measurements are anywhere near 100% accurate, but Alexa readings have been the best for sites I know the real traffic for and Compete has been off. Never tried to figure out why. But in any case, the idea is if you look at enough sites, you'll run an average in your head.
| 2:37 am on Nov 24, 2011 (gmt 0)|
I've checked Compete several times on sites I manage and it's always off. Alexa data is 3 mos behind too.
| 5:21 am on Nov 24, 2011 (gmt 0)|
For what it's worth, the most 'accurate' publicly shared traffic numbers that I've come across are from sites using Quantcast. The numbers are somewhat different from Compete, Alexa and the others in most cases. One example: Daniweb recovered/lost their traffic on Panda dates that match up with Quantcast graphs/stats.
| 6:35 am on Nov 24, 2011 (gmt 0)|
I haven't found any intelligence that can be relied upon.
Today I did a check for Google organic search traffic on a well-distributed SEO intelligence tool (sorry, can't mention it here)... one of the best. I thought it was great on a couple of sites I know well - the traffic numbers, dates and other data all corresponded. But as I expanded to other sites I also know well, it started to become wildly inaccurate.
So you may get lucky here and there, but if you want accurate data go and work for Google and get permission to draw on their data. Otherwise, at best it's indicative.
| 10:09 am on Nov 27, 2011 (gmt 0)|
large site = large budget = many eyes on the problem.
It would make sense they would find their problems and have the resources to fix them.
| 2:39 pm on Nov 27, 2011 (gmt 0)|
Large sites, which are generally forums or "community" generated content, pretty much all had issues with thin content and duplicate content that could be cleaned up if there was some genuine value at the core. But large sites that were driven by syndicated garbage or scraped product reviews would be out of luck. On the other hand, small sites that have no thin content or duplicate content can't clean up when there's nothing to clean.
And I think I mentioned in another thread that I didn't do any work on my sites, post-Panda, for nearly six months, until my hobby site recovered on its own. I wouldn't be surprised if many other small sites took the same approach. I just spent tons of time filing DMCA notices.
| 1:45 am on Nov 28, 2011 (gmt 0)|
|I can only speak for myself. In the first 6 weeks post-Panda 1 in Feb, we threw everything at it. Immediately removing about 10-15% of content (thin). |
@synthese - reports are that this is nowhere near strong enough to break Panda. 10-15% isn't exactly throwing much at the problem. I'm seeing well-circulated commentary that ratios as high as 4 spammy pages to 1 good page could be enough to trigger Panda, and getting back is tougher than going under.
Eliminate all spam content and start again. That's why some large sites that depend on aggregated/thin content are in trouble - it takes a radical shift in thinking and resources to sort out. And simply writing unique content alone isn't enough. Don't discount links either - even though Panda is on-page, crap links to crap pages is a matching disaster.
Some of those sites in the Alexa list were likely big sites? (even though I would treat the data as indicative at best) Y/N?
| 8:11 am on Nov 28, 2011 (gmt 0)|
Yeah, someone tell me how to effectively do a DMCA on 23,000+ scrapers (for just one of our pages, alone) who stole our home page text "word for word". One, who accounted for 3,300 links alone, is already shut down, but still indexed on G. Another has sprung up in its place. Looks like G has its work cut out for it. In the meantime too bad G doesn't consider us as desirable as they do.
| 2:26 pm on Nov 28, 2011 (gmt 0)|
The large sites on the Alexa list were much more likely to have recovered than the small sites.
Over the past month or so I've put a huge dent in the scrapers, autobloggers and, most importantly, the stolen and then syndicated pages from my site that had filled up Google's index. In many cases, Google reported well into five figures of full or partial copies of my more popular pages. Feel free to contact me if you want me to look at your site and see if I can suggest anything - just curiosity, I don't sell services :-)
You'd be surprised how fast (relatively speaking) you can clear scrapers out of Google if you use their DMCA Dashboard and fill each complaint with URLs. But keep in mind that getting the URLs removed from the results DOESN'T get them removed from Google's index, unless they are Google URLs, like Blogger. To really clean up, it's much more effective to keep a separate list of Adsense-infringing URLs for each page of yours as you go, and DMCA Adsense in a single batch. The Adsense DMCAs are highly effective at getting sites to remove the pages from the web, because they don't want to get kicked out of Adsense.
And if you're in a big hurry to get the removed pages deindexed, you can also file them with the WT URL removal tool (not the one in the crawl errors section). As long as they come back 404, Google will remove them, but if they've been replaced by an empty template, you're stuck waiting.
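That last caveat - removal only works while the URL actually returns 404 - suggests triaging each URL by status before filing. A minimal sketch of that decision logic, with hypothetical labels (in practice the status codes would come from real HTTP requests against each scraper URL):

```python
def removal_ready(status_code: int, body_is_empty_template: bool = False) -> str:
    """Rough triage: 404/410 pages can be filed for removal; live pages must wait."""
    if status_code in (404, 410):
        return "file with URL removal tool"
    if status_code == 200 and body_is_empty_template:
        return "stuck waiting - page replaced by an empty template"
    return "still live - DMCA first"

print(removal_ready(404))   # gone: safe to request removal
print(removal_ready(200, body_is_empty_template=True))
print(removal_ready(200))   # page still serving content
```

Sorting the URL list into these three buckets before touching the removal tool avoids wasting filings on pages that will just be rejected.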
| 3:06 pm on Nov 28, 2011 (gmt 0)|
Can someone please explain to a real newbie apprentice what Panda is?