|Analyze Panda Losers That Don't Fit The Mold|
So we've had two iterations of Panda now, and with each iteration has come a published list of the biggest losers. We all know, if we're honest, that a lot of the losers on those lists deserved to lose and lost for obvious reasons.
The point of this thread is to pick out the sites from those lists which DO NOT fit that mold, sites where it's not obvious why they lost, and figure out why they were hit.
In doing so, maybe we'll understand why Panda has hit so many here who don't seem to deserve it either. Here's the list of sites to discuss. I suggest we go down the list one at a time, with each of us listing the reasons we think each site might have been Pandalized. Once we think we've come up with an explanation for that site, we check it off and move on to the next one:
@brinked "google does not use bounce rate, I have stated many times I wish they did!"
How do you know this & that it is definitively true for Panda? No one knows what factors are used in the new algo to determine ratings in the SERPS.
Are you an inside Google authority? :)
A thought about ads "above the fold"...
I do see many Pandalized sites with ads at the top of every page. But I have another idea that I am giving thought to...We have mentioned the top navigation being a commonality among some of the Pandalized sites we have examined. On my Pandalized site, according to WMT, the hardest hit pages (dropped -200 to -400 positions) happen to be the pages in my top nav...they were also the thinnest (before I added content). They have affiliate links. I wonder if thin pages (with ads) are being classified as ads, and if a top navigation that links to them would be deemed one big ad above the fold? My 5 hardest hit pages are all linked in the top nav, and all have affiliate links. Never mind the other 100+ content pages, those 5 navigational/thin/affiliate pages brought the whole site down.
So...top navigation could be deemed a large ad if they link to thin pages with ads? It's reasonable to have a top navigation that links to thin category pages (which may have ads) or money-making pages (because we want people to go to our money-making pages). This may be posing a problem, algorithmically.
[edited by: crobb305 at 8:31 pm (utc) on Apr 16, 2011]
@SouthAmericaLiving - It would be very very easy to outsource (for very cheap) bounce rate attacks on competitors. I'm sure Google knows this, and while they may look at bounce rate, I'm sure it would never be a major factor.
|Daniweb... ...The home page shows almost no content above the fold for a 1240 width screen. Not so when you reach the actual forum threads. |
I'm not seeing much content above the fold on a 1024 x 1280 screen for the main forum pages either... at least not with a couple of toolbars and the browser window not completely full screen.
In the Daniweb "Windows Software Forum", e.g., with just two large ads at the top, I'm only seeing one forum thread peeking up above the fold... and the third thread is an ad disguised as a thread as well... and all are designed to pull my clicks away.
Beyond this, several factors...
There's a global mega-menu up at the top (with some nested menus as well) that's simply got to be hugely confusing to Google. There are also probably 100 tag cloud links at the bottom of the page, and for the most part they're pretty thin pages.
Even assuming that Google has implemented a Reasonable Surfer Model, with all those links distracting from the main content, the 30-40 content threads displayed probably don't get an adequate share of link-juice/PageRank, and it's probably impossible to prioritize link-juice distribution within the site.
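For what it's worth, here is a rough back-of-the-envelope sketch of that dilution argument, in Python. It assumes the classic even-split PageRank model (every link on a page gets an equal share), which is exactly what a Reasonable Surfer model would NOT do, so treat it as a worst case and not as a description of what Google actually computes. The 100 tag links and the 30-40 threads come from the page as described above; the 150 menu links is a made-up number for illustration, since I haven't counted them.

```python
def thread_link_share(thread_links: int, tag_links: int, menu_links: int) -> float:
    """Fraction of a page's outgoing link weight that reaches thread links,
    assuming every link on the page is weighted equally (classic PageRank)."""
    total_links = thread_links + tag_links + menu_links
    if total_links == 0:
        raise ValueError("page has no links")
    return thread_links / total_links


# 35 = midpoint of the 30-40 threads listed; 100 = tag cloud links (both from the post).
# 150 menu links is a hypothetical figure, not a measured one.
share = thread_link_share(thread_links=35, tag_links=100, menu_links=150)
print(f"Threads receive about {share:.0%} of the page's link weight")
```

With those assumed counts the threads end up with roughly an eighth of the link weight. Plug in real link counts for any page you want to compare.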
Also, on the thread links there's a "feature" that I haven't seen on any other sites... If you mouse over the thread links, the title attribute displays a huge snippet from the first post of the thread. I could argue this two ways...
a) that it might help users by "previewing" the thread for them, without a click, thus improving the user experience...
b) or that by reducing click-throughs to threads (if that's what's happening) it might be sending a signal to Google that users aren't finding what they want on the page. There have been analogous discussions about Google Instant's Web Previews.
The thread titles apparently aren't well moderated or cleaned up... and semantically they are not helping Google classify the page. I see similar problems on many of the user-generated content sites that have taken a hit.
It's not that simple. Instead of merely thinking bounce rate, think about something more complex, like what other data Google has about the searcher and the query, and what the searcher does after the bounce.
|@SouthAmericaLiving - It would be very very easy to outsource (for very cheap) bounce rate attacks on competitors. I'm sure Google knows this, and while they may look at bounce rate, I'm sure it would never be a major factor. |
Are we talking about bounce rate calculated by Google site analysis stats or 'bounce rate attacks' (that I admittedly know nothing about but can pretty much guess are not the same thing)?
Hypothetically what I would do, if bounce rate was a factor and I wanted to hurt my competition, is pay someone to send lots of fake traffic to a competitor, and make all of them bounce. If they had 3 million visits a month, I would try to have someone send 3 million fake visits a month. You would have to make sure that the people getting traffic for you aren't doing type-in visits; they would have to send queries to Google, locate the site, click it, bounce, go to my site, wait 30 seconds, close. I don't think it would be hard to figure out, and that is why I don't think bounce rate will ever be a major factor.
@RobertCharlton Another thing I notice about Daniweb is that they have all of their user pages indexed, and aren't nofollowing links as they should.
Just look at the content to ad ratio there. Not much unique content and the page is indexed. I believe Google would see this as a low-quality page.
According to their "Community Activity" they have roughly 33k threads, 1.5m posts, and roughly 900k user pages.
*EDIT* math failure* Over 30% of their indexed pages are low quality - and it's affecting the whole site.
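Here is a quick Python sketch of one way to arrive at an "over 30%" figure from the "Community Activity" numbers. The big assumption (mine, and possibly not the one intended above) is that every thread, post and user page counts as one indexed page, and that only the user pages are low quality. Posts usually aren't separate pages, so the real share could be very different.

```python
def low_quality_share(low_quality_pages: int, other_pages: int) -> float:
    """Percentage of all pages that are low quality."""
    total = low_quality_pages + other_pages
    if total == 0:
        raise ValueError("no pages to compare")
    return 100.0 * low_quality_pages / total


# Rough figures quoted from the site's "Community Activity" stats.
threads = 33_000
posts = 1_500_000
user_pages = 900_000

# Assumption: threads and posts are the "good" pages, user pages the low-quality ones.
daniweb_share = low_quality_share(user_pages, threads + posts)
print(f"User pages as a share of all pages: {daniweb_share:.1f}%")
```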
Just my 2 cents
interesting read: [guardian.co.uk...]
[edited by: Nano at 9:09 pm (utc) on Apr 16, 2011]
I don't think Daniweb was hit because of ads. There aren't many. No Adsense.
Personally, I think the content was shallow. It is a very plain-Jane looking website. The content is well above the fold.
|So...top navigation could be deemed a large ad if they link to thin pages with ads? |
Or the top navigation pushes the content lower.
PRNewswire has a subdomain with 155,000 pages indexed just like this:
In the actual news release part of the site, there are 626,000 pages indexed. That's a pretty intense ratio of actual content to scraped crap.
Another thing is they don't nofollow their outbound links. People use their feeds for syndication as well, so I have a feeling tons of these pages are outranked by other sites with the same content.
Blogcritics.org has a problem with thin content too. Out of about 210,000 pages indexed, 105,000 of them are tag pages. Their ads are iffy too, with two units between the heading and content. Then there are the dofollow links.
Technorati has about 49,400 actual articles indexed, but the rest is pretty thin. There's a page for each blog they track, which is pretty much a dofollow link to the blog and an RSS feed. So that's 138,000 unremarkable pages.
It's also a tag-crazy site. They noindex most of their tag pages now, but a few thousand are still showing up in Google just waiting to totally drop out. I don't doubt they did this because of Panda.
They've also got a whole lot of pages like this indexed:
If you query those in Google it will tell you something between 300,000 and 1,000,000 but most of them are supplemental. Yuck!
With Daniweb they just noindexed their tag pages too. There's only about 300 still showing up in Google.
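To put the indexed-page counts above side by side, here is a small Python sketch. It only uses the numbers quoted in this post, and it assumes that "thin" means the PRNewswire subdomain pages, the Blogcritics tag pages and the Technorati per-blog pages, compared against only the other pages I counted. The function and dictionary names are just for illustration, and the other Technorati pages (tags and so on) are left out because I don't have a solid count.

```python
def thin_share(thin_pages: int, other_pages: int) -> float:
    """Percentage of the counted indexed pages that are thin."""
    total = thin_pages + other_pages
    if total == 0:
        raise ValueError("no pages counted")
    return 100.0 * thin_pages / total


# (thin pages, other indexed pages), all taken from the counts quoted above.
sites = {
    "PRNewswire": (155_000, 626_000),             # subdomain pages vs. news releases
    "Blogcritics": (105_000, 210_000 - 105_000),  # tag pages vs. the rest
    "Technorati": (138_000, 49_400),              # per-blog pages vs. articles
}

shares = {name: thin_share(thin, other) for name, (thin, other) in sites.items()}
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}% thin")
```

On those counts Technorati comes out worst and PRNewswire best, though the site: numbers Google reports are only estimates to begin with.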
It's worth noting that they, as all the others do, fit the mold of not labeling any of their advertisements clearly as "advertisements". Something which all the sites on the "winners" list do.
Also they're almost all user generated content, and there's been a lot of talk about Google intentionally penalizing sites based on user generated content in favor of promoting brands and expert opinions more. You won't see ANY user generated content sites in any of the winner lists.
But that's a shame, because like most of the sites we've looked at so far this seems like a really good site. It's actually really easy to read, the community seems great.
I'm not seeing as big an issue with content above the fold as you guys, I'm seeing about 30% of the bottom area above the fold containing content when you click into individual forum threads.
One big thing is, again like almost all of these sites, they use a big dropdown menu which causes them to have a LOT of internal links on their pages. I really think that's hurting them.
[edited by: Shatner at 9:28 pm (utc) on Apr 16, 2011]
>> With Daniweb they just noindexed their tag pages too.
Yeah I think I saw a post from them in the Google webmaster forum saying that was one of their post-Panda actions. I think I saw Cinemablend and Technorati saying they'd done the same as well. Probably a good move on their part.
Some sites that rely on user generated content did really well, but I don't think it's a coincidence that it's the ones that aren't really useful for link building or driving traffic.
tedster, I really do not think Panda has anything to do with the content on the homepage. To me, this is more about the inner content pages and judging the site as a whole and not on a page by page basis.
These sites were not only hit because they are not labeling their ads. These sites were hit because Google analyzed these sites as a whole, ran them through their formula (or, as we like to call it, algorithm) and decided that these sites were not up to their standards.
I am noticing that only websites with a lot of pages are being hit. My low quality websites with a handful of pages were not hit. I strongly believe a lot of them should have been, but I am thankful they weren't. They are poorly written, bounce rate on them is about 50%, and there are ads everywhere.
If I had to make an educated guess, I would say that Google is calculating how much space the ads take up versus how much space the content takes up. Where are the ads located? The top left column is a known AdSense position for MFA-type sites; the ads look like menu links so people click. I noticed this on Cinemablend.
How many pages have ads that have little unique content? Are they running ads on user profile pages, secondary pages that provide little value to the every day reader?
How many ad units are they running? Are they trying to blend the ads in with the content so as to confuse the reader? Daniweb sure is. I was confused, and I am a very experienced web browser fully capable of telling ads from content.
As for bounce rate, I am not going to touch that subject. The bottom line is that if you have good, interesting content, your bounce rate will be reasonable. A high bounce rate doesn't mean a page is bad. If I want to know what PMI means, I google "what is PMI?", click on the first result, glance at it for 2 seconds, see that it says private mortgage insurance, and click off. I got what I wanted, but I am counted as a bounce. Is that fair?
Google wants us to think that they are a lot smarter than they actually are. Google wants us to believe that they have some magical algorithm that can detect if content is high quality or not. You don't have to have an essay written by someone with a PhD in order to have quality unique content. Text is not the only kind of content on the web.
Find me a site that was hit by Panda that does not have ads. There are so many websites in the world that don't run any ads; find me one that was hit by Panda.
|Text is not the only kind of content on the web. |
This is true. Informational graphics are very useful. In fact, as a searcher, I really prefer to see bullet or graphical summaries. Long text documents where I have to use Ctrl-F to search for the related term or have highlighting turned on are often a back-button tap for me. I want the answer to the query immediately, and long/wordy articles are often a turn-off for me.
|With Daniweb they just noindexed their tag pages too. |
That doesn't address the structural issue, which is that they are sending a large part of their link juice into a black hole.
Cinemablend must be reading my posts. They removed the top left column AdSense ads and labeled all their ads. Good for them, should be a step in the right direction in helping them recover.
|Billy you're doing that calculation on interior, content pages not index pages right? |
Looking at PopCrunch.com
HTML Area = 900 units
Content = 110 units
% Content = 12.2%
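For anyone who wants to repeat that calculation on their own pages, here is a minimal Python sketch. The "units" are whatever you measure the page in (grid squares, pixels, anything consistent), and the function name is just mine. This is only the arithmetic behind the figure above, not a claim about how Google measures anything.

```python
def content_share(content_area: float, html_area: float) -> float:
    """Percentage of the rendered page area taken up by actual content."""
    if html_area <= 0:
        raise ValueError("html_area must be positive")
    return 100.0 * content_area / html_area


# PopCrunch.com figures from above: 110 units of content in a 900 unit page.
popcrunch = content_share(content_area=110, html_area=900)
print(f"% Content = {popcrunch:.1f}%")  # prints "% Content = 12.2%"
```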
If you look at an article from Technorati you are overcome with ads. Ads pushing the page down, ads on the right, ads above the content.
Here is a link showing what the average user sees: [img231.imageshack.us...]
I have dulled out the content so as to highlight the ads. As you can see there are 4 prominent blocks of ads for only a paragraph and a half worth of useful content. There are a total of 9 ad blocks (if you count the link ads in the actual content) on that page. Ratio is way too high IMO.
[edited by: brinked at 10:26 pm (utc) on Apr 16, 2011]
@brinked Nano said he was emailing the sites we discussed. Maybe he told them or pointed them here?
>>>How many pages have ads that have little unique content?
But if you look at these "don't fit the mold" sites we're analyzing here, I don't really see any pages that have "little unique content". They are all pretty content-rich.
Reminder: Let's avoid broad statements and stick to applying things to these specific sites.
>>If you look at an article from technorati you are overcome with ads. Ads pushing the page down, ads on the right, ads above the content.
So are you saying maybe those ads there in your Technorati screenshot would be fine... if they weren't all crammed above the fold?
For instance they could move both of those Adsense ads to the bottom.
>>>That doesn't address the structural issue, which is that they are sending a large part of their link juice into a black hole.
So you're saying they should also remove the links to the tags from their pages? Yeah that makes a lot of sense. Noindexing them was the right first step but now they're linking to noindex pages which is a waste.
Tags are useful for some things, maybe they could be used only in specific instances or something rather than as a default.
>>>Text is not the only kind of content on the web
I wonder if Google Panda understands this? Because some of the sites we're examining here have legitimate, all image pages or video pages. Does Google see those as thin content? Obviously it shouldn't but... ?
Could Panda be mistaking pages like that as thin content?
@Shatner, definitely valid content, but on that Technorati page, a giant ad popped in at the top and scrolled the whole page down (at least for me on my first page view -- the second time I went to the page, it didn't do that).
[edited by: crobb305 at 10:26 pm (utc) on Apr 16, 2011]
Excellent discussion on this thread Shatner + all
Many of the latest comments and suggestions reminded me of Google arbitrage guidelines from a couple of years ago. Maybe they circulated here too about that time?
They were intended to define the minimum requirements for landing pages from AdWords for websites involved in arbitrage (nowadays read MFA?); following the guidelines would help avoid the dismal quality score that arbitrage sites attracted.
I wonder if a similar approach has moved from PPC to organic landing pages.
The main points centered around
1. primary content had to be the main focus of the page
2. limit on % ads above the fold on 1024x768
3. definition of content above the fold excluded nav bars, search boxes, unnecessary whitespace...
4. no deceptive methods to entice clicks on ads (images, 'recommended links' etc).
5. landing pages had to have significant links that allowed users to click away from the page without activating an advert
And the item that really reminded me of Panda
6. for advertisers with large websites, the majority of the pages needed to meet the requirements (not just the landing pages) = site level ban not page level ban
As this thread has evolved I also agree that point 4 may have taken a step forward, making the marking of ads as 'advertisements' (especially if deceptively placed inside articles) now mandatory.
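Point 2 in that list is easy enough to check on your own pages. Here is a small Python sketch of the idea. I don't remember the actual percentage limit in the guidelines, so the code only computes the share and doesn't test against any threshold. It assumes the ads sit entirely above the fold and don't overlap, and the two ad sizes in the example (a 728x90 leaderboard and a 300x250 rectangle) are just an illustration, not taken from any of the sites discussed here.

```python
from typing import Iterable, Tuple


def ad_share_above_fold(ads: Iterable[Tuple[int, int]],
                        viewport: Tuple[int, int] = (1024, 768)) -> float:
    """Percentage of the first screen covered by ads.

    ads      -- (width, height) in pixels of each ad fully above the fold
    viewport -- (width, height) of the screen, 1024x768 as in the guidelines
    """
    view_width, view_height = viewport
    ad_area = sum(width * height for width, height in ads)
    return 100.0 * ad_area / (view_width * view_height)


# Hypothetical page with one leaderboard and one medium rectangle above the fold.
example = ad_share_above_fold([(728, 90), (300, 250)])
print(f"Ads cover {example:.1f}% of the first screen")
```

If you wanted to follow point 3 as well, you would shrink the viewport figure by the area of the nav bars, search boxes and whitespace before dividing.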
Shatner, I am saying we are seeing a pattern here. Most of these sites have too many ads in and around the content. Blogcritics is very similar to Technorati:
In that screenshot you can see a whopping 6 ads: 2 banner ads, 2 AdSense ads, 1 left column ad in a position known for MFA sites, and an Amazon affiliate ad. That page also has a total of 9 ad units, much like Technorati.
@crobb But are we sure Google knows that's valid content?
One thing about the winners list, they don't have a lot of content like that.
Plus technorati also uses links in the body. If you highlight the links, they say "Shopping links add by skimlinks" [skimlinks.com...]
See the word iPad 2 that is linked 3 times in that article? The author has no control over this; it's automatically added by Technorati and they earn affiliate commissions on it. How do I know? I was a Technorati writer.
[edited by: Nano at 10:29 pm (utc) on Apr 16, 2011]
@brinked So we're back to the number of ads theory?
Compare to the winners list. Do you see any there which have a similar number of ads? I do. I definitely do in the list of similar yet unaffected sites too.
To me the difference isn't the NUMBER of ads, but where they're used, how they're used on the page.
The winners all have them labeled as "Advertisements" and they have them spread out throughout the page instead of crammed above the fold like your example.
[edited by: Shatner at 10:28 pm (utc) on Apr 16, 2011]
|@crobb But are we sure Google knows that's valid content? |
Definitely unsure (see my post above about infographics and bullet summaries...the type of things many users prefer to see over long text documents). If Google is classifying everything that isn't in paragraph form and of a certain word count as "thin", they are way off the mark.
[edited by: crobb305 at 10:29 pm (utc) on Apr 16, 2011]
Good find, Nano. I edited my post to reflect that ad. I originally had 8 ad units, now I can see there are 9. I would also count those as in-content ads, which is probably something Panda is looking for.
@Nano Aren't those links like the kind provided by those contextual link providers like Kontera, Infolinks, Vibrant Media?
Those are used by a LOT of sites, way more than the 12% affected by Panda.
On their own I doubt they're a problem. Maybe combined with all the other stuff, some small influence though.