Forum Moderators: Robert Charlton & goodroi


Understanding Panda - Thin Content vs Low Interest Content


getcooking

4:17 pm on May 10, 2012 (gmt 0)

10+ Year Member



I've got a 17-year-old site that was hit with Panda 1.0 and every iteration since. Yay me.

I've identified the problem areas and have been working to fix things up, but so far no recovery. While I do still have some thin content on the site that hasn't been beefed up yet, I also noticed that a lot of my pages target very low volume, obscure search terms. They are very valid in our niche, but Google's AdWords keyword tool shows they only yield a few hundred queries globally per month. It made me start to wonder whether my problem is not so much thin content as too much low interest content. Could having too many pages that look targeted to obscure terms be hurting our higher volume terms? What's interesting is that the low volume pages rank very well for those obscure terms - usually in the top three positions, and frequently at #1. However, our higher volume terms have all been demoted by Panda. Thoughts?

And a related question: what are current thoughts on noindexing offending pages for Panda recovery? I see some people say it works, and for others it doesn't. My plan was to noindex low volume and thin content pages until I could either develop them or merge/301 them, but I don't want to make Panda any angrier with me either.
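For what it's worth, here is a rough Python sketch of how I'm thinking about triaging pages under that plan. The thresholds and the example pages are entirely made up for illustration - they are my own guesses, not numbers Google has published:

MIN_WORDS = 300            # below this I treat a page as "thin" (my own guess)
MIN_MONTHLY_SEARCHES = 50  # below this I treat the term as "low interest" (my own guess)

def triage(page):
    # Classify a page per the plan above: merge/301, noindex for now, or keep.
    thin = page["word_count"] < MIN_WORDS
    low_interest = page["monthly_searches"] < MIN_MONTHLY_SEARCHES
    if thin and page["has_parent_category"]:
        return "merge into parent + 301"
    if thin or low_interest:
        return "noindex for now, develop later"
    return "keep indexed"

# Hypothetical pages, just to show the output
pages = [
    {"url": "/category/obscure-term/", "word_count": 120,
     "monthly_searches": 20, "has_parent_category": True},
    {"url": "/category/big-topic/", "word_count": 900,
     "monthly_searches": 5000, "has_parent_category": False},
]

for page in pages:
    print(page["url"], "->", triage(page))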

getcooking

6:06 pm on May 15, 2012 (gmt 0)

10+ Year Member



I don't think I have since Panda hit (I did change things a few times pre-Panda). I have thought about it though. I have 10 results max per page right now. I used to have 20 per page but decreased it when I added more content for each article listing. Perhaps bump it up a little so more articles are listed? More relevant "meat" for the page? Oh man, I'll kick myself if changing from 20 to 10 articles per page is what triggered Panda. I don't remember when exactly I did that - it was pre-Panda though.

Some categories don't have more than 10 articles. None have fewer than three though (and I'm working on consolidating some of those thinner ones). Some of the categories with only a few articles actually rank really well - which is what has always thrown me about the "thin content" aspect.

FaceOnMars

6:52 pm on May 15, 2012 (gmt 0)

10+ Year Member



gc: It may help, but I truly don't know if this is a significant factor. I just know it's something I can control and figured it was worth a try. I haven't seen any noticeable increase since I raised my results from 20 to 50 per page. I was hoping it might provide more "meat", as you alluded to, in terms of categorical integrity by including more related semantic content on each category page.

As far as some of the thin content pages ranking very well: maybe the competition isn't quite as high for that particular topic? Or there was something very unique/interesting about that particular content ... perhaps along with some deep backlinks to those pages from external sites?

Is there a high degree of consistency of semantic content on those category pages which don't rank well? In other words, do you have much "spillover" or duplication of "product info" across different categories? (This is one issue I need to resolve on my site, but it involves a policy issue and will take time.)

getcooking

7:42 pm on May 15, 2012 (gmt 0)

10+ Year Member



As far as some of the thin content pages ranking very well: maybe the competition isn't quite as high for that particular topic? Or there was something very unique/interesting about that particular content ... perhaps along with some deep backlinks to those pages from external sites?


Competition may definitely play a role in this. My thinner content for less competitive terms often (not always) ranks great. That may be just because of the sheer lack of competition. However, these thin pages (despite ranking well) are probably the reason why many of my more competitive terms aren't ranking. The thinner categories don't have very good backlinks either; I think it's just the lack of competition that keeps them ranking at all. My good categories with good backlinks still rank very well in many instances. There is just the gray area between thin content, a good backlink profile, and competition that I need to figure out.

I think that is what's been throwing me off so much about where to begin on fixing my site. Just because a page ranks doesn't mean it's not the problem. I've been afraid to modify any pages that still ranked well but perhaps I've been approaching this wrong.

Andy Langton

8:18 pm on May 15, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Google simply disregards internal duplicate content (vs. penalizing)


I don't really like the phrasing of this ;)

It's true that there is no 'penalty' for duplicate content. But what happens in most cases is that Google drops all but one of the duplicates to prevent repetition in its results - and it might not keep your 'first choice' URL - or the version you're trying to rank. In addition, you lose the value of any ranking signals going to the duplicates that Google drops.

So it's not a problem you can ignore. While Google's solution works fine for Google and keeps its results cleaner, that isn't the same as working well for you - duplication is usually a problem worth addressing.

FaceOnMars

9:07 pm on May 15, 2012 (gmt 0)

10+ Year Member



It's true that there is no 'penalty' for duplicate content. But what happens in most cases is that Google drops all but one of the duplicates to prevent repetition in its results - and it might not keep your 'first choice' URL - or the version you're trying to rank. In addition, you lose the value of any ranking signals going to the duplicates that Google drops.


I see what you're getting at regarding duplicate content in general terms ... such as comparing page A to page B as exact duplicates, but I suppose the issue I believe getcooking's site & mine might entail is a bit more nuanced. In other words, for any given category there will be an internal results list of 10 listings. The results list only displays a short summary of each listing (similar to the way Google displays results) ... and the visitor can click a "read more" link to see the full listing. So, the summary might be 10-20% of the full listing's total content. Consequently, if the summary for each listing is displayed under 10 different categories (along with other listing summaries which might also be cross-referenced under multiple categories in a similar manner), at what point does Google consider there to be a duplication issue? And if so, how does it handle such a case?

It's easy enough to drop page B in favor of page A, but this is more of an embedded internal duplicate content issue vs. a 1:1, all-out, self-contained copy. So I can't help but wonder whether, instead of simply dropping one page in favor of another, they're still taking the same approach but are now tallying up some sort of penalty along these lines? Perhaps they pick the best version, then calculate a fractional dilution of rank (based on the number of places the duplicate content appears elsewhere) for any page on which other "copies" of that embedded content exist?
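Just to put a number on the kind of overlap I'm describing, here is a quick Python sketch that compares two category pages using word "shingles" and Jaccard similarity - a generic duplicate-detection idea, not anything Google has confirmed it uses. The sample text is obviously made up:

def shingles(text, n=5):
    # Break text into overlapping n-word chunks.
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def overlap(page_a, page_b):
    # Jaccard similarity of the two pages' shingle sets, between 0 and 1.
    a, b = shingles(page_a), shingles(page_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

# Two hypothetical category pages built from the same listing summaries
category_1 = "widget stew listing summary text here quick widget soup listing summary text here"
category_2 = "quick widget soup listing summary text here widget salad listing summary text here"

print("shared shingles: {:.0%}".format(overlap(category_1, category_2)))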

FaceOnMars

9:21 pm on May 15, 2012 (gmt 0)

10+ Year Member



Competition may definitely play a role in this. My thinner content for less competitive terms often (not always) ranks great. That may be just because of the sheer lack of competition. However, these thin pages (despite ranking well) are probably the reason why many of my more competitive terms aren't ranking. The thinner categories don't have very good backlinks either; I think it's just the lack of competition that keeps them ranking at all. My good categories with good backlinks still rank very well in many instances. There is just the gray area between thin content, a good backlink profile, and competition that I need to figure out.

I think that is what's been throwing me off so much about where to begin on fixing my site. Just because a page ranks doesn't mean it's not the problem. I've been afraid to modify any pages that still ranked well but perhaps I've been approaching this wrong.


I've generally had a similar experience: thin content for low-competition terms still ranks well, while high-competition categories are more prone to slip in the index ... with a few exceptions on the positive front.

I'd also like to know if there's something to the idea of a cumulative site-wide "penalty" which might be triggered by either too much thin content in some areas or perhaps excessive duplicate cross-categorical content. If so, it would help to know whether it's an "all or nothing" proposition - a step function where you cross a threshold and part or all of your site is put into a "penalty" situation and sandboxed into a tier - or more of a continuous curve, where you remove X amount of thin/duplicate content and should expect to see a corresponding amount of recovery.

Andy Langton

8:33 am on May 16, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I see what you're getting at regarding duplicate content in general terms ... such as comparing page A to page B as exact duplicates, but I suppose the issue I believe getcooking's site & mine might entail is a bit more nuanced


Nuanced, perhaps, but the principles remain very similar. One thing to keep in mind is that how similar two things are is context-specific, as well as a general indexing factor. So, you may be considered too similar right from the beginning, or too similar to match a particular keyword search.

In terms of category pages where it's essentially rearranging the same snippets, I'd say you're going to struggle to consistently rank such pages. For one thing, Google has to be able to detect simple rearrangements of text, because of the volumes of spam created by taking other content and spinning/re-ordering and similar techniques.

"Generated" content is something Google doesn't particularly want to rank, and IMO category pages, tag pages and similar often fit that model pretty well.

getcooking

9:03 am on May 16, 2012 (gmt 0)

10+ Year Member



In terms of category pages where it's essentially rearranging the same snippets, I'd say you're going to struggle to consistently rank such pages


I completely agree with this. I think if that's the case then maybe the category structure needs adjusting.

On my site we have over 120K "articles" in over 3K categories. There are some articles that are in 3 categories at most, but typically each article is only in one category.

mendel36

12:46 pm on May 16, 2012 (gmt 0)

10+ Year Member



I think the most interesting aspect of this thread is what is defined as "low interest" - which is to say, how would I measure it?

Some standards:

1. Number of SERPs (measurable and obvious)
2. Bounce Rate (but really only from Search)
3. Rankings (maybe - but too volatile)


another idea:

4. Inbound Links: I would say this is closer, meaning that if nobody links to it (references it), then we might say it's interesting but irrelevant.

5. Referral Visits: While we tend to be focused on SERPs, this measure has the most potential to be a mathematical flag for "low interest".

Simply put, if nobody visits the page except from Google, then you would expect it to drop.

Just a thought.

getcooking

1:08 pm on May 16, 2012 (gmt 0)

10+ Year Member



Simply put, if nobody visits the page except from Google, then you would expect it to drop.


That isn't always the case for me, see my earlier comment:
An interesting note is that visitors are visiting these pages via our site navigation and internal search. So they are coming to the site from a broader search term but navigating to the more specific pages. They just don't seem to be searching Google for that specific page. I'm not quite sure what to make of that or how it might be affecting things.


I suppose in this case it's just that the visitor isn't familiar enough with the terms they are actually looking for, but I guess my original thought was: if no one is searching for it, does it look odd to Google to have a lot of those pages on a site?

mendel36

3:29 pm on May 16, 2012 (gmt 0)

10+ Year Member



I suppose in this case it's just that the visitor isn't familiar enough with the terms they are actually looking for, but I guess my original thought was: if no one is searching for it, does it look odd to Google to have a lot of those pages on a site?


I think that is probably a good thought. Internal links and internal search results just aren't strong enough signals to make a page look important.

But this makes sense, right? It is like a librarian dropping books and magazines because they just are not circulated enough. They are taking up time and space, so let's find better resources.

Again, this is not a knock on the quality of the content, but rather just a mathematical signal.


Why not cleave sections of the site that have low to no volume and put them into a dormant library to reintroduce later?

Maybe it's a Hail Mary, but one of the ratios we harp on is the number of URLs in our site map that receive regular traffic versus total URLs. If this ratio is below 75%, it shows that we are producing content with very little value.
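In case it helps anyone reproduce that ratio, here is a bare-bones Python sketch. It assumes you have sitemap.xml on disk and a CSV export (traffic.csv, with a "url" column) of landing pages that received visits over the period - the file names and the 75% figure are just our own conventions, not anything from Google:

import csv
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_urls(path):
    # All <loc> URLs listed in the sitemap.
    tree = ET.parse(path)
    return {loc.text.strip() for loc in tree.getroot().findall(".//sm:loc", NS)}

def trafficked_urls(path):
    # Landing pages with visits, exported from analytics as a CSV with a "url" column.
    with open(path, newline="") as f:
        return {row["url"].strip() for row in csv.DictReader(f)}

all_urls = sitemap_urls("sitemap.xml")
visited = trafficked_urls("traffic.csv") & all_urls

ratio = len(visited) / len(all_urls) if all_urls else 0.0
print("{} of {} sitemap URLs get regular traffic ({:.0%})".format(len(visited), len(all_urls), ratio))
if ratio < 0.75:  # our own rule of thumb, per the post above
    print("Below 75% - we are producing a lot of content with very little value.")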

getcooking

4:21 pm on May 16, 2012 (gmt 0)

10+ Year Member



Again, this is not a knock on the quality of the content, but rather just a mathematical signal.


So, if this is the case: no matter how great the quality of the content, if no one is searching for it Google might still not think it's an important page. Which would mean beefing up any thin pages that just don't generate traffic from search might not help with a Panda recovery.

I'm also interested in the ratios you mention. That's something I'm going to look at.

I'm still torn on how to proceed though. Say I have a thin page (a sub-category page with 4 articles listed).

I see my options as:
a) noindex the category but leave it as is for the users that navigate to it rather than search for it - but, as someone else mentioned, I then reduce my overall potential landing pages [EASIEST]
b) move the articles to the parent category, which is now 4 articles bigger and might already have 100 articles (see the sketch at the end of this post) [SOMEWHAT EASY]
or
c) try to build up the content on the page (introductory text, extra features), but it will still only have 4 articles for the time being (this might change in the future as more articles are added). [NOT EASY]

I'm convinced that these thinner pages (that oddly rank well) are a significant part of my problem. My "thicker" pages that don't have strong backlink profiles but target more competitive terms are the ones that seem to have been hit the hardest (ones with strong backlinks are unscathed).

I've read a few articles/studies now that said the key to recovery for them was adding more content, while other members here say they've added content to no avail. What I haven't seen mentioned is the competitiveness of the terms for the pages the content was added to, which is why I started wondering if it could be a factor.

I need to commit to a process on how to proceed soon. I'm just afraid to commit to the wrong one. It seems like the option that is the most work would probably be what Google would insist on <grin>
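Regarding option (b): if I do end up merging sub-categories into their parents, this is the sort of quick check (Python, using the third-party requests library) I'd run afterwards to confirm the old URLs really return a 301 to the right parent. The URLs in the mapping are placeholders, not my actual site:

import requests

# Hypothetical mapping of merged sub-category URLs to their parent categories
redirect_map = {
    "https://www.example.com/articles/obscure-sub-category/":
        "https://www.example.com/articles/parent-category/",
}

for old_url, expected in redirect_map.items():
    resp = requests.get(old_url, allow_redirects=False, timeout=10)
    location = resp.headers.get("Location", "")
    ok = resp.status_code == 301 and location == expected
    print(old_url, "->", resp.status_code, location, "OK" if ok else "PROBLEM")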

indyank

5:15 pm on May 16, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I think that is probably a good thought. Internal links and internal search results just aren't strong enough signals to make a page look important.

But this makes sense, right? It is like a librarian dropping books and magazines because they just are not circulated enough. They are taking up time and space, so let's find better resources.


Google might be doing this, but it just doesn't sound great to me. A library where I can't find the book I'm looking for isn't much good to me. I don't care how well the book circulates; I'm looking for that obscure book precisely because it has hard-to-find information.

So if this is how the Google algo functions, it doesn't sound good to me.

But mendel36, there is a difference between your example and getcooking's problem. You are talking about books on popular topics that aren't getting circulated, either because of poor quality or poor marketing. A page or site might be great in quality, but if it isn't marketed, it just doesn't stand a chance against the established and trusted sites. Marketing needs a good budget, and that is exactly what Google wants. This is another good reason for Google to focus on user metrics :)

If I understand getcooking right, his pages are on topics that aren't very popular in terms of Google search volume. He also clearly mentions that they do rank well for their keywords. It is just that the traffic to them isn't great, and if Google were to use a signal like the ratio of high-traffic pages to medium/low-traffic ones for Panda (which we all know has a sitewide quality score), it just wouldn't make sense.

Rasputin

5:29 pm on May 16, 2012 (gmt 0)

10+ Year Member



I have an interesting example of pages of very low interest and very few visitors on a site that has recovered from panda:

I have a travel site for a country, and the writer of all the articles is a resident of the country - well travelled, he is also a university professor of literature. As a result, every travel article he sends me is accompanied by a very detailed explanation of the etymology of the place name and a thorough history. Because these are not travel related, I post them separately from the main articles.

The number of people who search for the origins and etymology of a place name is, to say the least, small, but some of the pages attract interest (and links) from writers, history specialists and Wikipedia authors. Even though very few people look at these pages, they are still an important reference on their subject - and I am not seeing the site penalised as a consequence.

indyank

3:39 am on May 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Rasputin, Thanks for that relevant feedback.

1) When was the site hit by Panda and when did it recover?
2) Was the recovery over a period of time or sudden?
3) What is the ratio of high traffic to low traffic pages for the site?
4) What is the traffic per day/per month like for the site as a whole, and what percentage of it comes from your high traffic pages?

Rasputin

4:44 am on May 17, 2012 (gmt 0)

10+ Year Member



- site was in Panda from April 11, 2011 to April 27, 2012
- recovery was sudden and 100% of original traffic recovered
- not sure how to calculate the ratio since traffic is spread across hundreds of pages; there are very few individual pages that get significantly higher volumes. The history / etymology pages get less than the travel pages (although a lot of those are about quite obscure places and get minimal visitors).

The whole site would generally be considered 'low traffic', even out of panda.

Interestingly, our main (high traffic) site (same principle, different destination, no 'history' type pages) had the same period in panda and was worked on continuously for a year to get it out. The smaller site had very little done to improve it except new articles were posted as our contributor sent them (about one a week). So I can't help wondering if the main site would have escaped panda without all the effort, and the year's work played no part in the recovery.

indyank

5:56 am on May 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The whole site would generally be considered 'low traffic', even out of panda.


There lies the answer. I haven't come across an example of a really high traffic site recovering from Panda, except for those really big content farms that have clout.

Rasputin

6:35 am on May 17, 2012 (gmt 0)

10+ Year Member



As an indication of size, the main travel site I referred to got out of Panda and gets about 25k pageviews a day, so presumably it is still small in terms of what you refer to as a 'high traffic' site.

indyank

6:57 am on May 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You are right. Even I am not sure what counts as a high traffic site for the purposes of Panda. But yes, I was thinking of sites with 1.5 - 2 million PVs and above.

But that is decent traffic to recover. It could even be that your smaller site recovered because of your main site. Are the two interlinked or under the same Webmaster Tools account?

Rasputin

7:32 am on May 17, 2012 (gmt 0)

10+ Year Member



They share a Webmaster Tools, Analytics and AdSense account, so there is no difficulty for G to associate the sites.

Interlinking is a tricky question - currently the main site does link to the other site, and also to a parent site that links to all our other group travel sites (with 'follow' type links).

But during the course of the year I have taken the links off and added them back several times. The conundrum is that on the Google forums the 'experts' (definitely in inverted commas) often claim sites are penalised because of group interlinking - but when I look at the high traffic brand name groups, they are allowed excessive interlinking with no visible penalties.

So really I have no idea whether I should do this or not.

It is possible that when the main site was pandalised the penalty was carried across to the second site, and when it was released from panda the penalty no longer applied, which helped the second site recover. With so many variables it is very hard to know what was important.

Planet13

12:47 pm on May 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



@ getcooking

I'm still torn on how to proceed though. Say I have a thin page (a sub-category page with 4 articles listed)...


I would base the STRUCTURE of your site on what is best for your users.

Then I would use noindex / robots.txt / canonical link tags to prevent indexing of what would be considered thin content.

If your users would find a page with only 4 articles on it helpful, but Google would find it thin, then create that page and noindex it until you have enough other content on there that it wouldn't be thin.

In the meantime, if it is truly a useful page, then maybe, despite being thin, it will start getting a lot of social love. If enough Chrome / Android users are visiting and loving that page, then Google might smile upon it too (assuming that Google analyzes user data from Chrome and Android users).

It sounds like you have a pretty large number of categories and articles, so you might need different forms of faceted navigation - which usually leads to "duplicate" content. So again I would look into making your content as easy to find as possible for your users via different search / navigation methods and be very diligent about what you DON'T want allowed in the search engines.
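To make that concrete, here is a small Python sketch of the kind of rule I mean: per category page, decide whether to emit a robots noindex meta tag (useful but thin for now) or a canonical link (a faceted/sorted duplicate of a main page). The 10-article threshold and the URLs are purely examples, not known Google cutoffs:

def head_tags(article_count, is_faceted_view, main_view_url=None):
    # Return the head tags I'd emit for a category page under this policy.
    tags = []
    if is_faceted_view and main_view_url:
        # Faceted/sorted duplicate of a main category page: point at the main version.
        tags.append('<link rel="canonical" href="{}">'.format(main_view_url))
    elif article_count < 10:
        # Helpful for visitors, but too thin to index for now.
        tags.append('<meta name="robots" content="noindex, follow">')
    return tags

print(head_tags(42, True, "/recipes/soups/"))   # faceted view -> canonical tag
print(head_tags(4, False))                      # thin sub-category -> noindex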

RP_Joe

2:38 am on May 23, 2012 (gmt 0)

10+ Year Member Top Contributors Of The Month



I have a site with 250 or so pages. At least 100 are low traffic. The content is good and original, and the bounce rate on those low traffic pages is very low.
Every month for the last year, Google sends about 10-15% more traffic.
It's not necessarily good traffic - by that I mean it's targeted traffic, but most visitors do not have commercial intent. But I realize that's the price I have to pay to reach people with commercial intent. I have to build a site that helps Google with lots of content, and then I get access to the buyers.

I am not sure Google has publicly stated it in those terms, but that's what I believe.

tedster

3:07 am on May 23, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



RP_Joe, I think you have a healthy sense of what's right here. If the very few visitors who do a search and visit the page engage with it well, you can relax.

If, from all signs and signals, the visitors seem to appreciate the content, then it surely isn't shallow content and it's not a Panda risk in my books - it's long tail content, well done.

Planet13

1:48 pm on May 23, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



RP_Joe

Every month for the last year, Google sends about 10-15% more traffic.


My site was like that all of last year. Similar "non-commerce" traffic.

The different iterations of Panda all led to an INCREASE in traffic to my site. I was seeing a 100% increase in traffic in February of 2012 over the same month in 2011.

Things were going great - for a while...

My site lost about 30% of its traffic in the March 23rd, 2012 Panda update, and I lost another large portion of traffic RIGHT AFTER Penguin (on April 26th, not on the 24th).
This 54 message thread spans 2 pages.