|Definition of thin content. What is it exactly?|
|If you see this message on the Manual Actions page, it means that Google has detected low-quality pages or shallow pages (such as thin affiliate pages, cookie-cutter sites, doorway pages, automatically generated content, or copied content) on your site. These techniques don't provide users with substantially unique or valuable content, and are in violation of our Webmaster Guidelines. [support.google.com...] |
|some webmasters attempt to improve their pages' ranking and attract visitors by creating pages with many words but little or no authentic content. Google will take action against domains that try to rank more highly by just showing scraped or other cookie-cutter pages that don't add substantial value [support.google.com...] |
Fair enough Google.
To handle this subjectively with analytics, I guess folks could quickly identify pages with high bounce rates and low time on site, then form a view on whether the content is "rich" or not, sufficiently differentiated or not.
Easy .... or not ?
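For anyone who wants to try the analytics check above, here is a minimal sketch. It assumes you have already exported per-page bounce rate and average time on page from your analytics tool. The `PageStats` class, the `flag_for_review` function, the sample URLs and both thresholds are made up for illustration; nothing here reflects how Google itself judges a page.

```python
from dataclasses import dataclass


@dataclass
class PageStats:
    url: str
    bounce_rate: float  # fraction of sessions that left after one page, 0..1
    avg_seconds: float  # average time on page, in seconds


def flag_for_review(pages, max_bounce=0.80, min_seconds=20.0):
    """Return URLs whose bounce rate is high AND time on page is low.

    Both thresholds are arbitrary starting points (assumptions), not
    published cut-offs. Tune them against your own site's baseline.
    """
    return [
        p.url
        for p in pages
        if p.bounce_rate > max_bounce and p.avg_seconds < min_seconds
    ]


# Hypothetical export from an analytics tool.
pages = [
    PageStats("/widgets/red", 0.92, 8.0),
    PageStats("/widgets/guide", 0.55, 140.0),
    PageStats("/weather/london", 0.95, 35.0),
]

print(flag_for_review(pages))
```

A flagged page is only a candidate for a manual look. A page that answers a question instantly can also bounce a lot, so the numbers narrow the list but the "rich or not" call is still a human one.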
In reality it would seem to be a more complex issue to both define and rectify, which is probably the reason it's not been talked about in sufficient objective detail regarding site quality issues. Indeed, are "quality" and "thin" overlapping terms or completely separate?
It would be great to listen to how webmasters are approaching it, and what "left field" creative inputs could be shared here on what I might have missed.
In the context of "add value", can anyone delve deep and come up with some inputs that might help others to avoid thin content?
If the goal is to do better against a machine learning algorithm with an unknown set of criteria, I suppose the best you can do is to look at what is ranking well.
Take the top 10 results across a few different, but related keyword searches. (cheap widgets, affordable widgets, widgets on sale).
Identify what attributes they have in common, and see if there's a sweet spot. Try to stick to things that a machine learning algorithm would have access to.
- Content length.
- Semantically related terms
- Known author / authorship, perhaps previously published on the topic?
- Outbound links to authorities on the topic (assuming you can find non-competing things to link to?)
- Richness of html markup. For example, are the ranking pages full of things like data tables, <em> style quotes, bulleted lists, etc?
- Flesch-Kincaid readability score (grade level). Perhaps certain niches would be expected to score higher (academic articles), others mid-level (ecommerce), and others lower (niches with children as the primary audience)
- Internal links. Is the content contextually linked from other content on the site that's in the same topic area?
- etc. Google's "quality rating guidelines" might be a good place to find more.
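Two of the attributes in the list above (content length and reading level) are easy to measure yourself. The sketch below is one rough way to do it, assuming you have already saved the visible text of each top-ranking page. It uses the standard Flesch-Kincaid grade-level formula, but the syllable counter is a crude vowel-group heuristic of my own, and `page_metrics`, `sweet_spot` and the sample texts are hypothetical names and data. Whether any ranking system looks at these numbers is unknown.

```python
import re
from statistics import median

WORD_RE = re.compile(r"[A-Za-z']+")


def count_syllables(word):
    """Rough heuristic (assumption): count vowel groups, minus a trailing 'e'."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and n > 1:
        n -= 1
    return max(n, 1)


def fk_grade(text):
    """Flesch-Kincaid grade level: higher means harder to read."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = WORD_RE.findall(text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


def page_metrics(text):
    """Word count and grade level for one page's visible text."""
    return {"words": len(WORD_RE.findall(text)),
            "fk_grade": round(fk_grade(text), 1)}


def sweet_spot(values):
    """Return (min, median, max) so you can see where the top results cluster."""
    ordered = sorted(values)
    return ordered[0], median(ordered), ordered[-1]


# Hypothetical page texts standing in for the top results of a search.
top_results = [
    "Cheap widgets. Buy now. Free shipping on all orders.",
    "Our widgets are built to last. Compare models and prices below.",
    "Comparative analysis of international manufacturing regulations "
    "demonstrates considerable variability.",
]

metrics = [page_metrics(t) for t in top_results]
for m in metrics:
    print(m)
print("word count (min, median, max):", sweet_spot(m["words"] for m in metrics))
```

With only ten pages per query, the median and range are more honest than an average. If your own page sits far outside the range on several attributes at once, that is at least a hint about where to look.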
I'll admit, though, that in some niches, the only thing that seems to matter is authority, even over relevance. Amazon can rank #1 for "red widget", with a single "blue widget" page...even though the item's not the right color, and has been out of stock for over a year :)
Most times if it doesn't take any thought or time to create the page, or is automagically built from a data query, or is obvious cut and paste, scraped, or near useless except for hanging ads, then that is thin content.
I think that many who complain about a loss of rank, or even a penalty, already know why that happened.
|It would be great to listen to how webmasters are approaching it |
I have sites with lots of thin content (all but one of which do very well in Google). I try to make sure it's *useful* thin content, and I try to compile it, arrange it and display it in a more useful way than can be found elsewhere. (No extra clicks, important information on top, easily found from the home page, quick loading, available on multiple devices, sorted in logical ways, etc.)
|Identify what attributes they have in common, and see if there's a sweet spot. Try to stick to things that a machine learning algorithm would have access to. |
One thing that I see amongst large, high ranking sites is absolutely thin content ranking better than sites with lots of written content.
Lots of text seems to be a turn-off to users - folks want their information quickly and to be able to interact fast.
|I try to make sure it's *useful* thin content |
This is compelling. Thin can be *useful*. Less is more.
Anyone with similar thoughts?
|Anyone with similar thoughts? |
This may go slightly off-topic, but I'll keep it short.
Truthfully, yes, however I'm NOT going to tell anyone in a public forum since it does nobody any favours giving the blackhatters even more armoury than they already have and mostly provided by some of us, the white hatters!
It is very easy to dissect Google's algorithmic preferences. The question has to be, for those who do not understand it, why is this supposedly complex algorithm so freaking easy to "play"?
Thin with Google, for me, is a no-brainer, they just love that crap and even more so if it's on Blogger...guaranteed rankings...way to go G!
Surely everyone realises that the biggest blackhatter itself is Google? Every piece of "SEO advice" it gives, it blatantly ignores and abuses. It hotlinks copyrighted images, it has nothing but 100% advertising above the fold, it rapes every website in the world with absolutely no acknowledgement.
Honestly, it's pathetic...in the UK we say Jobsworth, now it's Googlesworth! I'm stopping writing now..!
|Lots of text seems to be a turn-off to users - folks want their information quickly and to be able to interact fast. |
It depends on the subject, the content, and the user. According to Google Analytics, more than 30 percent of our site's pageviews come from readers who look at 20+ pages, and 20 percent come from readers who stay on the site for 30 minutes or longer. (Granted, most people don't hang around the site that long, but those numbers do suggest that "Web" and "reading" are concepts that can co-exist.)
"Thin content" has its place, of course. If I want to know the temperature in London, two digits, a degree mark, and a "C" or "F" are good enough for me.
|I try to make sure it's *useful* thin content |
@Netmeg - do you consider scripted content with wildcard insertions to be ok?
[ I'm seeing sites with this outrank sites with tons of unique content - probably Google discounts unique content they consider adds no value ].
|@Netmeg - do you consider scripted content with wildcard insertions to be ok? |
Ok if I'm searching and it answers whatever I'm searching for.
But as a publisher/webmaster, I probably wouldn't put it out there that way myself. Because on the whole, I don't think that model is really sustainable.
And that's a big issue with thin content, even if it's useful (maybe especially if it's useful). That's the type of thing that Google is more likely to suck up and spit out in the Knowledge Graph. So while it might work now, in a year or two... who knows?
I'd like to appreciate the difference between absolutely thin content and relatively thin content.
For example, let's say that there are 50 pages on widgets. You create a page that has very little information about the widget. It's obviously thin because there are 49 other pages that have more information.
Now let's say that you create a page about an obscure topic, maybe a rare widget only found in Borneo. No one else has information on this widget, you did all the research on it. Should you get a hit for having a "thin" page?
From all accounts I've heard, Google will penalize your entire site (or at least a section of it) due to thin content. Or that's at least what people believe. I'd like to know if that is true, or if only the thin content gets penalized. Or is relatively unique content not penalized because Google recognizes that it is unique?
|Now let's say that you create a page about an obscure topic, maybe a rare widget only found in Borneo. No one else has information on this widget, you did all the research on it. Should you get a hit for having a "thin" page? |
Probably depends on how USEFUL that widget info found in Borneo really is. I mean I can crank out thousands of pages with 25 or 50 words on things that very few people search on or engage with or share, or I can crank out ONE page with 25 or 50 words on something that people really need or need to know, and find some way to make it more attractive and useful than the other gazillion people who've written 25 or 50 words about it, and I'll probably do better with the latter than the former.
If you want a line drawn in the sand where the thin content is not too thick and not too thin but just right, though - there isn't one.
|From all accounts I've heard, Google will penalize your entire site (or at least a section of it) due to thin content. Or that's at least what people believe. I'd like to know if that is true, or if only the thin content gets penalized. Or is relatively unique content not penalized because Google recognizes that it is unique? |
Again - it's not just uniqueness, it's also *usefulness*. And it's not a penalty (I know I know) as much as a failure to promote.
Personally I think Google's all about percentages, and you have to look at your site in terms of percentages too - and again, it's not hard and fast numbers, but your percentage of USEFUL should be higher than your percentage of thin, of filler, of navigation, of unique, etc etc.
In 2014, useful is the new black.
Here's one observation about "thin content", at least as it relates to my site.
My way was to stick with the "facts and just the facts" please.
So I've got a bunch of photo pages, each with a few "facts" about the specific widget in the photo.
The problem with that is that facts are facts and so any site can use them, and they do, in droves. Which means little if any unique content on my pages except for the photos, which I've traveled tens of thousands of miles in the last 13+ years to shoot.
I deliberately stuck to "facts" because I really didn't and don't want to get into expressing opinions about these widgets beyond something like "Here's a cool widget", (and how many times can you say that?).
I wasn't trying to build an all encompassing encyclopedia of widgets, just a simple visual history with a few facts thrown in to add a little interest.
The result is I'm apparently fully cooked and ready to be tossed out with the trash :)
When it comes to obscure widgets, I've got photos of quite a few. For example, when I photographed one of the first really obscure widgets I saw, the SERPs returned 3 listings, yes that's THREE. I just did the search now and it says "About 277,000 results (0.25 seconds)". To put that in perspective, at the last national convention of owners of these widgets, NINE widgets showed up, NINE, plus a couple in the widget museum where the event was held.
On top of that, more than one copyright violator has built a website with nothing but my photos, oh joy!