
Google SEO News and Discussion Forum

    
Panda fix - content size doesn't matter
Whitey




msg:4332284
 10:37 pm on Jun 28, 2011 (gmt 0)

Dude, it sounds like you've read too many tin foil hat "SEO" articles. As long as your site has worthwhile, original content on it (i.e. it's not just made to put AdSense on it), there's not a golden "text-to-ads ratio" or a word limit for ranking. I worry that you're just looking at the trees and not seeing the forest. Optimizing your site for search isn't about counting the words on a page, it's about making sure that you have useful, usable content, and then making that content accessible to search engines.

To put this another way: I have never ever seen a site where everything was great and it would have ranked well except its articles were only 200 words long. That's just... not the way the algorithm looks at stuff. If your site isn't ranking well, the cause must be elsewhere.

Susan Moskwa, Google Employee [google.com...]

 

deadsea




msg:4332339
 1:44 am on Jun 29, 2011 (gmt 0)

So this person from Google says that words per article and text-to-ad ratios are not part of the Google algorithm. What does that leave that Google could be using to measure sites for shallow content?

My guesses:
1) Content farms have lower user engagement metrics -- people don't find the articles helpful, come back to the SERPs, and click on something else.
2) Content farms never update content, they only add content. You can tell content farms apart from news organizations that also work this way because the content farms write about "evergreen" topics that get hits month after month.
3) Content farms have many similar articles all focused on various high volume ways of phrasing the same query. Sites with deeper content tend to have fewer articles on a given subject, and titles that are not targeted at a high volume keyword phrase for the subject.
4) Content farms don't have pages that AREN'T focused on a keyword. Every page they have is targeted to some phrase with search volume.
5) Content farms have no overall site focus, but write about anything at all. You can tell this apart from wikipedia, maybe?
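
Guess 1 is the easiest of these to put into numbers. Below is a minimal Python sketch, assuming a hypothetical click log of `(site, returned_to_serp)` pairs; the log format, the function name `return_rates`, and the `.example` site names are all made up for illustration, and nothing here is known to be how Google measures engagement.

```python
from collections import defaultdict

def return_rates(visits):
    """Fraction of search visits per site that bounced back to the results page.

    `visits` is a hypothetical log: an iterable of (site, returned_to_serp) pairs.
    """
    totals = defaultdict(int)
    returns = defaultdict(int)
    for site, returned in visits:
        totals[site] += 1
        if returned:
            returns[site] += 1
    # Every site in `totals` has at least one visit, so no division by zero.
    return {site: returns[site] / totals[site] for site in totals}

# Made-up sample data, not real measurements.
sample_log = [
    ("farm.example", True),
    ("farm.example", True),
    ("farm.example", False),
    ("deep.example", False),
    ("deep.example", True),
]
rates = return_rates(sample_log)
```

A higher rate would only hint that visitors did not find what they wanted; on its own it says nothing about why.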

tedster




msg:4332345
 2:11 am on Jun 29, 2011 (gmt 0)

Maybe they are also processing the content directly - reading level, grammar, spelling, average sentence length, number of Latin-derived words compared to Anglo-Saxon words, number of repetitions of the query phrase (remember them saying that Panda 2.0 went further into the "long tail"?) and perhaps even more. Maybe there is some attempt to DIRECTLY measure content, in addition to looking for secondary and supporting signals.

Biswanath Panda is an expert in large-scale decision tree processing. The Panda algo would be complex because of two factors at least:

1. The construction program can range over a wide spectrum of data points - things I'm sure we never even considered in our wildest dreams. Google collects a LOT more data than they are actively using.

2. The decision tree could have such complex if-then looping logic that we'd also be very challenged to get the big picture.
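
To make the "direct measurement" idea concrete, here is a minimal Python sketch of a few of the surface signals mentioned above (average sentence length and repetitions of the query phrase). It is an illustration only: the function name `surface_signals`, the crude sentence splitting, and the choice of features are assumptions, and none of this is known to be part of Panda. Signals like these would at most be inputs to a learned model, not a verdict in themselves.

```python
import re

def surface_signals(text, query):
    """Compute a few crude text signals; purely illustrative, not Google's method."""
    # Split on runs of ., ! or ? that are followed by whitespace or end of text.
    sentences = [s for s in re.split(r"[.!?]+(?:\s+|$)", text) if s.strip()]
    # Treat runs of letters and apostrophes as words.
    words = re.findall(r"[A-Za-z']+", text.lower())
    avg_sentence_length = len(words) / len(sentences) if sentences else 0.0
    # Count case-insensitive, whole-phrase occurrences of the query.
    pattern = r"\b" + re.escape(query.lower()) + r"\b"
    repetitions = len(re.findall(pattern, text.lower()))
    return {
        "word_count": len(words),
        "sentence_count": len(sentences),
        "avg_sentence_length": avg_sentence_length,
        "query_repetitions": repetitions,
    }

# Made-up sample page text.
page = "Blue widgets are great. Buy blue widgets today! Are blue widgets cheap?"
signals = surface_signals(page, "blue widgets")
```

For the sample text this reports 12 words in 3 sentences, with the query phrase repeated 3 times.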

brinked




msg:4332346
 2:37 am on Jun 29, 2011 (gmt 0)

Why would content size matter? We have two different arguments here and we need to be meeting somewhere in the middle.

We start off with a discussion from a simple idea that content size may be a cause to be pandalised. Then on the opposite end of that you have tedster with such a complex idea (although much more realistic). Reading tedster's post makes me think so much for this time of night that it gives me a headache (no offense, Ted).

Instead of trying to get down to every little detail that panda might be about, why not go with a high quality format? Why not study high quality websites that have not faltered to panda or have been improved by panda, and just get a feeling for what google wants based on these winners.

You can study your opponent till the end of time, but you will never have the secret sauce, so why not work on your game as much as you can since that is all you can really control?

walkman




msg:4332357
 2:54 am on Jun 29, 2011 (gmt 0)

She mentions Donna's post on SeoRoundtable on Panda and said this is not about Panda.

Tedster, panda may measure all the above but anyone with a spell checker and a high school diploma will pass that. How can they tell that the info is "good"? That's the multi-billion dollar question.

It all depends how Google classifies you IMO, informational, forum, local, e-commerce...or any of the many subdirs. I would not recommend starting a site with just stubs, unless you are very famous or move that to junk.yahoo.com. Then it may be on front page.

tedster




msg:4332369
 3:15 am on Jun 29, 2011 (gmt 0)

It all depends how Google classifies you IMO, informational, forum, local, e-commerce...or any of the many subdirs.

I am as sure of that as I am of anything - it's an early branch on the tree.

We start off with a discussion from a simple idea that content size may be a cause to be pandalised.

And Susan Moskwa essentially de-bunked that once and for all. She is a pretty well-known Google figure, and I think she just did everyone a favor who reads her comment: the advice "Don't start hacking your site to bits just because not every page is over 400 words."

Whitey




msg:4332377
 3:33 am on Jun 29, 2011 (gmt 0)

It all depends how Google classifies you IMO, informational, forum, local, e-commerce

Spoke to a Google rep years ago who said start your site with unique content and build up. I doubt if anything's changed. At least we have confirmation that it's quality over quantity.

And quality now brings into play other metrics. Truly, I'm seeing large sites with less than 125 words of content on a page, but stunning usability sail through this update, actually increasing their traffic through Panda.

Many folks don't want to read lots of content and sometimes you don't need it for ranking. Often it could be a negative signal.

walkman




msg:4332383
 4:05 am on Jun 29, 2011 (gmt 0)

Funny, I was just checking Alexa's (yes I know) clickstream data. Almost the same % that come from Google go back to Google, even for Google darlings like Quora (200 words is a stretch there :)), news.com, cbs.com. Daniweb.com is within the range as well. There's usually less than 5% difference in the sites I tested but [alexa.com...] eHow has better metrics, less returned to Google, but almost 100% of CNN visitors that came from Google go back to Google. MSNBC apparently beats them so far, 2% more go to Google, as MSN sends them the bulk of traffic :). I know it's Alexa but presumably they measure all the same and large sites have enough Alexa user visits. We also have to find out what "go back to Google" means.

Do I get a cookie or what?

That thin content doesn't matter, I can testify to, at least for some types of sites. I'm back--month two--to around 70% of pre-Panda revenue, thanks in part to two such sites that keep gaining as my Pandalized site keeps losing.

almighty monkey




msg:4332440
 8:05 am on Jun 29, 2011 (gmt 0)

I thought the point of Panda was that it's less about content size, more about content/crap ratio.

So, if you have, say, 3 sentences of unique content, but no adverts, you don't get mauled by the panda. If you have 5 paragraphs of decent content, but a completely ridiculous amount of adverts burying it, you do.

Does that match other people's experience? None of my clients have been affected by Panda (mainly brochure websites, so no ads or thin copy) so I don't really have much of a data set to go on.

Tin foil hat SEO is a fantastic term, for the record.

HuskyPup




msg:4332462
 10:03 am on Jun 29, 2011 (gmt 0)

there's not a golden "text-to-ads ratio" or a word limit for ranking


I realised that several weeks back when one of my sites started climbing the G.co.uk SERPs and many pages are now top 3. All these pages are constructed the same with product name in the titlebar, meta description, on-page product name and the product 500 x 500 image, nothing else.

It's a retail widget site purely for simple informational purposes, it's not designed to do anything else.

At the same time I have a trade widget site which I have been experimenting with during Panda. These product pages had some informational text on them and were pandalised but not heavily. Subsequently I have beefed-up the information on them, tightened-up the SEO and ALL those pages are now back in the top 3 of G.co.uk.

I have my own theory as to what, and how, Panda is targeting specific sections of my two biggest sites and I'll be experimenting with that soon; however, I know, for me, that it's nothing to do with perceived crap, ad blocks or site construction.

freejung




msg:4332747
 9:30 pm on Jun 29, 2011 (gmt 0)

So this person from Google says that words per article and text-to-ad ratios are not part of the Google algorithm.

She most certainly did not say that. She said that can't be the only reason this guy is not ranking. That's not remotely the same thing.

Google is almost certainly measuring these things. What she's saying is that these factors by themselves will not cause you to be penalized or pandalized. These signals might still be used in conjunction with other signals to measure site quality.

The real meat of what she's saying is not about whether particular factors are part of the algo, it's a much broader point, which she clarifies in a later post:

once you've started counting the words on your pages and making changes "for SEO purposes, rather than user information," you've missed the point


If you're trying to cover a super complex in-depth topic in 200 words, you'd better make your article longer, not because you will be penalized for having too low a word count, but because you are not providing adequate information about the subject. For some other topic, 200 words might be perfectly adequate. In some cases it might be way too much -- maybe the best possible answer is a single short sentence.

The point is that this level of micro-analysis is entirely the wrong approach to SEO, in Google's opinion.

"...there are too many damn machines around here. We're all missing The Big Picture." - Peter O'Toole in "Creator"

The decision tree could have such complex if-then looping logic that we'd also be very challenged to get the big picture.


Even worse, it could be exhibiting emergent properties.

It is entirely possible that nobody, not even within Google, actually knows how the Panda algo works. Or if they do, it's because they can measure its behavior directly, not because they explicitly designed it to behave that way.

[edited by: freejung at 9:49 pm (utc) on Jun 29, 2011]

freejung




msg:4332753
 9:38 pm on Jun 29, 2011 (gmt 0)

Also, from my own experience: by far the most popular single page on my site, a page that was not pandalized at all despite most of the site taking a horrible beating... that page consists of five words (three if you don't count connectors) and an image.

It's just a damn good image. And the words simply state what it's an image of. That's all it needs, and it's doing just fine.

Whitey




msg:4332812
 12:06 am on Jun 30, 2011 (gmt 0)

That's all it needs, and it's doing just fine

I'm seeing this success replicated on large sites, where traffic is increasing, as I said earlier.

This Panda algo is clearly not influenced "alone" by the content size. I'm even questioning the quality aspect of content, which may not be enough.

What is really a worry is that many of the sites that I see ranking do so for techniques that Google is going to slap soon anyway. And for many others, very few could be confident about how they will rank forever. So what's left?

At least content size can be removed from the equation.

tedster




msg:4332843
 1:37 am on Jun 30, 2011 (gmt 0)

What is really a worry is that many of the sites that i see ranking, do so for techniques that Google is going to slap soon

Same as it ever was, eh?

Whitey




msg:4332849
 1:44 am on Jun 30, 2011 (gmt 0)

It's just that they seem more visible now. It's like Google removed one layer, to expose another .... but sure, as it always was perhaps :)

potentialgeek




msg:4333202
 6:25 pm on Jun 30, 2011 (gmt 0)

I just got done reviewing all my sites - about 20 - looking at data from January 1, 2011 to today.

The only sites which didn't get Pandalized had thicker content. The thin ones which had some pages with one line got nailed; whereas the sites with a paragraph or two, or three or more, survived.

I'm pretty convinced size is a significant issue. Not the only one, but it's clearly one.

Now the question is whether, just because anorexic pages can get you Pandalized, adding meat will get you un-Pandalized.

If memory serves, I think it was Google employee John Mueller in the Google Support forum who told one webmaster adding a few lines would help deal with Panda, but it sounds as if some webmasters here have already tried this approach without success.

In any case I might try it myself - at least on a few sites - just to see what happens.

JohnMu:

Our algorithms don't count characters, but they try to find unique content in a sense. I wouldn't worry too much about the length, I'd just make sure that you have pages about your software on your site that do not contain just the content that you're syndicating via the XML file. Adding a sentence or two is one way to do this; even better would be to make it completely unique.

[edited by: potentialgeek at 7:21 pm (utc) on Jun 30, 2011]

c41lum




msg:4333209
 6:35 pm on Jun 30, 2011 (gmt 0)

@potentialgeek I agree, I think size does matter unless you have thick useful content on every page.

Sgt_Kickaxe




msg:4333222
 7:01 pm on Jun 30, 2011 (gmt 0)

Size doesn't matter. I can show you pages ranked #1 that don't have a single word of unique content but they do have a unique title and unique url with MINIMAL duplicate content. Skinny, but not duplicate. A Flickr image page with no comments for example... you get the idea.

walkman




msg:4333251
 7:47 pm on Jun 30, 2011 (gmt 0)

Adding a sentence or two is one way to do this; even better would be to make it completely unique.

I really hope you aren't relying on this. I can show you where Google's advice on dupes /near-dupes was to 'let us sort it out'...and then came Panda.
What individual engineers said yesterday or a year ago can easily mean at least a 5+ month open-ended Panda vacation. They don't even include a return ticket.

suggy




msg:4333268
 8:15 pm on Jun 30, 2011 (gmt 0)

Adding content for the sake of beating the Panda alone is definitely not the answer!

freejung




msg:4333291
 9:00 pm on Jun 30, 2011 (gmt 0)

Adding a sentence or two is one way to do this; even better would be to make it completely unique

Google's public spokespeople tend to word things very carefully, and you have to exercise fine reading comprehension skills to understand what they mean.

In this case, JohnMu is not telling the webmaster that having more content would help. He is saying that having unique content might help. One way to achieve unique content is to add more words to the duplicate content you already have. Another way is to remove the duplicate content and replace it with something unique. Just going off of what was said in the comment, it is entirely possible that the replacement text could actually be shorter than the original and nonetheless rank better because it is unique.

It's important not to read things into these kinds of statements that they are not in fact saying. Also, of course, keep in mind that the statements may not be accurate, but even if they are accurate, that only works if you understand what the statement actually means.
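
The difference between "more words" and "unique words" can be illustrated with a textbook near-duplicate measure: Jaccard similarity over word shingles. This is a minimal sketch only; the technique is a common one for comparing texts, not a confirmed part of Google's algorithm, and the function names, the shingle size `k=3`, and the sample strings are all assumptions made up for the example.

```python
def shingles(text, k=3):
    """Return the set of overlapping k-word sequences in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b, k=3):
    """Jaccard similarity of two texts' shingle sets (1.0 = identical sets)."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0  # Two texts too short to shingle are treated as identical.
    return len(sa & sb) / len(sa | sb)

# Made-up sample texts.
syndicated = "acme backup tool copies your files to a remote server every night"
padded = syndicated + " we like it a lot"          # same feed text plus a sentence
rewritten = "nightly off-site copies keep your files safe"  # shorter, but new

sim_padded = jaccard(syndicated, padded)
sim_rewritten = jaccard(syndicated, rewritten)
```

In this toy case the padded version still shares most of its shingles with the syndicated text, while the shorter rewritten version shares none, which matches the point that a shorter replacement can be more unique than a longer padded page.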

freejung




msg:4333299
 9:13 pm on Jun 30, 2011 (gmt 0)

Furthermore, all you have to do is bold a different part of the comment and the meaning becomes more apparent, and is also consistent with the OP:

Our algorithms don't count characters, but they try to find unique content in a sense. I wouldn't worry too much about the length, I'd just make sure that you have pages about your software on your site that do not contain just the content that you're syndicating via the XML file. Adding a sentence or two is one way to do this; even better would be to make it completely unique.

walkman




msg:4333310
 9:34 pm on Jun 30, 2011 (gmt 0)

freejung, until people that improved their sites start coming out of Panda en masse, it's a theoretical discussion. Possibly a dated one. We know what one G engineer said in 2010, we don't really know what Panda is about since all kinds of sites have been destroyed, not just article sites. Of course what applies to one site doesn't apply to the other, further muddying the water.

brinked




msg:4333313
 9:55 pm on Jun 30, 2011 (gmt 0)

I think we all need to understand what exactly unique content is. When google uses the word unique, I do not think they are referring to a non copied article/story. I think google is starting to favor sites that write original stories rather than write what everyone else is writing about. For example if Lindsay Lohan was arrested again, every site on the web will be writing their own take of the story. Each story will be "unique" and not a copy but the reader will generally get the same information from all of them.

Now a website that has content that nobody else has may start to be favored over the others who are just publishing their take of the same story circulating all over the media. A lot of webmasters blog about popular stories because they are just that "popular" and they want to get in on the action because there is a large audience for it.

I am not saying websites should stop writing about these stories, but rather focus on adding totally unique stories that nobody else will be writing about. What I am saying is, instead of following the trends, start your own.

Whitey




msg:4333322
 10:34 pm on Jun 30, 2011 (gmt 0)

but rather focus on adding totally unique stories that nobody else will be writing about

Good points, and I agree with others that sites with little unique content are ranking well. But how do you vary a product description or a common information subject when there are 1,000s doing the same thing? I can't imagine Google is enthusiastic with the same page titles either.

whatson




msg:4333354
 11:46 pm on Jun 30, 2011 (gmt 0)

So, Panda is likely not due to lack of content or excess of ads. That really only leaves duplicate content issues.

Duplicate content is copying content from another site, and let's face it, this is unfair and should be cracked down on. And from Google's perspective they want a mixture of content in their serps, not the same content from 10 different sites.
You all know if you have copied content from other sites or not. If you haven't corrected that yet, then you are not going to have any chance of being unpandalized.

Sure, there are the whole social popularity and user-behavior factors, but I personally don't buy into these - at least not as a Panda factor.

freejung




msg:4333366
 12:15 am on Jul 1, 2011 (gmt 0)

until people that improved their sites start coming out of Panda in masse, it's a theoretical discussion

True dat, I'm just quibbling about the meaning of the original comments.

I really don't think size is a direct factor, but it could be a context-dependent indirect factor if you are in a situation where you really ought to have more content on the page.

That really only leaves duplicate content issues.

Not really. It also leaves user behavior metrics, technical issues, semantic analysis, and lots of other possibilities. It is likely that the Panda algo was allowed to range over a very large multidimensional dataset. Sorry, but figuring it out is just not going to be that simple.

Having said that, I'm definitely working to enrich my content and resolve possible duplication issues, as it seems likely that those are important aspects of the algo.

brinked




msg:4333374
 12:42 am on Jul 1, 2011 (gmt 0)

So, Panda is likely not due to lack of content or excess of ads. That really only leaves duplicate content issues.


That's not true. Panda is not just about 1 thing. I think given the fact that many sites that were affected had an excessive amount of ads, plus google changing their adsense best practice guidelines as well as them pointing out in their blog post: [googlewebmastercentral.blogspot.com...]

"Does this article have an excessive amount of ads that distract from or interfere with the main content?"

I think it is very safe to say that sites that give priority positioning to ads over content will be at risk of getting hit by panda.

If you read their list of what makes a quality site, it pretty much lines up with my theory. Take a look at these points:

Does the site have duplicate, overlapping, or redundant articles on the same or similar topics with slightly different keyword variations?

----Lindsay Lohan was locked up how many times?

Are the topics driven by genuine interests of readers of the site, or does the site generate content by attempting to guess what might rank well in search engines?

----Are you producing articles that are all over the web already such as the latest gossip such as Lindsay Lohan getting arrested or are you writing articles about other celebs that don't get as much coverage?


Does the article provide original content or information, original reporting, original research, or original analysis?

----You read a news article or watched tv on a hot topic story. You decide to write about it because it's popular news such as Lindsay Lohan getting arrested for the 8th time. You do not present any new facts about the story, instead you point out the same facts as everyone else who has blogged about this story except for maybe throwing in your own personal opinion.


Does the page provide substantial value when compared to other pages in search results?

-----Is your Lindsay Lohan article really any better than the other 200 websites writing about Lindsay Lohan getting arrested?


Is the content mass-produced by or outsourced to a large number of creators, or spread across a large network of sites, so that individual pages or sites don’t get as much attention or care?

-----You're not the only one writing about Lindsay Lohan getting arrested.


Does this article contain insightful analysis or interesting information that is beyond obvious?

----Besides your opinion, do you point out any new facts about Lindsay Lohan's arrest that are not already found on all the other gossip columns?


So as we can see I think it's pretty clear that Lindsay Lohan is to blame for google needing to release panda.

walkman




msg:4333384
 1:27 am on Jul 1, 2011 (gmt 0)

Does the site have duplicate, overlapping, or redundant articles on the same or similar topics with slightly different keyword variations?

----How to pick your nose on Monday mornings. Related article: How to pick it on Tuesday afternoons.

Are the topics driven by genuine interests of readers of the site, or does the site generate content by attempting to guess what might rank well in search engines?

----eHow algorithm that outsmarted Google by seeing what people search for.


Does the article provide original content or information, original reporting, original research, or original analysis?

----$8 article that must be done in 10-15 minutes.

Does the page provide substantial value when compared to other pages in search results?

-----Self explained since G ads need to be clicked to find the answer.


Is the content mass-produced by or outsourced to a large number of creators, or spread across a large network of sites, so that individual pages or sites don't get as much attention or care?

-----eHow & co paying people peanuts to create crap


Does this article contain insightful analysis or interesting information that is beyond obvious?

----All the insight and info $8 and 10 minutes can buy.


So as we can see I think it's pretty clear that eHow and other true content farms are to blame for Google needing to release panda that caught so many other innocent sites. Ironically, eHow actually got a traffic increase after the first Panda, giving Google a black eye.

whatson




msg:4333396
 2:23 am on Jul 1, 2011 (gmt 0)

Does the site have duplicate, overlapping, or redundant articles on the same or similar topics with slightly different keyword variations?

- have you done your research by gathering content through other web sites?

Are the topics driven by genuine interests of readers of the site, or does the site generate content by attempting to guess what might rank well in search engines?

- give the visitors the content they want, which is what they are searching for.

Does the page provide substantial value when compared to other pages in search results?

- Make it the best, most useful web site you can think of. Become as much of an authority on something as you can.

Is the content mass-produced by or outsourced to a large number of creators, or spread across a large network of sites, so that individual pages or sites don't get as much attention or care?

- is the content creator a credible source on the subject?

Does this article contain insightful analysis or interesting information that is beyond obvious?

- Writing an article without using many other resources, preferably from first hand experience.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved