
Google SEO News and Discussion Forum

    
What is "poor content" by Google definition?
Dantes100




msg:4347680
 11:55 am on Aug 4, 2011 (gmt 0)

I'm trying to identify the "poor" content on my site to either rewrite or noindex it.

But how do I do that? How do I know what Google considers as "poor"?

My idea is to monitor which keywords lost most in ranking after the pandalization and identify the corresponding pages.
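
Roughly what I have in mind, as a sketch only - the CSV file names and the "keyword", "landing_page" and "position" columns below are just placeholders for whatever a rank tracker or Webmaster Tools export would actually provide:

# Sketch only: file names and column headings are placeholder assumptions.
import csv
from collections import defaultdict

def load_rankings(path):
    """Map (keyword, landing page) -> ranking position from a CSV export."""
    with open(path, newline="") as f:
        return {(row["keyword"], row["landing_page"]): int(row["position"])
                for row in csv.DictReader(f)}

before = load_rankings("rankings_before_panda.csv")
after = load_rankings("rankings_after_panda.csv")

# Sum the positions lost per landing page; treat keywords that dropped out
# of the report entirely as having fallen to position 100.
lost_per_page = defaultdict(int)
for (keyword, page), old_pos in before.items():
    new_pos = after.get((keyword, page), 100)
    if new_pos > old_pos:
        lost_per_page[page] += new_pos - old_pos

# The pages whose keywords lost the most positions are the first candidates
# to rewrite or noindex.
for page, lost in sorted(lost_per_page.items(), key=lambda x: -x[1])[:20]:
    print(page, "lost", lost, "positions in total")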

What do you think?

Thanks,
Eddie

 

tedster




msg:4347702
 1:11 pm on Aug 4, 2011 (gmt 0)

I think that's an excellent step to take - it's quantitative and data-based, rather than merely qualitative or subjective opinion.

However, in some cases we've heard about here, because of the "contamination" effect (pages with a poor rating also drag down their neighbors), these quantitative results are not always clear.

I've recently been re-reading the in-depth article from Amit Singhal that goes into the qualitative judgment that needs to accompany the pure data-driven analysis. Here are just a few of his bullet point questions:

  • Does the site have duplicate, overlapping, or redundant articles on the same or similar topics with slightly different keyword variations?
  • Would you be comfortable giving your credit card information to this site?
  • Does this article have spelling, stylistic, or factual errors?
  • Are the topics driven by genuine interests of readers of the site, or does the site generate content by attempting to guess what might rank well in search engines?

    [googlewebmastercentral.blogspot.com...]

    jmccormac




    msg:4347710
     1:23 pm on Aug 4, 2011 (gmt 0)

    And now for a contrarian view: What the hell would Google know about creating a high quality website when all it does is scrape and repackage the work of others? Trying to figure out what mangling Google did with its index is wasting time. Concentrate on generating high quality content that users will like and pass on to others. Make sure your site is sticky enough in terms of interest that users will want to return.

    Regards...jmcc

    tedster




    msg:4347716
     1:28 pm on Aug 4, 2011 (gmt 0)

    That doesn't sound contrarian to me at all - it sounds exactly like the fourth bullet point I quoted above.

    And I'd say you're right - don't try to "chase the algo." But do notice when the algo bites you and get serious. I think getting an outside opinion can be very useful. Some people have been following "SEO formulas" for so long they've lost touch with reader-focused content.

    Rasputin




    msg:4347730
     1:56 pm on Aug 4, 2011 (gmt 0)

    Since a page being penalised doesn't necessarily mean there is a problem with that page, but rather somewhere within the site as a whole, I'm not convinced that a keyword-based approach is the most efficient.

    My own approach is to run reports that identify pages with fewer than 250 words, and also pages which have both a high bounce rate ('high' being relative to other pages on the same site) AND a short 'average time on page' (I doubt if bounce rate alone is sufficient), and then to completely rewrite those pages.
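
    A rough sketch of the kind of report I mean - the file names, column headings and the 30-second cut-off are placeholders, not my actual setup:

# Sketch only: assumes a crawler export with per-URL word counts and an
# analytics export with bounce rate (0-1) and average time on page (seconds).
import csv
import statistics

def read_rows(path):
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

word_counts = {r["url"]: int(r["word_count"])
               for r in read_rows("crawl_word_counts.csv")}
engagement = {r["url"]: (float(r["bounce_rate"]), float(r["avg_time_on_page"]))
              for r in read_rows("analytics_pages.csv")}

# "High" bounce rate is relative to the rest of the site, so compare each
# page to the site-wide median rather than a fixed number.
median_bounce = statistics.median(b for b, _ in engagement.values())

for url, words in word_counts.items():
    bounce, avg_time = engagement.get(url, (0.0, 0.0))
    thin = words < 250
    weak = bounce > median_bounce and avg_time < 30
    if thin or weak:
        print("REWRITE CANDIDATE:", url, "-", words, "words,",
              round(bounce * 100), "% bounce,", round(avg_time), "s on page")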

    I'm taking this approach for the simple reason that pages still doing well on my site tend to have a lot of content, a low bounce rate, and a reasonably high time on site. But I've only just started and it's a long task (3-6 months to rewrite them all properly) so I can't say if it will work yet.

    After three months of self-denial about having poor quality pages, I am now coming round to the idea that some of these pages, perhaps 20% of the site, might not be the best quality after all. I think I forgot how little effort I sometimes made a few years ago in an effort to get lots of pages online (at that time they ranked well with little effort due to site authority), and those pages are still there.

    jmccormac




    msg:4347732
     2:09 pm on Aug 4, 2011 (gmt 0)

    Possibly, but the important point that is not in bullet #4 is that users will begin to generate traffic themselves.

    As part of my work, I measure and categorise web usage over approximately 800K websites each month (about 300K Irish hosted websites and 500K .co websites), and the number of deserted cookie-cutter sites is rising. Most of them are immediately flagged as duplicate content by the algorithms. People still forget to change the title text on their blogs, so the default "Welcome To The Front Page" or the WP equivalent is still showing. Real content is getting more difficult to find.
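
    A crude illustration of the kind of flagging involved - just a sketch, with example default titles, not my actual survey code:

# Illustration only: a couple of well-known default titles/taglines that
# betray an unconfigured, cookie-cutter install.
DEFAULT_TITLES = {
    "welcome to the front page",
    "just another wordpress site",
    "home page",
}

def looks_like_cookie_cutter(title):
    """Flag a site whose <title> or tagline is still a CMS default."""
    return title.strip().lower() in DEFAULT_TITLES

print(looks_like_cookie_cutter("Just another WordPress site"))  # True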

    Google is weak on ccTLD domains in that it has to rely on its blind crawling to detect new websites. It has probably integrated data from Adsense and Analytics but most new websites when they appear have no inbound links and are invisible to search engines. There's also a strange trend in sites dropping keyword metadata and concentrating on the description meta data. Some of this might be down to the rumour that Google no longer uses keywords in its algorithms but it could be a lost opportunity for site owners.

    I agree with the outside opinion. This is why books have editors and newspapers sub-editors. Even asking some friends to look over the content might show up errors.

    Having glanced over that article again, I now know why Google has Wikipedia pages so high in its SERPS. :)

    Regards...jmcc

    Planet13




    msg:4347805
     4:51 pm on Aug 4, 2011 (gmt 0)

    Does the site have duplicate, overlapping, or redundant articles on the same or similar topics with slightly different keyword variations?


    I have a big problem with this when it comes to a site like ehow still continuing to rank well for so many terms. They often have nearly identical content on several pages, with just minor modifications to some of the keywords or the order in which the information is presented simply rearranged.

    I mean, an ehow page that cites (and takes most of its content from) another page will often outrank that page, despite the fact that the other page may also be cited by other sources (Wikipedia, university pages).

    So I think it is important to realize that OUR definition of "overlapping" or "redundant" might not be the same as google's definition.

    aristotle




    msg:4347808
     4:56 pm on Aug 4, 2011 (gmt 0)

    Dantes100 -
    In my opinion "poor content" isn't the real issue. Instead, I think that Panda was mainly designed to target sites with "medium-quality content", which is what you find on most article sites and content farms. Here is a summary of how I would break it down:

    -- LOW QUALITY and SPAM. There is a lot of this on the web, but the Google algorithm had already demoted most of it BEFORE Panda was implemented.

    -- MEDIUM-QUALITY CONTENT. There is also a lot of this on the web. I believe that this is Panda's main target, because Google doesn't want medium-quality at the top of the SERPs for high-traffic search terms. So Panda is designed to push it down to page 2 or even lower. This is why content farms and sites with similar characteristics were affected. Because their content is only medium-quality, Google doesn't think it deserves to be at the top of the SERPs for competitive terms.

    -- HIGH QUALITY CONTENT. In my opinion there isn't much high quality content on the web, probably less than 1% of all pages. This is because most people aren't willing to do the research and take the time to produce true quality. I think most high-quality pages escaped Panda, partly because they usually aren't found on content farms and similar sites.

    So Dantes100, for these reasons, I don't think you should use the term "poor content" when discussing Panda.

    outland88




    msg:4347915
     7:53 pm on Aug 4, 2011 (gmt 0)

    Does the site have duplicate, overlapping, or redundant articles on the same or similar topics with slightly different keyword variations?


    What Singhal isn't mentioning is that Google is taking this to more of an extreme than people imagine. If some of your pages even graze upon the subject matter of another page, the tallied points against you add up. That's the reason E-How is making a recovery now. They're eliminating low-traffic pages and pages dealing with a single subject too much. Plus they're killing a lot of internal interlinking. The problem, though, is sites that deal in one subject. It's very easy for, let's say, an acne products site to overlap content unless Google gives it a brand pass.

    And now for a contrarian view: What the hell would Google know about creating a high quality website


    You gotta love this comment because it's so true. You know this after a few years in Adwords. Simplistically, all Google is doing with Panda is recognizing what it feels is duplicate content and devaluing sites. As Aristotle says, it does seem to target mid-level sites. The problem is that weaker content emerges because the site owner usually has so little knowledge of the subject that they can't repeat themselves. I also love Tedster's explanation of it at one time as "the thinnest of gruel".

    Planet13




    msg:4347949
     9:10 pm on Aug 4, 2011 (gmt 0)

    That's the reason E-How is making a recovery now.

    I did not realize they were hit by Panda 2. I remember them sailing through Panda 1, but I guess Google laid the smack down with Panda Deux.

    whatson




    msg:4347958
     9:32 pm on Aug 4, 2011 (gmt 0)

    I don't think they did get hit, how do you know that?

    Planet13




    msg:4347961
     9:40 pm on Aug 4, 2011 (gmt 0)

    I don't think they did get hit, how do you know that?


    I just did a search on "did panda affect ehow" and found a few sites showing the Sistrix data chart.

    They all mention Panda 2, so I don't know if subsequent updates affected them or not.

    outland88




    msg:4347964
     9:48 pm on Aug 4, 2011 (gmt 0)

    I thought many already knew that. [searchengineland.com...]

    It's a little obvious these fellows knew how to fix things with Panda and are coming back rather well. It's very good work.

    [edited by: outland88 at 9:54 pm (utc) on Aug 4, 2011]

    suggy




    msg:4347966
     9:48 pm on Aug 4, 2011 (gmt 0)

    aristotle -- you missed one...


    NO QUALITY CONTENT
    Gives blank pages a free ride to the top of the SERPS because there's nothing to find fault with!

    I have seen this several times in my niche, including:-

    Gone-aways -- pages that literally have just the standard navigation, a title, and a big empty space with a notice "We are currently relaunching our store" front and center. One such example I know of is ranking very well, and I know they were told by the supplier to pull the whole store down back in Feb.

    Stub pages - from big brands. Again nothing meaningful to see.

    Abandoned exact match domains with two sentences on a page and a big Adsense block.

    Lazy product pages, with just the product name, image, price and add to cart. The owners couldn't be arsed to write any content and, why bother, when it just gets you pandalised?!

    What bugs me most about Panda is that no quality isn't seen as low quality!

    outland88




    msg:4347972
     10:06 pm on Aug 4, 2011 (gmt 0)

    I knew that was happening, Suggy. Last night I saw three pages like you mention and said to myself, you've got to be kidding. I then said to myself, if I add 50 of these pages with basically no content to be criticized, can I raise my quality score? Nah, that would be too easy.

    Lapizuli




    msg:4347973
     10:09 pm on Aug 4, 2011 (gmt 0)

    Yes, now that you mention it, suggy, I've noticed that kind of non-content in the last week or so on my personal searches, and outland88, me, too - I was totally indignant.

    whatson




    msg:4347974
     10:12 pm on Aug 4, 2011 (gmt 0)

    Oh yeah, I remember now, they did get hit badly. 2.3 got them again as well. Ouch!

    CainIV




    msg:4348109
     5:25 am on Aug 5, 2011 (gmt 0)

    Are the topics driven by genuine interests of readers of the site, or does the site generate content by attempting to guess what might rank well in search engines?


    How would Google understand or know this?

    And would genuine users want to know more about general topics that one would essentially find doing genuine keyword research?

    whatson




    msg:4348124
     7:02 am on Aug 5, 2011 (gmt 0)

    I think it's under the algorithm "where there is smoke there is fire"

    littlegiant




    msg:4348179
     11:56 am on Aug 5, 2011 (gmt 0)

    Some of this might be down to the rumour that Google no longer uses keywords in its algorithms but it could be a lost opportunity for site owners.


    What? Wait a minute... I thought it was actual fact that Google no longer uses meta keywords in its search algorithms. Matt Cutts even said so in 2009. Have things changed since then? A lot of my newer pages have no meta keywords on account of that announcement.

    jmccormac




    msg:4348206
     12:47 pm on Aug 5, 2011 (gmt 0)

    Well that's what he said then. It might be worth asking him if Google's policy has changed. Perhaps it hasn't.

    Regards...jmcc

    [edited by: jmccormac at 12:52 pm (utc) on Aug 5, 2011]

    suggy




    msg:4348207
     12:50 pm on Aug 5, 2011 (gmt 0)

    littlegiant

    I think this refers not to meta name=keywords but more to the belief that search / Google's document retrieval systems don't really operate on basic keywords anymore; phrases and n-grams are more important.

    Tedster can explain it better

    littlegiant




    msg:4348210
     12:57 pm on Aug 5, 2011 (gmt 0)

    @suggy,

    Woops. Sorry. Maybe I didn't quote enough of the comment...

    There's also a strange trend in sites dropping keyword metadata and concentrating on the description meta data. Some of this might be down to the rumour that Google no longer uses keywords in its algorithms but it could be a lost opportunity for site owners.


    Is he not referring to the meta keywords tag/attribute?

    jmccormac




    msg:4348215
     1:06 pm on Aug 5, 2011 (gmt 0)

    It was the keywords metadata that I was referring to, Suggy.
    Google's algorithm seems to be relatively complex, but Cutts said that the keywords metadata tag had been gamed/abused and recommended that people use the description metadata instead. The content keywords issue is a different thing. Being utterly cynical about it, what better way would there be to stop it being gamed than to tell SEO people that Google is not using keyword metadata? Then perhaps they don't.

    What I am seeing in the surveys is a trend where the keyword metadata entries are being dropped entirely on some sites.

    Regards...jmcc

    suggy




    msg:4348326
     5:19 pm on Aug 5, 2011 (gmt 0)

    Sorry chaps. Skim reading. Understood now. However, I cannot imagine that Google places more than the tiniest weight on meta keywords, since it is so easily abused (because it doesn't show on the page) and so widely abused.

    I have heard claims, though, of pages ranking where the term only appeared in the meta keywords. However, that might be coincidental and just prove Google is better at relating searches semantically, which we know it is doing more and more.

    tedster




    msg:4348331
     5:39 pm on Aug 5, 2011 (gmt 0)

    some of this might be down to the rumour that Google no longer uses keywords in its algorithms

    It isn't a rumor - it's a long-established fact, and it's not just Google. Going from memory, I think the first official word about not using meta keywords was back in 2002 or 2003.

    Right around that same time, the H1 tag was also devalued because of abuse (this was announced at a PubCon in London). If webmasters abuse something and it can't be depended on as a relevance signal, then Google won't use it. IMO the H1 tag is now a useful signal again, but it's nothing like it used to be.

    Since those days in 2002 and 2003, so much has changed in the ranking algorithm. It's become very complex, and anyone still thinking in old school terms really should start playing catch up.

    The phrase-based indexing patents [webmasterworld.com] took Google light years beyond text-match indexing. You may have noticed that exact text-match searching is not even very dependable these days.
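
    As an illustration only of the general idea - a toy sketch, nothing like Google's actual implementation - phrase-based indexing keys documents by the multi-word phrases they contain rather than by single terms:

# Toy sketch of the idea behind phrase-based indexing: build an inverted
# index of word bigrams (two-word phrases) instead of single keywords.
from collections import defaultdict

def phrases(text, n=2):
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

docs = {
    "page1": "phrase based indexing patents",
    "page2": "exact text match searching",
}

index = defaultdict(set)
for doc_id, text in docs.items():
    for phrase in phrases(text):
        index[phrase].add(doc_id)

print(index["based indexing"])  # {'page1'}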
