| This 33 message thread spans 2 pages: 33 (  2 ) > > || |
|Interesting little insight into What Google Trusts on a page|
This question/insight might be obvious to some... I've never thought about it until I started seeing it in the serps... and am starting to put the pieces together. Here it is:
First a little background, we were hit in Feb. Panda. And have never recovered meaningfully.
I've got all hand written content on our product site. (the quality is whatever it is... but the fact is that we never copied and pasted mfg descriptions like competitors). The site design is whatever it is also... our customers like it... most people may or may not be turned on by it.
Anyway, we introduced a new feature over the summer, a public Q & A section on the site... A way of giving consumers a public voice on the site and introducing a sense of community.
I've recently noticed that Google is ignoring the custom content in our descriptions (when they get snippets of the results in their serps) and displaying the customer questions in the snippets in the serps.
Old me would have said... whatever, it works right... we are in the results and I don't care what they put in the snippets... New, post panda me says: WTF, is google ignoring not only whole pages on the site...BUT, specifically ignoring sections of a page?
What I mean is: is google saying: the hand written content/product description you wrote (with the same exact string of key terms) is less important than the question the customer publicly asked on the page?
How this changes things for my Panda'd site:
If google is not trusting SECTIONS of a page... it explains why my deleting/removing ENTIRE (low quality) PAGES is not doing ANYTHING for me. Obviously if the above is true, even the higher quality content is being penalized because of its geographic location in the code of the site.
Assuming the above is correct it means: the only way to get rid of Panda is to make the potentially panda'd sections of the site more trustworthy... not just write more content in the sections that are already not trustworthy.
Anybody care to give their thoughts/experience on this thought?
An interesting observation that deserves consideration.
Question: Does your "hand written" content sound like a sales pitch?
Question 2: Do the customer comments sound "genuine"? (not constructed to sell a product).
If yes, yes, then I wonder if Panda is tending towards "ordinary speech" and is less likely to reward text that sounds like it was from professional ad writers in the sales department?
@ Reno - Interesting questions. The descriptions are definitely designed to sell the product. It would be relatively easy for Google to design their algo to discern the type of content and assign a sort of quality indicator to it, if they wanted.
The questions are not contrived... they are natural questions from an interested public. The users tend to:
1. Not use proper grammar.
2. Capitalization etc (out the window)
3. Short / non descriptive (generally)
I'm thinking that Panda may have taken the description section of the site and said: "we don't trust the content in this section of the website... let's tell the bots to 'ignore' this section of their site and see what happens."
In this way, if the site doesn't have diversification of content throughout deep sections on the page... or the website as a whole, the site is stripped of its core content. In fact Panda could be applied to everyone's site... not just a handful of people. It's just that we were hit harder because we were less diversified.
Think of ehow... or the other "targets" of Panda. They are generally made up of a content section (notice the absence of the S -plural-) and then a bunch of ads and links to other pages with LOTS of GREAT UNIQUE content. Google engineers ask themselves: if I take that "section" out of their site... what happens? Does the site have enough diversification of content to maintain rankings? Or does the site look empty?
Sites that do well have:
1. Multiple small sections
a. Main content
b. Secondary content
c. Tertiary content
5. etc. etc. etc.
So if you did what we did, and loaded up ONE section with tons of GREAT content without the diversification... and then Google goes in and says, well WE LOVE this site because of this SECTION, let's remove it and see what happens to this site... And if the site fails to provide enough UNIQUE content in other sections... well, that was PROBABLY a shallow site...
In other words they are hedging their bet. They figure they can't be sure about any one piece of information on your site. It could be duplicated/spun/nonsense... Or it could be great. So if you have multiple sections on the page that re-inforce the greatness of your content, you are buying insurance for your core content.
What it means is I could write the most beautiful content about my products until I'm blue in the face... BUT, if I want to rank I'll have to back that content up with new sections of content, that are unique and good quality.
I had the same observation about a month ago when Content_ed posted the article about Panda and Alexa rank (I spoke with him in IM, didn't want to throw blind shots here). It's called content zoning: where and what content is positioned on your page might influence your content scoring. This has been out for a long, long time, but since the 1st Panda it keeps getting more and more importance. I am 101% sure this is a huge game changer for the majority of the pandalized websites.
I have to ask this (just because hopefully it MIGHT save you some time):
Have you checked in copyscape to make sure that no one is using your content somewhere else? Are you sure that no one scraped all your handwritten content?
@Donna I think I'm a panda victim from content zoning. From what I see big g is critical on the top 400 px of the page, no content in that space and "you have a problem."
Working on a redesign idea as we speak.
Pjman, not sure if it's 400px or not, but it has to do something with common position of reviews, inner page snippets, html tags and content overall. A massive amount of the pandalized sites have this in common and not dupe content or low quality content, well you can call it low quality content in this case based on the position and the randomness of it.
Having read this thread I'm left with two questions.
How are these comments marked up in HTML?
Could newness be important?
@ Donna, I've respected your SEO/Webmastering advice for some time now. Thanks for weighing in here. As far as zoning goes, I think it's less important. Zoning by itself would be too easily gamed and does nothing but teach webmasters how to make a more uniform web. I believe that Google's vision is a better web... not a uniform web.
@ Planet, we are a small player in a specific niche, I have not run Copyscape... going to do that now.
@ Hissingsid, good to see you here sid! I think that freshness does have a factor... ALTHOUGH, I have to say that it would seem that Google is actually repressing the content that we labored over.
Google has taken the top content of every site and said, "Voilà, top content on every page on the web is now demoted. How does the page fare?"
If it can stand on its secondary/tertiary/commentary/links/etc etc then it is most likely a deep/interesting/well rounded web page. If the page (after demoting the top content) falls to oblivion... "good riddance, we don't want shallow pages anyway..."
Steps they'd have taken:
1. Identify groups of content on a page (easy)
2. Grade the individual content sections per their algo (easy)
3. Demote the top graded content piece
4. If a website lost the majority of their unique content due to the demotion of the best content, that site is panda'd.
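To make the four steps above concrete, here is a minimal sketch of the theory as stated in this post. It is NOT Google's algorithm, nobody in this thread knows that. The function names, the unique-word-count "grade" and the 0.5 threshold are all made-up assumptions for illustration only.

```python
def unique_words(text):
    """Return the set of lowercased words in a block of text."""
    return set(text.lower().split())


def survives_demotion(sections, threshold=0.5):
    """Hypothetical check: does a page keep enough unique content
    once its best section is taken away?

    sections:  list of text blocks already identified on a page (step 1).
    threshold: assumed cutoff for "lost the majority of unique content".
    """
    if not sections:
        return False
    # Step 2: grade each section. Unique word count is only a stand-in score.
    scores = [len(unique_words(s)) for s in sections]
    # Step 3: demote (here: simply drop) the top graded section.
    top = scores.index(max(scores))
    remaining = [s for i, s in enumerate(sections) if i != top]
    # Step 4: compare the page's unique content before and after.
    before = set().union(*(unique_words(s) for s in sections))
    after = set().union(*(unique_words(s) for s in remaining))
    if not before:
        return False
    return len(after) / len(before) >= threshold
```

Under this toy model, a page with one big description and a two-word "buy now" block fails, while a page with three modest sections passes, which is the "diversification" point made above.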
WHICH IS WHY SO FEW SITES RECOVER:
We are busy fixing our already GREAT content... we didn't/don't realize it's not about our great content... it's about a lack of additional mediocre content. (I say mediocre... what I mean is no matter how good a singular piece of content is on the site... it will always be flagged/demoted... therefore Google is grading you based on your less than stellar content...)
Put another way, it's like the bell curve. The professor has just X'd out the hardest question... the question that carried the most weight (the one that everyone, everyone except you, got wrong). He is now grading the test based on the answers that you didn't pay attention to, because you knew you had gotten the correct answer on the big kahuna question... (the question you had studied for...)
This is just a theory... I don't pretend to know anything other than what I read and analyze on my own site.
While I think parts of this discussion have some merit, the rest of it sounds like a desperate struggle to make sense of Panda... grasping at straws. It is highly unlikely any search engine will ignore the "content" when it is content they wish to serve to their visitors. I would have to see much more compelling data to make me change that point of view.
Duplicate content or redundant content makes perfect sense for demotion in serps. This, I believe, remains the biggest problem for Pandalized sites.
@Lenny2 Thanks :)
btw as I said :
|A massive amount of the pandalized sites have this in common and not dupe content or low quality content, well you can call it low quality content in this case based on the position and the randomness of it. |
Your point is valid from my observations. The problem is the rate of Panda updates and the changes with each one; it's pretty much impossible to pinpoint the issue. You might have fixed it for the 1.0 version, but the 1.1 would break something else, and so on. It's a cat and mouse game with Google as usual. And then you get the other algo tweaks and it becomes a huge mess.
Webmasters need clearer guidelines, which will not happen anytime soon. They should promote the guidelines instead of the penalties and filtering. These days it's not about how to rank but how not to get penalized!
Regarding these ranking comments...do they use a standard convention? eg They're wordpress comments, drupal comments, etc...and have the classes/ids typically expected to accompany these types of comments?
Big question is: how does google know this is a comment? Cutts was asked about acceptable nav-bars once, and he made the comment that as long as you stick with one of the major ones, you'll probably be fine, which to me suggests google profiles some of the more popular CMSes and may know based on the markup profile that a comment is a comment.
I suppose the other thing is perhaps google tracks the age of different sections of a page. So if the top section never changes...but additional stuff gets added at the bottom it may be considered more fresh and trump the content at the top.
|These days it's not about how to rank but how not to get penalized! |
That depends on whether you consider Panda to be a ranking factor or a penalty measure. I've made this distinction several times since last February, and for me it is NOT a mere nit-pick.
In my view, Panda functions as a ranking factor, a part of the overall algorithm. To consider it a penalty (which many people have) can blind us to an effective approach. That is almost always to do something more, something proactive - rather than merely to eliminate something that is "causing a penalty."
Just had a little lol after finding this twitt [twitpic.com ]. Another good example.
No wonder it's harder to rank on Goog these days when there's something that weird going on in the algo.
|Just had a little lol after finding this twitt [twitpic.com ]. Another good example. |
As an English person myself, I think that's a pretty good definition.
As far as eCommerce goes with Panda penalizations, has anyone tested the theory of larger image sizes on the page?
We display thumbnails that link to the larger image instead of making the page heavier by having larger image sizes displayed when the page loads. I wonder if this could be some sort of quality signal?
@ Bewenched Before we got panda'd our images were the biggest in the industry... In fact we reduced the size drastically as one of our "fixes" for Panda.
I noticed something for the first time today that I might have missed if you hadn't mentioned page areas. I was filing DMCA complaints for a moderately long page of mine, and as I went through it picking a unique phrase from each paragraph to check, my page blinked in and out of the Google results. When it was out, not even the "show additional results that are similar" brought it up.
The difference was this. The phrases for which my page failed to appear were phrases from paragraphs that had been copied into relatively short articles that were then syndicated! And due to the fact that only a few scattered paragraphs on my page were deemed juicy enough for the particular black hat link building effort, the result is that on a single page of mine, there are sections for which Google considers me the undisputed source, and sections for which they think the page doesn't even rate a second chance appearance.
In, out, in, out. First time I ever noticed that for a single page.
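For anyone who wants to repeat the paragraph-by-paragraph check described above on their own pages, here is a rough sketch that only builds the quoted search phrases. You still paste them into a search box by hand. The helper name, the "first words of the longest sentence" rule and the blank-line paragraph split are my assumptions, not the method used in the post.

```python
import re


def probe_phrases(page_text, words_per_phrase=8):
    """Pick one exact-match probe phrase from each paragraph.

    Assumes paragraphs are separated by blank lines. The phrase is the
    first `words_per_phrase` words of the paragraph's longest sentence,
    wrapped in double quotes for an exact-phrase search.
    """
    phrases = []
    for para in re.split(r"\n\s*\n", page_text.strip()):
        # Naive sentence split on ., ! or ? followed by whitespace.
        sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", para) if s.strip()]
        if not sentences:
            continue
        longest = max(sentences, key=lambda s: len(s.split()))
        words = longest.split()[:words_per_phrase]
        phrases.append('"' + " ".join(words) + '"')
    return phrases
```

Searching each phrase and noting which ones bring up your page gives a per-paragraph "in / out" list like the one described in the post.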
@Content, that is what I'm talking about. Though I don't think it has to do with the position. It's got to do with Google not trusting the Content in that section.
I didn't mean it had something to do with where it was counting paragraphs up or down, I meant that the sections they didn't trust were EXACTLY the sections that were stolen and syndicated.
I think you are reading too much into it. From my experience your results are not uniform enough to warrant the guess you made or any other guess anyone would make as to what changes Panda did.
The answer is really simple - thanks to Panda Google is now broken.
If the many search examples floating around the internet (including a few good ones on this forum) are not example enough that Google is not working right since Panda, consider this experience of mine:
Less than 2 years ago I created a website to test something. This website consists of exactly 2 pages. That's it, just two pages with unique content, but nothing really special. Post-Panda this website is ranking in the top 50 for an extremely competitive phrase returning over 100 million results. How competitive? I am now getting actual traffic (although relatively small compared to the potential of page 1-2 ranking), i.e. real people visiting the website, not webmasters or bots.
How did this happen? I don't know...I've been looking into every possibility in the past few months trying to figure out how in the world did this happen and there is just no answer:
1.) the website has only two pages
2.) the layout of the website is a simple table, not even a logo :)
3.) absolutely no black-hat (or white hat) techniques
4.) there are some links to it, which I acquired mainly to ensure that the website will be found by Gbot. Compared to the competition in that niche, it's the equivalent of a drop of water in an ocean.
5.) once the website was up - nothing had been changed ever.
6.) content is unique but generic; nothing really special.
Let's face it, this website shouldn't even show in the top 1000. This tells me one thing - Google is broken and it's broken really badly. I personally am taking the "wait and see" approach with my Panda websites and believe that after the Holiday season there would be major changes by Google trying to fix the damage done by Panda. There must be, I just don't see how Google could continue serving this type of results in the future...
So I changed the first sentence of each paragraph on my page where different sections were stolen and widely syndicated a couple years ago. Also added one new paragraph to the top.
The next day, the page only appears in Google search for exact phrases from the new paragraph or the new sentences. For quotes from the rest of the page, Google mainly comes up with pages that no longer exist because I've DMCA'd them out of existence.
In other words, the massive syndication of the stolen bits combined with the usual number of whole page copyright infringements convinced Google that my page was some kind of mash-up or scraper, despite the fact it had hundreds of incoming links from the days before Google hated it.
@Content_ed, check my reply to your post. So far I am out of the pandabox for 5 days now, let's see if it lasts.
@Content; so if it's not about the content... and it's about the impression that your site is a mash-up/copy infringing site... How do you improve your content to get out of Panda? Seems like no matter how much great content you put out there, you'd be running into the same problem, right?
I think a big part of it goes into sites being repetitive in their pages. For example, if your site's main focus is widgets, you are going to intentionally (or in a lot of cases unintentionally) try to make sure the word widgets is mentioned anywhere possible. So then you have all your pages describing a different kind of widget. Well ok... now google may see this as redundant content. If your site is about widgets, it automatically assumes your article talking about blue widgets is about a type of widget. The word widget does not need to be mentioned, as it just starts to become assumed that it is talking about widgets because everyone including G knows this site only talks about widgets.
I do notice in several industries, the sites that received a major boost had sections on their site completely different than that mentioned on the home page. Google can see your homepage and right away see what your main focus is. If your home page title is labeled "Widgets" then G can then see if you are trying a bit too hard to rank every which way for widgets. But now if you also have an entire section devoted to tidbits and then an entire section devoted to ridgets you now have a diverse website filled with content that is not only related to widgets. G may see you did not place the words ridgets and tidbits in your homepage title so you created those sections for the sole purpose of it being useful to your visitors and not G.
Tidbits and Ridgets may not be widgets, but they may complement widgets, and users who are into widgets are probably into ridgets and tidbits. A lot of websites that try to game the algo usually have only one focus. There is a lot of money in widgets, so the webmaster has a motive to want to rank high for widgets. There is no money in ridgets and tidbits, so the webmaster's motive is to provide useful content to visitors, not profit off high paying industries such as widgets.
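If you want a number for how "one focus" your own site looks, a quick sketch like this measures what share of pages mention the main keyword. This is only a way to look at your own pages; there is no claim here that Google computes anything like it. The function name and the dict-of-pages input format are assumptions.

```python
def keyword_page_share(pages, keyword):
    """Fraction of pages whose text mentions `keyword` at least once.

    pages:   dict mapping a page name to its text (hypothetical format).
    keyword: the site's main term, matched case-insensitively as a substring.
    """
    if not pages:
        return 0.0
    needle = keyword.lower()
    hits = sum(1 for text in pages.values() if needle in text.lower())
    return hits / len(pages)
```

A share close to 1.0 would be the "widgets on every page" pattern from the post; a site with real ridgets and tidbits sections would come out lower.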
@brinked , young sir you are very right in your findings. I have been thinking about this since 2009 . It's a viable approach . One that proves why so many junk sites rank so high.
|How do you improve your content to get out of Panda? |
I can't "improve" the pages, they are publication quality, that's what led people to steal so much from them. In the case I was talking about above, which is one of the pages I 301'd to my "good" site, I decided to bite the bullet last night and spend three intense hours rewriting the entire page.
Because the page involves technical procedural steps closely tied to images, it was possible to reword them all, something I couldn't have done in two days if it had been a prose essay of that length. In fact, I don't think I could reword an essay of my own, would have to start with a blank page.
Every site differs. There is ZERO repetition on the Pandalized site in question. Subjects are broken down logically, like books, and each page is essentially a chapter.
Google doesn't consider the use of a widget word repetition. Amazon is full of the word "book", the word "dvd", etc. But as I wrote elsewhere, I am wondering if the fact that none of the most common words for the site appeared on the most popular pages is now being used as a quality signal by Panda. It never used to matter. If that is the case, it's an easy fix by breaking up sites and moving content around, should be obvious by next month with some more experiments.
And Google would never know our main focus from the home page, it would only learn what the bulk of our pages were about. Those subjects (think my Civil War Reenactment example) are simply much, much less popular than the tiny section on nuclear power generation that gets all the links and traffic.
Never mattered pre-Panda. Our traffic all came from deep links to popular pages, not via the home page. If it matters now, so be it, but it's got nothing to do with quality.
I think you misunderstood my post.
Amazon offers a wide range of products so it is completely irrelevant here and goes against what I was saying. Unless every page on amazon has the word "book" and/or "dvd" in it, it doesn't fit what I was saying at all. I was talking about specialty sites that have the same main word or phrase in pretty much every page of the site.
The point I am trying to make is that there is certain behavior webmasters follow when they try too hard to rank for a wide range of keywords under the same spectrum. I am talking about a balance in content. How your site is broken down has nothing to do with my statements.
To say your site has zero repetition is silly as every site has some kind of repetition. If your site is using a server side script such as php or .NET then there is definitely some sort of repetition. Regardless, that is not the point I was trying to make.
I also do not claim that anything is cut and dried. There is no single factor that defines panda, there is much much more that goes into it than that. It is all about trying to see what exactly panda is seeing. The quality of your site is irrelevant. Your site was hit by panda because google saw something on your site that made it determine it was not high quality. Your site might be the single best, highest quality site on the net, but it does not matter if google does not think so.
It is no longer about what we consider to be high quality. Think of the SEO game as being a student in school. Your project may be brilliant, it may win you fame and fortune in the real world. However, if your teacher thinks it is crap then none of that matters. Google is the teacher and there is nothing we students can do about it but impress our teacher.
When you can stop talking about how great your site is and start looking at the real flaws of your site, then you will have a better idea of what direction to go in. I don't care how great any given authority site is, every site has flaws and every site can be improved in some form. Websites are made by people and nobody is perfect.
I have been asked many times by webmasters to review their site, some from this very community. They all come to me saying how great their site is and how their site should have never been punished. In every single instance I found many flaws. Sometimes webmasters are reluctant to take my advice because they are stubborn in their ways. This is no longer about how many SEO articles you read or how long you have been in this game. The game is always changing so you can chase the algo or you can actually look at your site with an open mind and try to pinpoint its actual flaws and improve them. Even if you do not recover, improving your site can't do much harm, especially if you're already being punished by google.
Whether we were talking about different things or not, I don't see any data to back your theory.
On repetition, I'm talking about what would be perceived as repetition, not a "home" link appearing on every page. There are websites that pick a theme and beat it to death with pages that differ only in the choice of title and the particular keywords they are written around, when all are really the same subject/formula. That's something Panda should have gone after, and might have to some extent. It's not us.
I don't think either of us believes that Panda targets, much less understands, quality. But it's impossible to have a discussion about websites without referring to quality. When it's not in quotes, it's the plain meaning of the word as understood by humans, quality.
People who tell me to look for flaws are just talking to themselves. No CMS, no script (other than Analytics), pure, simple HTML. If you want to call that a flaw, knock yourself out. Amazon and Wikipedia are full of flaws. The only flaws I'm interested in today are Panda flaws, and if I can detect them and adapt without compromising on the quality of my work, I will.
I don't need any more webmaster reviews, thank you. It's not about SEO with us, it's about quality content. Not everybody is interested in reading SEO articles, I couldn't name a single SEO blog unless you count Cutts, and I rarely look at that.
I do read the posts here when I'm actively working on Panda, and it seems to me that there are a limited number of people sharing data and a lot of people sharing hypotheses. And discussions like the one you and I are having can't help anybody:-)