
Google SEO News and Discussion Forum

Panda key algo changes summarized
pontifex
msg:4289430, 10:18 am on Mar 30, 2011 (gmt 0)

Folks, I have been reading a lot, thinking a lot and analyzing a lot. I am still not sure how to get the US traffic back to pre-February-24th levels! But I think it is time to summarize the key theories about the algo change in the US:

- Internal links devalued; only external links really count

- Thin pages cause substantially bigger problems for a domain

- Duplicate content snippets on your page cause substantially bigger problems

- Too many external links with keyword anchor text ("widget keyword" instead of "more...", e.g.) cause penalties


are what kept me working for the past 4 weeks. Do you have any additional theories?

P!

 

crobb305
msg:4300485, 4:26 pm on Apr 19, 2011 (gmt 0)

@jecasc - nothing about my site is fake.


Well said. We've devoted many, many years to our sites. Furthermore, according to my stats, my visitors have always liked my sites. My bounce rate has historically been below 30% (until Google implemented auto-complete and some other tools that increased my bounce rate). To me that was the only quality signal I have ever needed, and I built my site with that in mind (good content, easy navigation, etc).

Now, Google has indicated that quality sites should have a certain X and a certain Y, a fraction of THIS, a ratio of THAT, a trust seal HERE, another item THERE, content THIS BIG, and ads no bigger than THIS, etc. (speaking very, very generically for exemplification)... based on human trust/thought processes that might be successfully emulated by a numerical model for document scoring. We are working to figure out what those signals are from Google's ambiguous and often contradictory words, particularly with respect to ads (one employee says to take a close look at our ads; another says ads aren't a "big part of the algorithms").

I firmly believe that this new method of quality signaling and document scoring should work well down the road, as the machine learning progresses and the statistical database grows. Since I have studied and worked with mathematical models, I am an advocate of this change to machine learning (statistical modeling)... you can improve the accuracy of model output when you analyze output clustering, exclude outliers, and use probability distributions to determine a final, expected result (a toy sketch of the outlier idea follows below). With Google, this application is in its infancy, and many sites are in the collateral-damage basket.
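To illustrate what I mean by excluding outliers (a toy JavaScript sketch with made-up scores; obviously nothing to do with Google's actual code):

// Toy example: average several noisy quality scores for a document,
// but first drop anything more than 2 standard deviations from the mean.
function robustScore(scores) {
  var mean = scores.reduce(function (a, b) { return a + b; }, 0) / scores.length;
  var variance = scores.reduce(function (a, s) {
    return a + (s - mean) * (s - mean);
  }, 0) / scores.length;
  var std = Math.sqrt(variance);
  // Keep only scores within 2 standard deviations of the mean.
  var kept = scores.filter(function (s) { return Math.abs(s - mean) <= 2 * std; });
  return kept.reduce(function (a, b) { return a + b; }, 0) / kept.length;
}
robustScore([0.71, 0.68, 0.74, 0.70, 0.69, 0.05]); // 0.704: the 0.05 outlier is discarded

The real models are far more sophisticated, but that is the basic idea: the final, expected result gets more stable as bad samples are filtered out.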


realmaverick
msg:4300499, 4:53 pm on Apr 19, 2011 (gmt 0)


Too many external links with keyword anchor text ("widget keyword" instead of "more...", e.g.) cause penalties


This would implicate breadcrumbs, which Google actively uses in its results. Mine go something like: home > widgets > big widgets > big blue widget

So obviously there's a lot of replication. But they're descriptive, they always show in the results, and users like having them there.

I guess Google may look at breadcrumbs differently.

crobb305
msg:4300506, 5:05 pm on Apr 19, 2011 (gmt 0)

Here are two things I have given thought to over the past few days (and tweaked on my Pandalized site accordingly):

1) Excessive whitespace. My site doesn't use a standard template with beautiful images... it is an old site that I built in HTML 4.01 with some now-deprecated tags. Well, I just discovered that I had the table width on all my pages set to a fixed 800px (standard), but a TD cell width inside the table set to 1600px. I have no idea if this was causing any display/rendering issues; it seems to me the width would default to the overall table width and the 1600 would be ignored. Nevertheless, it occurred to me that the 1600 could somehow be used in the content-area calculation (and there is a lot of whitespace in that cell). I also had large line spacing and large cell padding. I reduced these, and the pages really did seem to (visually) have less whitespace and scrolling. Again, I am not sure how the algorithm might use the overall dimensions to calculate text areas and determine "thinness". (A simplified sketch of the markup problem follows below.)

2) Usability/compatibility, especially with mobile phones (not sure if this was mentioned earlier): I used a tool to test my site's compatibility, and it failed miserably on Android/iPhone. This is a reasonable consideration if Google's testers used different browsers and mobile devices (mobile is a big issue for Google, I think). Nested tables can cause problems on mobile devices.
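For reference, here is roughly the kind of width conflict I found (hypothetical, simplified markup; my real pages are messier):

<!-- Before: the cell claims twice the width of its fixed-width table -->
<table width="800">
  <tr><td width="1600">content surrounded by whitespace...</td></tr>
</table>

<!-- After: cell width consistent with the table, padding reduced -->
<table width="800" cellpadding="4">
  <tr><td width="800">content...</td></tr>
</table>

Browsers appear to resolve the conflict in favor of the table width, but whether a parser estimating content area does the same is anyone's guess.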


pontifex
msg:4300513, 5:08 pm on Apr 19, 2011 (gmt 0)

@crobb305: that would fit the point that "technical elegance" suggests quality :-)

crobb305
msg:4300519, 5:11 pm on Apr 19, 2011 (gmt 0)

@crobb305: that would fit the point that "technical elegance" suggests quality :-)


Ah, good fit :) Well, I shared my experiences with my site from the last few days. I am disturbed by some of the sloppy code I am finding. I am really thinking about converting to a well-constructed template that doesn't use nested tables.

One more factor I am finding common among many well-ranked sites (it could be a preferred human signal): a privacy policy. Most of the well-ranked sites I have examined in my competitive industry have a footer that contains critical pages like "Privacy", "About Us", "Terms of Use", etc. One of my unaffected sites (it soared after Panda) has this in the footer of all pages; my Pandalized site didn't (it does now).

I think Google is sensitive to the privacy issue right now (after the Buzz lawsuit), and many people look for a privacy policy on the sites they visit. They may expect to find it in a certain place on the page (the footer?). From a user-behavior perspective, the lack of a privacy policy could reduce the perceived credibility of the site and increase bounce rates. The same tool that detected my compatibility issues also pointed out my lack of legal compliance: I only had a privacy policy on my homepage, not linked from all pages. (A minimal example of the kind of footer I mean follows below.)
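Something like this in the footer of every page (hypothetical paths, just to show the idea):

<div id="footer">
  <a href="/privacy.html">Privacy</a> |
  <a href="/about.html">About Us</a> |
  <a href="/terms.html">Terms of Use</a> |
  <a href="/contact.html">Contact</a>
</div>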

indyank
msg:4300577, 6:24 pm on Apr 19, 2011 (gmt 0)

Duplicate content snippets on your page cause substantially bigger problems


Does this also include snippets from pages on the same site? I am seeing a few sites turning their home and other archive pages (categories, tags, etc.) into just links. There is no content summary for those header links, and Alexa shows a steep hike for those sites!

I have also noticed a hit when one page quoted a few lines from another page.

Example: I had described the features of a widget in bullets. For another widget that had almost the same features, I repeated this list. But the rest of the content on the two pages was different. Still, Panda seems to have caused problems for pages linking to them or linked by them.

Is this a serious issue now?

pontifex
msg:4300584, 6:39 pm on Apr 19, 2011 (gmt 0)

@indyank - we are just gathering the ideas in one big list from various sources, and everyone has to test and play around for him/herself... Also: I think changes could take weeks to show effects! As I mentioned earlier: let's just make a list of estimations that you CAN use in your SEO work... ;-)

crobb305
msg:4300618, 7:23 pm on Apr 19, 2011 (gmt 0)

@pontifex, I apologize if I get too wordy lol. I type very quickly, train of thought.

Some of my new suggestions, as you said, fit into previously mentioned categories; but I didn't see one covering on-site pages that address the critical areas users (like me) look for: a privacy policy, a contact email/form, terms of use (probably not necessary if you have a privacy policy), and an About page. I know we have 3rd-party verification/certification on the list; some companies will provide a privacy-policy verification, which can add to credibility.

pontifex
msg:4300703, 8:36 pm on Apr 19, 2011 (gmt 0)

No worries, here is the updated list in a better, more compact layout:

  • Reading levels: if you go to "Advanced search", you can filter SERPs by "reading level". Think about it, test it!
  • Bad templates: too many pages using one single template (Wordpress-like) could cause GBot nausea.
  • Internal links devalued; only external links really count
  • Thin pages cause substantially bigger problems for a domain
  • Duplicate content snippets on your page cause substantially bigger problems
  • Too many external links with keyword anchor text ("widget keyword" instead of "more...", e.g.) cause penalties
  • Missing positive reviews from the usual review sites count as a minus
  • The (low) quality of a link destination could backfire on the quality score of the link source
  • Missing certificates/page seals of organizations (BBB maybe?) could be a missing signal
  • User behaviour (satisfaction) on-page (measured by plug-ins or analytics you have on the pages) could give a quality signal
  • Content above the fold: implying that G renders the page and estimates the quality of the content shown early
  • Text blocks with an (ancient) date of publication could catch a devaluation
  • Spelling and grammar: errors might get you sacked
  • Technical elegance gives extra points


If nobody wants to add anything, I'll call this my list for the next 2 weeks. Lots of work has already been poured in, and I will post an update before I speak about it at a4u in Munich anyhow ;-)

bluemountains86
msg:4300813, 12:08 am on Apr 20, 2011 (gmt 0)

Interesting thought from incrediBILL about review sites being hit.


I think this makes no sense, because both main review sites (Yelp and TripAdvisor) have gained traffic with Panda.

zehrila
msg:4301193, 11:09 am on Apr 20, 2011 (gmt 0)

Consider the possibility of external links being devalued. Let's say sites A, B, C and D link to site X, but later sites A and B get Pandalized and their link power is reduced. That could eventually result in lower rankings for site X, even though site X itself is not Pandalized. It's just a hypothesis (a toy illustration follows below).
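A toy illustration in JavaScript (made-up numbers, purely to show the hypothesis; this is not Google's math):

// Site X's inbound score as the sum of its linking sites' "link power".
var linkPower = { A: 1.0, B: 1.0, C: 1.0, D: 1.0 };
function inboundScore(sources) {
  var total = 0;
  for (var i = 0; i < sources.length; i++) {
    total += linkPower[sources[i]];
  }
  return total;
}
var before = inboundScore(['A', 'B', 'C', 'D']); // 4.0
linkPower.A = 0.3; // A gets Pandalized...
linkPower.B = 0.3; // ...and so does B
var after = inboundScore(['A', 'B', 'C', 'D']);  // 2.6: X drops without being Pandalized itself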


4dwebhosting
msg:4301194, 11:09 am on Apr 20, 2011 (gmt 0)

Has anyone considered loading times? Google has stated that this is important and has been reviewing it for some time now. I know one of my sites which has been affected does have a rather slow loading time.

NamG
msg:4301316, 3:16 pm on Apr 20, 2011 (gmt 0)

Is it that Google penalizes old content, or that they reward newer content?

I switched the "date published" info to "last modified", so the date updates often for me, as I edit my pages relatively frequently. It is clearly marked as last modified, which I think is more useful for my visitors anyway than an original publication date.

Panthro
msg:4301378, 5:11 pm on Apr 20, 2011 (gmt 0)

@pontifex - Maybe you should post the updated list on the OP and edit as this continues?

As for the idea of dated pages/articles, I don't see why an algorithm based on human behavior should rank new content above old. I would imagine having any date on a page/article/chunk of information should add to the "trust" factor.

Good thread so far though.

crobb305
msg:4301381, 5:12 pm on Apr 20, 2011 (gmt 0)

Is it that Google penalizes old content, or that they reward newer content?


Because Easter is so late this year (April 24, the latest I can ever recall), I began searching to see how frequently it occurs this late. One of the top-ranking pages was a Time article stating that this was the first time in 100 years that Easter had occurred this late. Many people have "liked" and tweeted the article, so it's obvious people are sharing the little tidbit. Well, I glanced at the upper left-hand corner of the article, and in very light gray was the original publish date: 1943! The article has no relevance to this year's Easter and was useless for what I was trying to find. Now all those folks who tweeted and shared are spreading false information. lol

Has anyone considered loading times?

Yeah, I have thought about this. I see some of the sites on the Sistrix list have slow-loading ads/js/server requests, etc. This past weekend, I got a wild hair and added AddThis share buttons to 4 of my pages. It quadrupled the page load time, and Googlebot noticed (a spike in "time spent downloading a page"). It was obvious to me and to others who tested. Today, my rankings have dropped two or three spots on a few phrases; it may be related to the sudden increase in load time. I removed the buttons. I was trying to give people a way to share my pages (since that is the way things seem to be going), but it could backfire if page load time can drag your rankings down. (A sketch of a less blocking way to load such widgets follows below.)
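For what it's worth, if I try again, it may be with a deferred loader (a generic sketch, not AddThis's documented method; the URL is hypothetical), so the widget script can't block the initial render:

<script type="text/javascript">
// Inject the third-party widget script only after the page has loaded.
window.onload = function () {
  var s = document.createElement('script');
  s.src = 'http://widgets.example.com/share-buttons.js';
  document.body.appendChild(s);
};
</script>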

pontifex
msg:4301398, 5:39 pm on Apr 20, 2011 (gmt 0)

@Panthro: thanks, but I cannot change the old opening post any more :-( so the list decays down the thread, I guess.

By the way, I think one point that Rand Fishkin is all over the web about is missing:

- User behavior: G gathers it from Chrome, the Toolbar, Analytics, and factors it in. It is rumored to be VERY strong.

crobb305
msg:4301485, 8:15 pm on Apr 20, 2011 (gmt 0)

User behavior: G gathers it from Chrome, the Toolbar, Analytics, and factors it in. It is rumored to be VERY strong.

Matt Cutts stated explicitly in a video last year that Analytics data isn't used for ranking pages. Matt has to be careful with his words, so he can be ambiguous at times; but in that video he very clearly said "no" when the question was asked. I tend to believe him on that point. This doesn't exclude Toolbar and Chrome data, which are likely being used.

The following writeup (and linked patents) provides so much insight into some of the data Google is using, more so now than ever before: [seobythesea.com...]

Bewenched
msg:4301488, 8:32 pm on Apr 20, 2011 (gmt 0)

I am pondering anything that can be programmed into the algorithm to be an indicator of quality, in particular, "Reputed Credibility" (i.e., 3rd-party certifications and awards). Among these MIGHT be SSL Certification, privacy verification, hacker-free certifications, respected/recognized awards, etc. These may or may not be playing a role right now.


This isn't a factor right now; we have MANY certifications, including:
McAfee/HackerSafe
SSL verifications
Authorize.net certificate

We're ecommerce, so we visibly show them, and we were still hammered by Panda.

crobb305
msg:4301496, 8:45 pm on Apr 20, 2011 (gmt 0)

This isn't a factor right now; we have MANY certifications, including:
McAfee/HackerSafe
SSL verifications
Authorize.net certificate
We're ecommerce, so we visibly show them, and we were still hammered by Panda


That doesn't mean they aren't a factor (trust signals). You can't say one item is or is not a factor just because you were or were not hit by Panda. The algorithms are complex mathematical models and an unknown number of quality signals are being used.

Going back to when we first started discussing... this is a big clue from the Wired interview:

Singhal: We wanted to keep it strictly scientific, so we used our standard evaluation system that we’ve developed, where we basically sent out documents to outside testers. Then we asked the raters questions like: “Would you be comfortable giving this site your credit card? Would you be comfortable giving medicine prescribed by this site to your kids?”

Wired.com: But how do you implement that algorithmically?

Cutts: I think you look for signals that recreate that same intuition

[wired.com...]

What are the signals of trust? This may be just one element of Panda. For those factors, you may score well. Maybe something else is bringing your ranking down. Maybe your site is a false positive.

We can't keep restating ourselves over and over and over. We're going in circles. Our list is just a set of ideas. There is no right or wrong. There are no certainties.

migumbo
msg:4301764, 8:52 am on Apr 21, 2011 (gmt 0)

I've read the entirety of the other threads on Panda, and now this one, and it has to be said that a linear forum isn't the ideal way to dissect the importance of various issues and/or present the findings in a clear and logical way.

For instance, my sites I've been working on, with buckets of well-researched, well-written content, have been smashed, while sites I haven't touched for 2 years and couldn't give a crap about have been given a boost. This doesn't sound typical from what I've read, but again, it's hard to tell.

What about a public spreadsheet to start off with, listing the various factors that might affect a site, and then a Google survey (or a UserVoice-style forum or similar) where WebmasterWorld members can try to rank the importance of various site issues as they relate to Panda?

It would be interesting to see what we come up with.

Andem
msg:4301776, 9:19 am on Apr 21, 2011 (gmt 0)

I don't know whether these observations are exactly relevant to the summary, but one thing I've observed is that sites such as The Well and Daniweb, and many more, seem to rank #2 in the SERPs, after Wikipedia, when you search for their respective company/site titles.

@migumbo: I understand your frustration. Articles that have existed as far back as the late 90s have been quoted and even rewritten as Wikipedia articles, some with reference links, and yet both the wiki articles and hundreds to thousands of scrapers are ranking above the originals.

heisje
msg:4301797, 10:59 am on Apr 21, 2011 (gmt 0)

Facts:

- 250 Drupal-based commercial sites
- all with fundamentally different content from each other
- ALL built with EXACTLY the SAME template and SEO
- 1/3 doing brilliantly
- 1/3 doing ok
- 1/3 doing really rotten

Go figure!

My take on the situation: while Google's research and intentions may be top-notch, the RESULTS imply a boat sailing aimlessly. The SERPs are a shame. Frustrated users are turning increasingly to alternative search sources to find what they need, at least for part of their searches, for the time being.

My conclusion: the algo is in a complex bs state and out of any human control. Any discussion about complex bs is a discussion about bs. Don't touch any of your sites for the time being. Wait and see.



Whitey
msg:4301822, 11:54 am on Apr 21, 2011 (gmt 0)

Google has been strengthening its quality signals for some time, so my focus is on relevance through semantic linking / navigation hierarchies, and natural language and conversations. Google asked its testers to consider "human factors", so to me it smacks of a step towards a greater grasp by the algorithm of language across a raft of known options.

Although the Google tech crew applied science, let's not forget it stems from human interpretations of, and feedback on, what "looks and sounds purposeful and good", applied to a mathematical framework benchmarked against a questionnaire.

If webmasters take a hard look and apply common sense, they should see through it a bit. But I do think that Google cannot possibly have got this right the first time. It's one heck of an update, a lot of good folks got burned in the collateral damage, and I would think it will take a lot of work for some to reclaim their lost ground. Google's not about to react to every webmaster's response; they don't need to.

So on the to-do list: put back natural signals and language. Do away with machine-driven content and paid links. I think this is the first big move in a shifting emphasis.

If only actions could be applied as quickly as words.

ScubaAddict
msg:4301858, 1:25 pm on Apr 21, 2011 (gmt 0)

Background: our site showed a slight (but noticeable) drop in traffic a week prior to Panda, then a very pronounced drop on Feb 24th. Thousands of pages of all-original and popular content; we were one of the leaders in our vertical for over a decade.

As far as load time goes, ads do slow the loading of many websites. Because of this, for the last couple of years we have been loading all of our ads AFTER the entire content of the page loads; the user couldn't get to the content any quicker. But even though the page appears to load instantly, it has not finished loading until our ads load.

Now, we do daisy-chain our ad providers: ad provider 1 defaults to ad provider 2, which defaults to ad provider 3. This adds to the load time of each ad space (a sketch of the pattern follows after the questions below).

I wonder 2 things:
1. Could our attempt to provide the best user experience by loading ads last be misinterpreted by Google as somehow deceptive? All of the JavaScript that loads the ads is bunched together just above the </body> tag.
2. Does anyone else who was hit daisy-chain their ads?
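For anyone curious, the pattern looks roughly like this (a simplified, hypothetical sketch; real ad tags use their own passback mechanisms, and the URLs and the adSlotFilled flag are made up):

<script type="text/javascript">
var adProviders = [
  'http://ads1.example.com/tag.js',
  'http://ads2.example.com/tag.js',
  'http://ads3.example.com/tag.js'
];
function loadAd(i) {
  if (i >= adProviders.length) return;      // chain exhausted, no fill
  var s = document.createElement('script');
  s.src = adProviders[i];
  document.body.appendChild(s);
  // If this provider hasn't filled the slot after 2s, try the next one.
  setTimeout(function () {
    if (!window.adSlotFilled) loadAd(i + 1);
  }, 2000);
}
// Ads start loading only after the page content has rendered.
window.onload = function () { loadAd(0); };
</script>

Each hop in the chain adds its own network round trip, which is why the total ad load time stacks up.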

Since Panda hit, we have removed ads from thousands of pages. We have:
1. Moved in-content ads to below the fold, increasing content above the fold.
2. Removed an ad from just about every page.
3. Stopped thin-content pages from being indexed, and removed all ads from them.
4. Removed any duplicated title or meta tags. We don't have duplicate content.
5. Checked all outgoing links to make sure they are still active websites and valuable to the content of the linking page.
6. Started going after our numerous content scrapers (successfully getting them to remove our content in most cases so far).

Panda 1 took about 40% of our traffic. Panda 2 (int) took about 20% of what we had left. We have noticed another downward trend since Sat, Apr 16. Nothing seems to have helped one bit; in fact, whatever we do, we continue to lose traffic.

Today I will remove the daisy chaining on the ad spots still remaining.

skweb
msg:4301880, 2:14 pm on Apr 21, 2011 (gmt 0)

I agree totally with heisje. I have had the exact same experience with a dozen or so websites. After extensive research and review of the posts here, my conclusion is that either G completely screwed up, or the algo is still a work in progress, because if this is how G is going to run its search engine, it will soon lose its prominence.

Does anyone remember the old days of the Google dance, when G would do an update and results would be in flux for days/weeks but eventually settle down? I hope this is the case.

BTW, I am just pretending that nothing has changed for now and have made absolutely no changes to my business model. I will wait till there is greater visibility.

crobb305
msg:4301897, 2:46 pm on Apr 21, 2011 (gmt 0)

my focus is on relevance through semantic linking / navigation hierarchies, and natural language and conversations. Google asked its testers to consider "human factors", so to me it smacks of a step towards a greater grasp by the algorithm of language across a raft of known options.


EXACTLY. This is what I have been trying to say, here and in other threads. I keep mentioning the complexity of numerical/statistical modeling and that there are no linear/single-factor answers. Your approach has been mine exactly. I also mentioned the various trust seals that a HUMAN might look for; these can be verified by Google to ensure they aren't fake. Furthermore, many visitors look for company links like "privacy", "about", etc., and they may expect to see them on a certain part of the page. These are things that a human reviewer might tell Google they expect to see when they visit a site.

it has to be said that a linear forum isn't the ideal way to dissect the importance of various issues and/or present the findings in a clear and logical way.


migumbo, my reply above addresses your comment also. We've been working for 6 weeks on a list of quality "signals". We realize there is no "linear" solution; that's not what this thread is about. We're not determining mathematical weight/significance. We're just listing some of the quality signals/page characteristics that humans may expect and an algorithm can detect.

Google doesn't open up its algorithms, so I think the spreadsheet option is out. The best we can do is make lists, as we have been doing for 6 weeks in this (and other) threads. We've had some great ideas, many of us have been making changes, and to be honest, my site is much better now. Bing/Yahoo rankings have improved, and Google traffic is up 10 to 15% from last week. :)

Again, for newcomers to the thread, we've been listing many, many, many possible ways to signal AND improve quality (here and in other threads).

  • Mathematical models = very complex/non-linear.
  • Our discussion = simple list/linear (best we can do without equations and open source).


pontifex
msg:4301908, 2:56 pm on Apr 21, 2011 (gmt 0)

When I see some rejections to gathering a list of "suggestions" in this thread, I just want to add: improving your site along quality aspects is never a bad idea, with or without Google in mind...

P!

crobb305
msg:4301909, 3:09 pm on Apr 21, 2011 (gmt 0)

Pontifex, this has been one of the best threads so far. That's why it made it to the homepage. We've worked hard for many weeks, and it's going to pay off. :)

danijelzi
msg:4301911, 3:11 pm on Apr 21, 2011 (gmt 0)

Does anyone remember the old days of the Google dance, when G would do an update and results would be in flux for days/weeks but eventually settle down? I hope this is the case.


I'm now seeing rankings go +5, -100, +100, -200, etc. on an hourly basis. Before Panda, I hadn't been looking at the SERPs, so I don't know if the big changes are because of the Panda update or if it happens all the time.

pontifex
msg:4301930, 3:35 pm on Apr 21, 2011 (gmt 0)

Time to update this here, more added and slightly clarified:

  • User behavior: G gathers it from Chrome, the Toolbar, etc. and factors it in, e.g. bounce rate (back to the SERP they came from). This was listed twice and has spawned many rumors so far.
  • Reading levels: if you go to "Advanced search", you can filter SERPs by "reading level". Think about it, test it!
  • Bad templates: too many pages using one single template (Wordpress-like) could cause GBot nausea.
  • Internal links devalued; only external links really count
  • Thin pages cause substantially bigger problems for a domain
  • Duplicate content snippets on your page cause substantially bigger problems
  • Too many external links with keyword anchor text ("widget keyword" instead of "more...", e.g.) cause penalties
  • Missing positive reviews from the usual review sites count as a minus
  • The (low) quality of a link destination could backfire on the quality score of the link source
  • Missing certificates/page seals of organizations (BBB maybe?) could be a missing signal
  • Content above the fold: implying that G renders the page and estimates the quality of the content shown early
  • Text blocks with an (ancient) date of publication could catch a devaluation
  • Spelling and grammar: errors might get you sacked
  • Technical elegance gives extra points (loading speed, clean HTML)

heisje says: the algo is flawed, do not change anything.

Crobb305 says: again, for newcomers to the thread, we've been listing many, many, many possible ways to signal AND improve quality (here and in other threads).

Tedster says: quality score is not a sum of single factors but a decision tree of chained signals.

And for the fun of it:

2005 - [webmasterworld.com...] (dataguy started that very early!)

P!

crobb305
msg:4301950, 4:05 pm on Apr 21, 2011 (gmt 0)

Pontifex, I just posted this reply in another thread, but there is some overlap between the two topics, so I am going to post it here also.

One point that Bill Slawski makes [seobythesea.com...] :

  • Assessing the credibility of content and people on the web and social media: modeling author identity, trust, and reputation

This is something I mentioned a few weeks ago in a different thread. I suspect that author names could be profiled to determine their credibility (i.e., are they posting articles on a commercial site and then posting in hubs, or vice versa?).

Some of my writers have historically published content in other places, including some of the hubs like Buzzle or Ezinearticles. I think having their name on my site could be lowering its credibility; I might be better off without even showing the author's name. I have always selected writers who do good research, but you know how freelance writers work... they write for many people and write everywhere. It's perfectly reasonable, but unfortunately it could be guilt by association.
