
Google SEO News and Discussion Forum

Panda key algo changes summarized
pontifex




msg:4289430
 10:18 am on Mar 30, 2011 (gmt 0)

Folks, I have been reading a lot, thinking a lot, and analyzing a lot. I am still not sure how to get the US traffic back to pre-February 24th levels! But I think it is time to summarize the key theories about the algo change in the US:

- Internal links devalued; only external links really count

- Thin pages cause substantially bigger problems for a domain

- Duplicate content snippets on your pages cause substantially bigger problems

- Too many external links with keyword anchor text ("widget keyword" instead of "more...", e.g.) cause penalties


Those are the theories that have kept me working for the past 4 weeks. Do you have additional ones to add?

P!

 

walkman




msg:4289884
 1:59 am on Mar 31, 2011 (gmt 0)

Don't know if dated content should be removed yet, but it appears to be ONE signal Google is using. We suspect it's targeted at REVIEW sites, and I'm certainly not running a review site, but bugs are bugs.


Interesting, and it makes sense up to a point. I will investigate. I wonder if it matters that I had 'review(s)' in my titles (I do urge people to leave reviews, of course).

pontifex




msg:4290013
 9:23 am on Mar 31, 2011 (gmt 0)

OK, now the list shows:

- Internal links devalued; only external links really count

- Thin pages cause substantially bigger problems for a domain

- Duplicate content snippets on your pages cause substantially bigger problems

- Too many external links with keyword anchor text ("widget keyword" instead of "more...", e.g.) cause penalties

- Missing positive reviews from the usual review sites count as a minus

- The (low) quality of a link destination could backfire on the quality score of the link source

- Missing certificates/page seals from organizations (BBB maybe?) could count as a missing trust signal

- User behaviour (satisfaction) on-page (measured by plug-ins or analytics you have on the pages) could give a quality signal

- Content above the fold: implying that Google renders the page and estimates the quality of the content shown first

- Text blocks with an (ancient) date of publication could catch a devaluation

- Spelling and grammar: errors might get you sacked

----------------------

... as the current summary. Tedster says: quality score is not a sum of single factors but a decision tree of chained signals.
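
To make that distinction concrete, here is a tiny, purely hypothetical Python sketch; the signal names, thresholds, and branching are invented for illustration and are not anything known about Google's implementation:

# Invented illustration of "sum of single factors" vs. "decision tree of
# chained signals". All signal names and thresholds are assumptions.

def summed_score(signals):
    # every factor contributes independently
    return sum(signals.values())

def chained_score(signals):
    # one weak signal changes which later signals are even considered
    if signals["thin_pages"] > 0.5:
        # a thin site: duplicate snippets and dated text weigh heavily
        return 1.0 - (signals["duplicate_snippets"] + signals["dated_text"])
    # otherwise those signals barely matter; user satisfaction dominates
    return signals["user_satisfaction"]

signals = {"thin_pages": 0.7, "duplicate_snippets": 0.4,
           "dated_text": 0.2, "user_satisfaction": 0.9}
print(summed_score(signals))   # ~2.2
print(chained_score(signals))  # ~0.4 -- the thin-pages branch dominates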

incrediBILL




msg:4290025
 9:48 am on Mar 31, 2011 (gmt 0)

Text blocks with an (ancient) date of publication could catch a devaluation


Define ancient. I used 1999 as an example to emphasize old; I have content dated to '97 even, while some people report issues with 2005, 2008, etc. So it's hard to say what the trigger is here, but something is definitely going on with some sites.

TheMadScientist




msg:4290033
 10:23 am on Mar 31, 2011 (gmt 0)

...linking to a low-grade site can drag down your site (I think that is what you were saying).

Close; more specifically, page to page.

E.g., I think it's reasonable that if Page A starts off with 70 quality points and links to Page B with 40 quality points, then regardless of site, Page A might see a quality score decrease when the quality score of Page B is taken into account ... If 5 of 10 links on Page A point to '40 point quality' pages and the other 5 point to '90 point quality' pages, I could see Page A's quality score being decreased by a greater amount than if 1 link went to a '40 point quality' page and the other 9 went to '90 point quality' pages ... I also think it's possible the converse is true, and links to 'higher quality' pages could increase the overall quality score of the linking page ... Again, this is just speculation, and as tedster noticed the other day, I like kool-aid...
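
A minimal Python sketch of that speculation, using an invented point scale and an arbitrary blending weight; nothing here is a confirmed Google mechanism:

# Hypothetical "link destination quality drags on the link source" idea from
# the post above. The 0-100 point scale and the 0.3 blend weight are invented.

def adjusted_quality(page_score, outbound_scores, blend=0.3):
    """Blend a page's own quality score with the average score of the pages
    it links to. blend=0.3 is an arbitrary assumption."""
    if not outbound_scores:
        return page_score
    avg_outbound = sum(outbound_scores) / len(outbound_scores)
    return (1 - blend) * page_score + blend * avg_outbound

# Page A (70 points), half its links to 40-point pages, half to 90-point pages:
print(adjusted_quality(70, [40] * 5 + [90] * 5))   # ~68.5 -- pulled down a bit
# Page A with 1 link to a 40-point page and 9 links to 90-point pages:
print(adjusted_quality(70, [40] * 1 + [90] * 9))   # ~74.5 -- pulled up instead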

Shaddows




msg:4290061
 11:18 am on Mar 31, 2011 (gmt 0)

On the subject of linked pages, I've tested INTERNAL linking such that links point to irrelevant pages (the test was to differentiate PR circulation from semantic scoring).

I am satisfied that IRRELEVANT links cause a weakening of the HOST page on semantics (an unexpected result at the time - I only expected downstream effects).

Now that "Quality" is being folded in, I would be very surprised if the same methodology were not applied.

In my view, your pages inherit certain attributes from the pages you link to.

econman




msg:4290148
 2:15 pm on Mar 31, 2011 (gmt 0)

It seems logical for

IRRELEVANT links [to] cause a weakening of the HOST page


Google likes us to exercise good "editorial judgment" and they don't like us to do things to "game" their system. This would be consistent with that preference, and it could be an effective tool for pushing the content farms farther down the SERPs without equally impacting sites like Wikipedia. In other words, it could help them downrank pages on large sites which use automated processes to manipulate their internal linking for SEO purposes, without having an equivalent impact on sites that use editorial judgment in selecting their internal links.

tedster




msg:4290186
 3:26 pm on Mar 31, 2011 (gmt 0)

Now that "Quality" is being folded in, I would be very surprised if the same methodology were not applied.

In my view, your pages inherit certain attributes from the pages you link to.

That very much lines up with some data I know about, but I hadn't pinned it down the way you have.

Another observation - on one site that took a hit, the percentage traffic drop seems to spread out from the worst-hit pages in a PT-like fashion, rather than just a multiplier or subtracted amount being applied cross-site. It's like Google is saying "we want to keep our users away from the weakest sections of this site".
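
Purely as speculation about what that spreading might look like, here is a toy Python sketch in which a penalty radiates out from the worst-hit pages through internal links and decays with each hop; the link graph, decay factor, and seed penalties are all invented:

# Speculative illustration only: a penalty that spreads out from the worst-hit
# pages through internal links, rather than a flat multiplier applied site-wide.
# The graph, decay factor, and starting penalties are made-up assumptions.

from collections import deque

def spread_penalty(internal_links, seed_penalties, decay=0.5):
    """Breadth-first spread: each hop away from a penalized page inherits a
    smaller share of that page's penalty."""
    penalties = dict(seed_penalties)
    queue = deque(seed_penalties.items())
    while queue:
        page, penalty = queue.popleft()
        for neighbor in internal_links.get(page, []):
            inherited = penalty * decay
            if inherited > penalties.get(neighbor, 0.0):
                penalties[neighbor] = inherited
                queue.append((neighbor, inherited))
    return penalties

links = {"weak-section": ["article-1", "article-2"], "article-1": ["home"]}
print(spread_penalty(links, {"weak-section": 0.6}))
# {'weak-section': 0.6, 'article-1': 0.3, 'article-2': 0.3, 'home': 0.15}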

ken_b




msg:4290196
 3:41 pm on Mar 31, 2011 (gmt 0)

Excuse my ignorance, but what is a "PT-like fashion"?

Joshmc




msg:4290200
 3:44 pm on Mar 31, 2011 (gmt 0)


- Duplicate content snippets on your pages cause substantially bigger problems

How big a problem have we seen this to be? Is this about the same sentences appearing on different pages, or about the main blog page having the same content as a post that is clicked through from the main page?

pontifex




msg:4290214
 4:07 pm on Mar 31, 2011 (gmt 0)

@Joshmc - that problem is not new, and a clear answer is not easy. We were just listing that bullet point as something that might have a stronger impact now, after Panda, than it had before.

Based on the original idea that gave it the name "Farmer" update, Google wanted to devalue pages that copy from other sites and aggregate for the sake of rank + ads (MFA-style) = money, with no benefit for the surfer...

So taking that "dupe content" issue onto the list seems to be the right thing here.

tedster




msg:4290217
 4:12 pm on Mar 31, 2011 (gmt 0)

Google wanted to devalue pages that copy from other sites

Even more - they want to devalue content that is just churned out to match keyword searches, rather than created, first and foremost, to meet the needs of the visitor.

zerillos




msg:4290475
 1:24 am on Apr 1, 2011 (gmt 0)

What about CDNs? Could having a CDN have an impact on the quality factor?

Shatner




msg:4290482
 1:46 am on Apr 1, 2011 (gmt 0)

>>Don't know if dated content should be removed yet, but it appears to be ONE signal Google is using. We suspect it's targeted at REVIEW sites, and I'm certainly not running a review site, but bugs are bugs.

What do you mean by "dated content"? Are you specifically talking about pages with date stamps?

So if a page has a date stamp, and that date stamp is from 2008 or something, Google may penalize it because it's old?

crobb305




msg:4290488
 2:00 am on Apr 1, 2011 (gmt 0)

Based on the original idea that gave it the name "Farmer" update, Google wanted to devalue pages that copy from other sites and aggregate for the sake of rank + ads (MFA-style) = money, with no benefit for the surfer...


I want to give the AdSense team some kudos. When searching snippets from my homepage, I came across a junk scraper site. They weren't just scraping MY content; they were auto-scraping sentences and paragraphs from hundreds of sites and creating montage pages that made no sense whatsoever. A single paragraph might start talking about mortgage rates and end with animal hygiene. Total gibberish. Nevertheless, AdSense ads were covering the site. So, I filled out the AdSense violation form and pointed out the scraped nonsense that "violated the Google Webmaster Guidelines". The ads were gone within 24 hours.

This was a clear-cut case, so it made the task easy for the AdSense team. Not all scrapes are so easy to verify without a formal DMCA investigation. Nevertheless, if you spot such junk running AdSense, fill out their form.

Back to the discussion, does anyone have a feel for the impact of running a cookie-cutter privacy policy (provided by a well-known privacy-verification company)? I allow my policy to be indexed, since its existence is a sign of quality. However, searching snippets from the page, I can find thousands of copies. They are indexed in Google, so they aren't being disregarded.

Finally, I have a theory about having a blocked redirect file (the one that houses your 302/redirected affiliate links) that is denied to Googlebot via robots.txt. Over the years, my affiliate links came and went, and Google was finding the redirect URLs throughout my site but couldn't follow them because of the robots.txt denial. Even affiliate links that I deleted from the file years ago were still showing up last week in the site: search. I only have 4 active affiliate links, but Google was showing 30+. So, to eradicate those, I allowed Googlebot access in robots.txt, 410'd the dead parameters, and submitted them through the Google URL removal tool. Now Google only shows the 4 active redirect links. It is POSSIBLE that because they were blocked, Google may have thought they were thin content (similar to tags).

We'll see what happens. For what it's worth, 5 of the 6 Pandalized affiliate sites that I examined were doing the same type of redirects, blocked to Googlebot. Many of the top remaining sites do not do this, including two of my own sites that actually gained position after Panda. By the way, I am not suggesting that the robots.txt denial in and of itself is bad, but the accumulation of dead/blocked URLs may increase the "thinness" of the site. Make sure Google isn't indexing old/dead redirects that Googlebot can't follow.
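
For illustration only, here is a minimal sketch of the kind of redirect cleanup described above, assuming a hypothetical /go/<slug> redirect script; the slugs, URLs, and status-code choices are invented examples, not a prescription:

# Hypothetical /go/<slug> affiliate redirect handler: live slugs still 302,
# retired slugs return 410 Gone so crawlers can drop them, unknown slugs 404.
# All slugs and destination URLs are made up for the example.

ACTIVE_AFFILIATES = {
    "partner-a": "https://www.example.com/?ref=123",
    "partner-b": "https://www.example.org/?aff=456",
}
RETIRED_SLUGS = {"old-partner-x", "old-partner-y"}  # deleted from the file years ago

def handle_redirect(slug):
    """Return an (http_status, location_or_none) pair for a /go/<slug> request."""
    if slug in ACTIVE_AFFILIATES:
        return 302, ACTIVE_AFFILIATES[slug]   # still a live affiliate link
    if slug in RETIRED_SLUGS:
        return 410, None                      # Gone: signals the URL is dead for good
    return 404, None                          # never existed

print(handle_redirect("partner-a"))      # (302, 'https://www.example.com/?ref=123')
print(handle_redirect("old-partner-x"))  # (410, None)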

potentialgeek




msg:4297331
 2:56 am on Apr 14, 2011 (gmt 0)

This thread is a great primer for those of us just hit by Panda.

> - Internal links devalued; only external links really count

I think I see this, too.

Shatner




msg:4297333
 2:59 am on Apr 14, 2011 (gmt 0)

@potentialgeek

Be warned that none of this has been proven, and no one implementing anything based on these ideas has seen any recovery.

SmAce




msg:4297445
 9:01 am on Apr 14, 2011 (gmt 0)

Interesting thought from incrediBILL about review sites being hit.

We have a review site, but it is not UGC - all reviews are written by a professional in their field, and it's all on one subject. All reviews are unique. Users can, however, comment on the reviews.

We seem to have been hit by the Panda update, but I couldn't work out why. It would make sense that this is the reason we are being hit, as we do use the word "review" quite a lot in our titles.

If this is the case, I hope we have been caught by mistake. I am going to hold off making any changes to the site for the moment to see if Google makes any changes to help legit sites get out of this mess.

sanjuu




msg:4297452
 9:24 am on Apr 14, 2011 (gmt 0)

Bill, if that is the case and Google has applied this wholesale across the board, and if this is a big factor on some of the sites I've seen hit by Panda, then it's a mistake on Google's part.

Take the automotive industry, and used car reviews, as an example. If a site has reviews that are dated for a 2006 model, that is very useful to the visitor who isn't looking for a review of the new version of the model but is looking for a review of the 2006 model (the review will most likely have been written when the car was first released in 2006).

So according to what some commentators have said, Google will be applying a filter/penalty/low-quality score to pages that clearly indicate the year of the review?

pontifex




msg:4297453
 9:26 am on Apr 14, 2011 (gmt 0)

@Shatner: very true - yet following this list over the past 6 weeks did not hurt me either. It is a lot of work to clean up 4 million pages with additional templates and info and NOT be redundant, though.

2 more for the list - I added them at the top:

- Reading levels: If you go to "Advanced search", you can filter SERPs by "reading level". Now I think it would make sense AND be easy to connect "search phrases" and "surfer satisfaction" (are they coming back to Google?) to the "reading level" I have assigned to a URL

- Bad templates: Too many pages using one single template (WordPress-like) could cause GBot nausea. Just a hunch here, but it could not hurt to diversify: if (template code > 30% of page) then rank = rank - 1 (see the sketch after this list)

- Internal links devalued; only external links really count

- Thin pages cause substantially bigger problems for a domain

- Duplicate content snippets on your pages cause substantially bigger problems

- Too many external links with keyword anchor text ("widget keyword" instead of "more...", e.g.) cause penalties

- Missing positive reviews from the usual review sites count as a minus

- The (low) quality of a link destination could backfire on the quality score of the link source

- Missing certificates/page seals from organizations (BBB maybe?) could count as a missing trust signal

- User behaviour (satisfaction) on-page (measured by plug-ins or analytics you have on the pages) could give a quality signal

- Content above the fold: implying that Google renders the page and estimates the quality of the content shown first

- Text blocks with an (ancient) date of publication could catch a devaluation

- Spelling and grammar: errors might get you sacked

----------------------

... as the current summary. Tedster says: quality score is not a sum of single factors but a decision tree of chained signals.
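
Here is a rough, purely illustrative Python sketch of the "Bad templates" hunch in the list above; the 30% threshold, the crude boilerplate comparison, and the rank step come from the hunch itself and are in no way a known Google rule:

# Rough sketch of "if (template code > 30% of page) then rank = rank - 1".
# The threshold, the line-matching heuristic, and the penalty step are all
# assumptions taken from the hunch above, nothing more.

def template_ratio(page_html, template_html):
    """Approximate the share of the page that is boilerplate shared with the template."""
    shared = sum(len(line) for line in page_html.splitlines()
                 if line.strip() and line in template_html)
    return shared / max(len(page_html), 1)

def hunch_rank_adjust(rank, page_html, template_html, threshold=0.30):
    return rank - 1 if template_ratio(page_html, template_html) > threshold else rank

template = "<header>Site name</header>\n<nav>...</nav>\n<footer>(c) 2011</footer>"
thin_page = template + "\n<p>One short sentence of unique text.</p>"
print(round(template_ratio(thin_page, template), 2))  # ~0.6 -- mostly template
print(hunch_rank_adjust(10, thin_page, template))     # 9 under this made-up rule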

sanjuu




msg:4297457
 9:28 am on Apr 14, 2011 (gmt 0)

As for those saying that internal links have been devalued, can you explain how you came to that conclusion?

The sites that have been hit hard by Panda seem to have some sort of site-wide filter/penalty/down-scoring applied to them. If that is the case, it would explain why internal links are having little or no effect. Or are you talking about sites that haven't been hit hard by Panda but have noticed some smaller drops in specific areas of their sites?

Jessica




msg:4297500
 12:10 pm on Apr 14, 2011 (gmt 0)

About the INTERNAL links:

I have noticed that pretty much all my great-ranking blogs got knocked a few pages deep. We all know what complicated internal inter-linking structures WordPress has.

Definitely need to look into this internal linking issue more.

pontifex




msg:4297539
 1:35 pm on Apr 14, 2011 (gmt 0)

pageoneresults has a very good one:

- Too many internal redirects, massive numbers of HTTP requests per page, or other technical issues could harm you badly...

I personally like that added to the list. It made sense before and makes even more sense now! (A quick way to count per-page requests is sketched below.)
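
As a purely diagnostic sketch of counting the extra HTTP requests a page triggers; the HTML snippet is a placeholder and the tag list is not exhaustive:

# Count how many extra resources a page asks the browser to fetch.
# Diagnostic sketch only; real pages would be fetched and parsed the same way.

from html.parser import HTMLParser

class RequestCounter(HTMLParser):
    REQUEST_ATTRS = {"img": "src", "script": "src", "link": "href",
                     "iframe": "src", "source": "src"}

    def __init__(self):
        super().__init__()
        self.requests = []

    def handle_starttag(self, tag, attrs):
        attr = self.REQUEST_ATTRS.get(tag)
        if attr:
            value = dict(attrs).get(attr)
            if value:
                self.requests.append(value)

html = "<html><head><link href='a.css'><script src='b.js'></script></head>" \
       "<body><img src='c.png'><img src='d.png'></body></html>"
counter = RequestCounter()
counter.feed(html)
print(len(counter.requests), counter.requests)  # 4 ['a.css', 'b.js', 'c.png', 'd.png']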

Robert Charlton




msg:4300229
 4:49 am on Apr 19, 2011 (gmt 0)

On longtail "natural language" queries, I'm seeing extreme sensitivity to variations in the phrasing, more than I would expect. I'm talking about variations in what I at one time thought of as stopwords, for the "how to tell if..." / "how can I tell if..." / "how to know if my red widgets are real" type queries.

As is to be expected, if the "red widgets" or "red widgets are real" part of the query is very competitive by itself, then differences in the "how to tell..." part generally become less important.

From what I've observed in one small set of tests, longtail seems to trump AdSense where there's a fair amount of AdSense but not so much that it pushes content down below the fold or interrupts it. My ad "control" on these tests was a site I'd observed with a 728x90 Leaderboard at the top, a Wide Skyscraper and a Link Unit on the right, and a Medium Rectangle on the bottom... this vs an eHow page vs a page without any ads at all.

The test pages covered the same subject with similar vocabulary and ranked closely enough on some longtail searches that I felt they might be good indicators of which had more effect... ads or longtail query variations. It's possible that this ad control page was below the ad intrusion threshold, which would make this comparison moot.

- the (low) quality of a link destination could backfire on the quality score of the link source

I'm also noticing what may be the converse of this... some link source pages appear to be getting boosted when the destination pages are appropriate. This is based on ongoing observations of pages I've watched over time... no rigorous testing.

Shatner




msg:4300261
 6:21 am on Apr 19, 2011 (gmt 0)

>>As for those saying that internal links have been devalued, can you explain your logic to how you came to this conclusion?

Check out our "don't fit the mold" thread. Some of our analysis there suggests that not only have internal links been devalued, they may be extremely detrimental in some cases... for instance, if you have too many of them.

sanjuu




msg:4300305
 9:07 am on Apr 19, 2011 (gmt 0)

OK - I'll have a look. The issue with a lot of these factors is that no single one seems to be the reason sites have been hit hard; there are plenty of sites with these factors that haven't been hit by Panda - in fact, they've benefited.

It does seem to be a complex scoring mechanism for pages, and this scoring does seem to percolate through sites via the internal linking.

Excessive internal linking probably ties in with the idea of having lots of 'thin content' created to attract traffic via Google. It's something we're working on very hard.

jecasc




msg:4300326
 9:54 am on Apr 19, 2011 (gmt 0)

I think this all leads to nothing. As tedster says: you are trying to create a checklist. That doesn't work.
Let's say you find out that Google now puts more value on link text in capital letters. What will you do? You will put all your link text in capital letters and change all the other factors you have found. Then the Bunga Update will come and you will be back to square one. You think you are preparing your website for recovery, but you are preparing your website for the next disaster.

And I can tell you what is wrong with your site: it is not natural. It is artificial. You find out that Google puts emphasis on keywords in link text? So you put up a link structure like this:
red widget cheap
blue widget cheap
green widget cheap

Now Google will devalue this. So you will change your link structure to the newest trend. And it will look artificial again.

Because a normal website would look like this:

red widgets.
new blue Widget!
learn more...
widgets in green.

And now for the important part:
What is the value of having a normal average website?
Simple: You can't fit it into a mold.


No matter what Google does - it will never fully fit into the Florida, Farmer, Panda or whatever mold. It will always have an edge that sticks out and prevents it from fitting into the mold. Create websites from a drawing board and you are screwed. Create websites with corners and rough edges and you won't fall.

And when I read things like "linking to .gov sources" might do something to your website - good heavens. When you have started to think in categories like this, you have left the field of website creation and become a superstitious voodoo priest; you can start creating a Google doll and pricking it with needles. At least that won't hurt your rankings.

This is a problem of thinking in entirely the wrong categories. I am telling you your website is too perfect, and I can already imagine some people trying to think of ways to create the perfect "not perfect" website.

pontifex




msg:4300329
 10:01 am on Apr 19, 2011 (gmt 0)

The whole idea of this thread is to gather possibilities for those who have been hit. Nailing down the individual reasons cannot be part of it yet. I have been working through this list for the past 6 weeks. I especially focused on "thin content", "verification seals" and "outgoing links", and I dive back into "clean and fast" once in a while. So far my loss is around 10%, recovering steadily since Feb 23rd!

Before I jump to conclusions, I would (as others mentioned here) wait until Panda really settles in. I guess the best time for a second run of tests on these factors is early May. Until then I will do as much as I can on quality...

P!

PS: I agree that a checklist does not work; it is rather a list of rules you apply alongside your work.

PPS: 10% loss for my site means over 350k in cash per year. An edgy site is not enough; you need an edgy site with traffic!

jecasc




msg:4300385
 1:05 pm on Apr 19, 2011 (gmt 0)

Maybe I have to realize that there are two types of SEO: SEO day traders and SEO long-term investors. (Some perhaps doing a bit of both.)

I guess we live in different worlds. Different rules apply.

However, this whole Panda discussion is starting to remind me a little of the cargo cults [en.wikipedia.org] in the Pacific after WWII. Only instead of airplanes, people are trying to attract Google. If you want to attract Google, you have to set up a real airport, and not debate whether setting up another fake tower or mowing your fake airstrip twice a week instead of once will do the trick.

crobb305




msg:4300444
 3:03 pm on Apr 19, 2011 (gmt 0)

Maybe I have to realize that there are two types of SEO: SEO day traders and SEO long-term investors.


I think most who have posted in this thread are "long-term investors". We're talking about sites 7 to 15 years old that have been hit by Panda, and we're trying to identify signals of quality that we may have overlooked. Again, it's SIGNALS of quality that an algorithm is programmed to detect. If we were "day traders" we would have trashed our domains and moved on to new ones by now.

Google is clearly experiencing some growing pains as they integrate more machine learning/statistical modeling into their ranking systems; otherwise we wouldn't be hearing about collateral damage, traffic from unrelated queries and poor semantic detection (e.g., ranking pages for unrelated phrases that include slurs/obscene language), Google asking for feedback, and Panda tweaks (i.e., "Panda 2"). If there is anything that we, as site owners, can do to help improve the quality SIGNALS on our sites (above and beyond the content we have focused on for the past decade), then as long-term investors we're absolutely going to discuss it. We will do what we can to help Google's algorithm see our old, established, quality sites that we have worked so hard on all these years for what they are.

So, let's not criticize those contributing here just because your site was spared THIS TIME. Google has warned that many changes are coming this year, so you could be next. Some of us would never have imagined that our 10-year-old sites would be caught up in a mess like this.

Google's algorithms are computer programs written by humans. They aren't written by God. They are fallible.

pontifex




msg:4300479
 4:09 pm on Apr 19, 2011 (gmt 0)

@jecasc - nothing about my site is fake.

I had 160k unique visits every day on the site, which started in 2004.

Now I have around 140k, and it hurts!

My site is like a major local airport that G-Air used as a big hub. Starting on the 23rd of Feb they rerouted 17% of the planes, which I have already improved back to 10%, and crobb305 laid the rest out pretty well.

If I get the 160k back, the next thing is: how to get 200k a day... SEO is about growth, not preservation - IMHO.

crobb305




msg:4300485
 4:26 pm on Apr 19, 2011 (gmt 0)

@jecasc - nothing about my site is fake.


Well said. We've devoted many, many years to our sites. Furthermore, according to my stats, my visitors have always liked my sites. My bounce rate has historically been below 30% (until Google implemented auto-complete and some other tools that increased my bounce rate). To me, that was the only quality signal I ever needed, and I built my site with it in mind (good content, easy navigation, etc.).

Now, Google has indicated that quality sites should have a certain X and a certain Y, a fraction of THIS, a ratio of THAT, a trust seal HERE, another item THERE, content THIS BIG, and ads no bigger than THIS, etc. (speaking very generically, for the sake of example)... based on human trust/thought processes that might be successfully emulated by a numerical model for document scoring. We are working to determine what those signals are from Google's ambiguous and often contradictory statements, particularly with respect to ads (one employee says to take a close look at our ads, another says ads aren't a "big part of the algorithms").

I firmly believe that this new method of quality signaling and document scoring should work well down the road, as the machine learning progresses and the statistical database grows. Since I have studied and worked with mathematical models, I am an advocate of this move to machine learning (statistical modeling)... you can improve the accuracy of model output when you analyze output clustering, exclude outliers, and use probability distributions to determine a final, expected result. With Google, this application is in its infancy, and many sites are in the collateral-damage basket.
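
As a toy illustration of that last statistical point (trim the outliers, then take the expected value of what remains), here is a short Python sketch; the scores and the 1.5 * IQR trimming rule are arbitrary choices for the example:

# Toy illustration only: exclude outliers from a set of model outputs, then
# use the mean of the remaining distribution as the expected result.
# The scores and the 1.5 * IQR rule are arbitrary choices.

import statistics

def trimmed_expected_value(scores):
    """Drop values outside 1.5 * IQR of the quartiles, then average the rest."""
    q1, _, q3 = statistics.quantiles(scores, n=4)
    iqr = q3 - q1
    kept = [s for s in scores if q1 - 1.5 * iqr <= s <= q3 + 1.5 * iqr]
    return statistics.mean(kept)

scores = [72, 74, 75, 76, 77, 78, 12]   # 12 is an obvious outlier
print(statistics.mean(scores))          # ~66.3, dragged down by the outlier
print(trimmed_expected_value(scores))   # ~75.3 after excluding it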

[edited by: crobb305 at 4:54 pm (utc) on Apr 19, 2011]
