| This 245 message thread spans 9 pages: < < 245 ( 1 2 3 4  6 7 8 9 ) > > || |
|Google Updates and SERP Changes - February 2011 part 2|
< continued from [webmasterworld.com...] >
But, I believe that Google now prefers sites with a much higher percentage of "valuable" pages. If you don't meet the percentage value determination, whatever that is, you get whacked.
I tend to agree with Fred. This is what I am finding as well.
Don, you're talking about rankings, I assume - not PageRank. My first thought is that your site might be caught in a domino effect because pages that link to your site were affected. For some reason, it seems like outbound links from the top targeted pages now have less value than they used to.
I still don't have 100% definitive data to establish this domino effect, but a few people I know have also noticed that it seems to be happening.
It almost seems like Google has expanded the number of indices that it uses to generate results by an order of magnitude or two. Remember when there was a Fresh and a Non-Fresh index? It's like that again, only the indices are now Similar Content Index #0000000001 to Similar Content Index #10000000000, or whatever -- millions of much smaller indices, each of which contains pages that are similar in some material respect.
In essence, my guess is they've spent a ton of time working to identify very similar pages such that they can avoid presenting multiple pages in the top results that essentially cover the exact same ground. The top results can be more diversified in this respect, essentially leading to better results and crushing sites that copy content or don't add much new to a given topic. That's been a stated aim for some time.
If this were the case, many sites with good content that was not materially different from other pages on the same site or on peer sites would drop in the rankings -- this would account for the "my site is very good but I still got whacked by the new algo" comments in this thread. It would also account for some of your pages getting whacked if you have a good site, but others not getting whacked.
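If Google really is collapsing near-duplicate results, the generic technique would be a greedy diversification pass over the ranked list. This is purely a sketch of that idea -- the `diversify` and `word_overlap` functions and the 0.7 threshold are my own toy invention, not anything Google has disclosed:

```python
def diversify(results, sim, max_sim=0.7):
    # Greedy re-rank: keep a result only if it isn't too similar to
    # anything already selected. This is the generic idea of result
    # diversification, not Google's actual mechanism.
    picked = []
    for doc in results:
        if all(sim(doc, p) < max_sim for p in picked):
            picked.append(doc)
    return picked

def word_overlap(a, b):
    # Toy similarity: fraction of shared distinct words (Jaccard on words).
    wa, wb = set(a.split()), set(b.split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

ranked = [
    "how to pour milk safely",
    "how to pour milk safely at home",   # near-duplicate of the first result
    "history of dairy farming",
]
print(diversify(ranked, word_overlap))
# → ['how to pour milk safely', 'history of dairy farming']
```

Under a scheme like this, the second page isn't "bad" -- it's just too similar to something already shown, which would look exactly like getting whacked for no reason.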
This is admittedly just a wild guess on what's happened, but I base it on a couple of things.
One, I spent some time getting into the heads of a few senior Google engineers, reading their posts on a bunch of sites, looking at books they were recommending, etc. There's a lot to suggest that they've been spending a ton of time on similarity algorithms and content clustering, as well as inference and learning algorithms.
Two, I see that my "site:www.domain.com" page count has shrunk drastically post-algo-change (by almost 75%), but if I do "site:www.domain.com widget" I can see many pages in the results that I believe are no longer in the site:www.domain.com page count. In fact, the count for "site:www.domain.com widget" is now greater than the "site:www.domain.com" count.
In this case, the widget pages are similar in many respects (still each one uniquely valuable to the world, mind you!), so it almost seems like they've been relegated to their own index or some other second-class index that spans other sites. If you've got pages that are very similar, regardless of whether you think they are valuable, this algo change would knock you down quite a bit -- because it's a different definition of similarity from the ones we've seen in the past. It's looking for signs of similarity using sophisticated statistical algorithms, rather than direct verbatim plagiarism.
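To make "statistical similarity rather than verbatim plagiarism" concrete: the classic approach is shingling plus Jaccard similarity, which scores two pages as near-duplicates even when no sentence matches word for word. A minimal sketch with my own toy pages (not Google's actual algorithm):

```python
def shingles(text, k=3):
    # Overlapping k-word "shingles" -- the standard unit for
    # near-duplicate detection.
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    # Jaccard similarity: shared shingles / total distinct shingles.
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

page1 = "our blue widget is the best blue widget for home use"
page2 = "our red widget is the best red widget for home use"
print(jaccard(page1, page2))  # → ~0.29, despite no shared full sentence
```

Note that a plain substring check would score these two pages as completely different products; a shingle-based score sees the shared template underneath.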
I'm pretty sure the new algo has a heavy weighting for identifying similar content, which the algo takes as the antithesis of original, unique content. That gets the scrapers but it also can get quite a few well-intentioned innocents.
The expansion of the index count is something I'm less sure of. If this were the case, the results would now be generated from some hierarchy of indices, and if, for a given search, you didn't make it into one of the top indices relevant to that search, you wouldn't show up in the top results. You might have great content on a certain topic, but if some other site has it covered better than you -- or has equivalent content -- you could be low in the results.
I think the takeaway, if any of this is true, is to vary up your content fingerprints so each piece of content is as unique as possible relative to other content on your site and relative to competitor sites. If it looks formulaic or if it's not saying something new, this algo isn't going to like it.
Admittedly, all conjecture above... but what else have we got to go on but hunches? I definitely don't think this is as simple as devaluing internal links or any such thing... the Google engineers are trying to reinvent the game in a way that you really have to have differentiated and useful content to do well. So, er, bravo, I guess.
I don't think it's external - unless external links are saving me from oblivion. Where I have a choice, I have actively pushed phrase two and not phrase one - it's slightly less competitive.
The client ecommerce site that I relaunched in September doesn't seem to have been affected in the slightest. I've been poring over Analytics, rankings, Raven Tools reports, and nothing seems to have changed (of course, it's early yet).
We're working on fleshing out product and category text, and writing like human beings, but I'd say right now it's probably 35/65 as a percentage of good content to thin content. No MFR content, the thin stuff is mostly just bulleted features which we put in ourselves.
We do NOT have much of a link profile on this B2B site. I've had control over this site for 14 yrs, and I'm not much of a linkbuilder; about all we have is a couple dozen paid directory inclusions on "Niche Industry Vendor/Supplier" type sites, which we bought for exposure and traffic, and most of which I think are nofollowed. Apart from that, most of our inbound links are actually on people's intranets, so not visible to search engines. We do get scraped a fair amount, but so far as I can tell, I've never found a scraper showing up above us.
The site was built on a Magento platform, and I spent six weeks writing custom titles and meta description tags for over 3000 pages. So *those* are unique.
Search (organic & ppc, Google, Yahoo & Bing) makes up about 53% of our web traffic; this company has always been a catalog company, and we're doing a healthy amount of email marketing now too, and dipping our toes in social.
I dunno if any of that helps, but add it to the pile of data.
|...the new algo has a heavy weighting for identifying similar content, which the algo takes as the antithesis of original, unique content |
Interesting hypothesis. That would be consistent with the drop in rankings of a site like ezinearticles.
Is anyone familiar enough with some of the other major sites where data is publicly available to judge how well this hypothesis holds up when compared to those sites?
For instance, there has been discussion in the thread about hubpages compared to squidoo, and ehow compared to mahalo.
Not that I think this one factor alone explains the algo changes. I'm with Tedster in thinking there are probably many different things going into the recipe. The basic change is a focus on quality rather than exclusively focusing on relevance.
To me, the most interesting questions are what does Google mean by "low quality," what data (recipe ingredients) are they using to detect it, and to what extent is "low quality" being measured or evaluated on a site-wide basis, rather than looking at each document in isolation?
Econman, I'm not sure we'll ever know all the quality factors that come into play, but, off the top of my head, I'd be asking questions like these to assess quality:
What percentage of your site's pages have content that is identical to content on other sites for which that same content predates your content?
What percentage of your site's pages have content that is very similar to content on other sites for which that same content predates your content?
What is the average ratio of original content to non-original content for pages on your site?
What percentage of your site's pages appear to be dynamically created from a database with similar phrasing, or slight variations thereof, on every page?
What is the ad space real estate to content real estate ratio on your pages? Is it too high?
Is your site overoptimized, suggesting that you may be more about SEO than good content?
Do you have reader comments on your pages, suggesting that they find some value in your content? What's the sentiment of that content?
What percentage of pages on your site have some proof points of engaged and interested users who appreciate your content?
Are your related links to other internal pages hand-crafted and relevant or do they appear to be computer-generated to randomly spread link love to other parts of the site?
Do you have footer links that might suggest you are selling links? Are there footer links on other sites that might suggest you are buying links?
Does your site have quality signs that suggest that it is maintained well and updated often with fresh relevant content or is it static?
Is there evidence of stub content – questions asked with no answers given?
Are visitors returning to your site?
How long are visitors staying on your site?
Do they leave your site and do the same search again?
Probably lots of others...
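For what it's worth, a few of the questions above -- especially the "dynamically created from a database with similar phrasing" one -- are measurable with a short script. Here's a rough heuristic of my own (nothing Google has confirmed) that estimates how templated a set of pages looks by counting phrases repeated across most of them:

```python
from collections import Counter

def phrase_set(text, k=5):
    # Distinct overlapping k-word phrases from a page's text.
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def template_ratio(pages, k=5, threshold=0.8):
    # Average fraction of each page's phrases that also appear on
    # >= threshold of all pages -- a crude proxy for boilerplate.
    per_page = [phrase_set(p, k) for p in pages]
    seen = Counter()
    for phrases in per_page:
        seen.update(phrases)
    cutoff = threshold * len(pages)
    ratios = []
    for phrases in per_page:
        if not phrases:
            continue
        common = sum(1 for ph in phrases if seen[ph] >= cutoff)
        ratios.append(common / len(phrases))
    return sum(ratios) / len(ratios) if ratios else 0.0

pages = [
    "buy the acme widget 100 today with free shipping on all orders over fifty dollars",
    "buy the acme widget 200 today with free shipping on all orders over fifty dollars",
    "buy the acme widget 300 today with free shipping on all orders over fifty dollars",
]
print(template_ratio(pages))  # → ~0.55: over half of each page is shared boilerplate
```

Hand-written pages would score far lower on a metric like this; a database-generated catalog where only the model number changes scores high, which is exactly the pattern several people in this thread are describing.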
On sites that have noticed a drop in Google traffic, what is Googlebot activity like -- is it the same as it was before? Mine is unchanged; basically Googlebot caches our homepage at least 20 times a day. I rarely see a cache date older than 1 hour, most of the time less than 30 minutes.
Googlebot activity should remain constant because this is an algorithmic change; if Googlebot activity didn't remain constant, Google would have no way of knowing when a site had fixed its problems.
|If this were the case, many sites with good content that was not materially different from other pages on the same site or on peer sites would drop in the rankings -- this would account for "my site is very good but I still got whacked by the new algo" comments in this thread. It would also account for some of your pages getting whacked if you have a good site, but others not getting whacked. |
Interesting theory, as are many theories offered over the past few days.
Many theories, like Whoa's, assume that Google is dropping the rankings for a site that just isn't as good as the others that were ranking nearby, and that the drop in rankings was more or less benign rather than malevolent.
If that were the case, why would pages that were ranked at #5 to #10 drop all the way to #40 to #70 or, in some cases, disappear from the index entirely? (I have a couple that have gone missing). There's simply too much garbage between the #7 spot and the #70 spot for Google to be saying, "after careful consideration, this is where we think your page belongs, and these other forty or fifty sites are ones that we deem to be better."
If there isn't a return to some bit of normalcy after a few weeks, then these kinds of drastic drops would appear to be more like penalties than mere adjustments. A site that, like so many here, has been on the first page for years or even over a decade can't suddenly be a "bad" site deserving of punishment after having been considered "good" for so long. There have been too many updates for such a site not to have been affected before.
I don't think we're going to know for sure if our sites are really "good" or "bad" for at least a couple of weeks. It's been that way in the past.
Now for more weirdness. I happened to look at Analytics and my internal stats, and it looks like I had an almost-recovery an hour ago that lasted for about an hour - although I can't really be sure, as the SERPs don't verify this and I was actively checking them at the time.
I noticed I had a new #1 keyword phrase,
The phrase was my #18 phrase just before the algo update and used to rank at position in the SERPs - the phrase contains a product name and type and is in the plural. I used to receive about 10 to 15 referrals from that phrase.
It dropped out of the 1st 2 pages on Friday but on Saturday returned to position 4.
I went to recheck it because Analytics said I had 20+ referrals from it today, and I was wondering if I'd gone up in the SERPs.
But I'm in position 18. In the #4 position is a "how to use said product" article, which is kind of weird as the search phrase is just the product. Strangely enough, position #10 is held by Squidoo - but that one is actually relevant, as someone has written about their home-made products.
For the most part, when I've run into eHow I usually get some relevant info - but they have managed to take the "How To Pour Milk" article that tedster posted a link to somewhere and turned it into a "How to pour milk from the dairy named #*$!#*$!x" article.
Everything I am seeing is just so confusing.
While that may be true, I would submit that High Quality Pages is a somewhat elusive description, seeing as all my pages are unique and very high quality as far as what they offer. All my sites were built from scratch and aren't one of the cookie-cutter site-machine types. So I am still struggling with the concept, and don't know clearly what exactly to do to regain my ground at this point. Thanks for your reply to my post.
Googlebot traffic is not the same for me. Up until this month, Gbot was coming from 4-5 IPs. Starting in February it started coming from at least 30 IPs in their range. Since Feb 24, around a third of those IPs have stopped showing any kind of activity.
Quality pages? Meh.
OK I rarely post here but am a long-time reader. Here's one experience I have with the new algorithm and something which makes me think this can't be *it* and there's more to come.
Some time early in Dec. I posted a couple of articles on one of the sites people are referring to as content farms, just as an experiment. They ranked 'ok' for the search terms I targeted but were basically hovering around the bottom of page 1 or page 2 for most of them.
So, as an experiment I pointed some backlinks to some of those articles from a few of the usual suspects for gaining quick ugly links (certain article directories, dofollow social bookmarking sites etc.)...junk, basically.
The pages with inbound links dropped as low as page 3; none of them went up. Fine, I wasn't really expecting them to rise, but it was certainly interesting (and hopefully coincidental) that they dropped.
This algorithm update made all those articles disappear for virtually all of the keywords I was watching.
I'm not drawing conclusions that it was those links which killed them off because a lot of the content on that site dropped also but here is something interesting...
...the junk pages I created on those bookmarking sites, article directories, etc. rank from page 1 to page 3 out of, on average, 2 million results for their keywords. That's right: the useful pages that the garbage links TO are virtually non-existent, yet the garbage itself (which in the majority of cases is a page with just a line or two of text I wrote and a pile of ads) ranks anywhere from the first page to page 3.
Those types of pages are not useful to a searcher and there's no way they could possibly be considered to be high quality in any way, shape or form, yet there they are and gone is the actually useful article they link to.
Bizarre. I really have no clue what G. was really trying to achieve with this update, but quality results don't seem to be it.
On a side note: expired Craigslist ads are worth ranking too? Sheesh.
A couple of people have mentioned PageRank. On that note, I thought maybe I'd throw these thoughts out there. About a week ago, Google updated TBPR. And yes, we all know that TBPR is merely a snapshot of some time in the past, and is no real reflection of the current state of internal PR. Let's just stipulate that and move on.
Anyway, in that TBPR update, one of my sites got whacked from PR3 to PR0. This is the absolute cleanest, couldn't-possibly-be-penalized site I've ever created. Nothing but useful content. Users love it. Users share it. Users link to it. Nothing even remotely shady or indicative of any paid links (there are none, of course). I of course am wondering why it got hit with a PR0, but at the same time, I'm starting to wonder if perhaps there is some loose correlation to the Farmer update.
The site that got PR0'd did not get hurt by Farmer. It still has extremely good rankings and traffic. Another site I own did get hurt by Farmer. Now, the two sites are completely unrelated and don't link to each other, so there's no direct relationship between one losing PR and one losing rankings. I get that. But even though those two sites aren't directly affecting each other, perhaps -- just perhaps -- something that happened to sites during the PR update had some sort of effect on other sites in the Farmer update. I doubt it... but maybe. Again, not likely, but I don't have anything against throwing all ideas at the wall to see if any of them help us learn. So there ya go. Another thought thrown at the wall.
On the surface it seems to be an improvement, but once you start clicking through, that illusion seems to disappear.
PS: if you find an article on how to pour milk from a Guernsey cow, please let me know, as I'm only used to Jersey milk.
Interestingly, we made some serious speed improvements to a site of ours last month (about 100% speed improvement). I had expected to see the speed at which Googlebot was crawling our site improve but it actually got drastically worse.
Then, lo and behold, as soon as the new algo update rolls out, Googlebot is blazing through our site at pre-January speeds.
Looks like they were using some pretty serious resources to roll out this new algo change, which makes me wonder about how easy it'll be to tweak/correct in the coming weeks.
|I'm pretty sure the new algo has a heavy weighting for identifying similar content, which the algo takes as the antithesis of original, unique content. That gets the scrapers but it also can get quite a few well-intentioned innocents. |
I have no better ideas than anyone else on here (if that) -- but I want to point out that the post by Whoa is in many ways (not all) a description of my site, which dropped the typical 40-50% of traffic on Thursday.
My content is all original, it's well-researched, but, since 85% of the pages on my site are database driven "entries" almost like a dictionary, it's quite possible that the overall look, to a machine, is one of a bunch of "similar" pages. I don't think that's the whole story, but I think Whoa's comments ring very true to me. They point in the right direction, I think. (Or at least one of the right directions!).
Yep - some of the major targets Google hoped to clean out were cleaned out. The collateral damage may well be much larger than they thought would happen. If so, this is not going to be the last version of their algorithmic look at content quality. And if, as it seems, this new algo component was in the works for more than a year - then it would be folded, sliced, diced, and remixed in all kinds of ways well into the future.
Donna - I'm thinking that TBPR is just out of whack. One of my old, old sites - like, the domain has been around since way before Google existed - got sent down to TBPR0 about nine months ago. It never lost traffic or rankings (which weren't much). I figured it was maybe because I had a Google rant on it. Anyway, the PR came back and then some in this last update. I think they're just messing with us.
I notice if you mouse over a PR tool in the Google Toolbar, it still tells you that PageRank is Google's view of the importance of the page.
netmeg, obviously, I'm not terribly concerned about the PR0 issue since rankings and traffic are excellent. However, it does kind of sting to have people think that Google has a low opinion of my site if they believe that "PageRank is Google's view of the importance of the page". So, yeah, it's more of a personal affront than anything. And like I said, I doubt it has anything to do with Farmer, but if one thing is out of whack, perhaps that one thing is affecting other things in a way they hadn't planned for. I dunno... maybe... probably not.
Another observation: a major company launched a revolutionary product in its field a few weeks back. The company has domain.com and the product is called the domain dd. Some of you will probably figure out what it is I'm talking about.
If you search for
the company's product info page is on the second page of the serps.
Imagine the level of seo you need to do, to be able to outrank domain-name plus 2 other letters. It can't be all white hat when domain.com is PR 6+
Yet, google's latest algo seems to like those sites...
In general, do you think that directories have been whacked in this update? I see sites like business.com have been whacked, but I'm wondering about other directories such as industry-specific ones like www.thomasnet.com, or even real estate sites such as www.trulia.com, which really just display MLS real estate data in another form. Has anyone seen any data on this beyond the Top 25 list we have all seen?
|My content is all original, it's well-researched, but, since 85% of the pages on my site are database driven "entries" almost like a dictionary, it's quite possible that the overall look, to a machine, is one of a bunch of "similar" pages. I don't think that's the whole story, but I think Whoa's comments ring very true to me. They point in the right direction, I think. (Or at least one of the right directions!). |
That may be part of the story, but most 'content farms' had decent-sized articles and definitely not stubs, nor can you say that their sentences aren't complete. Stupid and useless content can be very good grammatically.
We have also had this debate for ages: at what point does Google's duplicate penalty kick in? And we've talked about too many tags (which bring up the same short stubs), large footers, etc.
Collateral damage is expected with such a "big" algo change. Plus weeks of tweaking, filters added, etc. Perhaps they are studying what the results are right now and will make adjustments based on what they are seeing.
The biggest red flag for my site has been this. So much scraped content that I had never ever given permission to copy. So I've sent Google my slew of spam complaints and DMCA requests.
I assumed though, that Google's dup filter already took care of this, but I am guessing that's not the case. Something is weighing down my site and I'd love to know what it is!
On another note, I would love to know enough about what they are expecting as a "high quality" site so that I can get as far away as I can from being collateral damage in the future!
I scratch my head because while I've made my mistakes in the past with minor ventures into SEO, I've since wiped out those errors. Others, I can't help, when I did experiments with link backs and so forth. I always thought such links would just be simply devalued, not cause a penalty. But I guess I am wrong. The question is, what can I do to ensure a squeaky clean image to Google's eyes?
I see many many many colleagues breaking the Google law with paid links/buying links (I no longer do this ever) and intentionally buying links in the spammiest sites ever, still they stand. Ironically, I have been cited in online sites like CNN, MSN, Yahoo, and those links don't count. Must be that the exterior link profile is not counted here. The focus has been content.
It just seems horribly unfair to those of us who believe we've done the best we can to adhere to truly white hat practices. I did not spend 5 years of my life building my one site from scratch to be broken down this way.... a drop so drastic, if permanent is a massive blow to take. I could have taken a gradual loss of traffic, but a sudden one -- much like the tornado that hits you while sparing others...
Hopefully it will pass and we all learn once more what it is that G wants. Again, am truly not sure at this point what it could be that I've done to trigger this penalty. Except that my site may appear like a "content farm" like a mini EZA site that allowed anyone to lift articles for syndication. Not the case here, I am a content thief victim and I'm being punished.
More DMCAs going out from here!
Lazycat, I can attest to your experience also. I've got content that was copied by a well known commerce scraper site and they sent a link to our site... our site is now gone and they are ranking on the first page.
What's interesting to me about this update is that affiliate websites with very low level content seem to have been untouched in the same genres as larger websites.
This tells me that the update was focused on websites which do not target a single niche, regardless of how "granular" the taxonomy in the website is.
falsepositive, a normal level of reciprocal links will not cause a penalty - unless maybe you have thousands and thousands, and built out your links section into a directory with lots of pages.
tedster - would it not be a different strategy for G to whack sites by also cutting out the AdSense / AdWords traffic in addition to throttling the search traffic?
In my mind, Demand Media and the other scrapers would dry up in a hurry. Lots of traffic but no AdSense / AdWords access would bring stability and ethics to the fore.
Google's goal is to deliver a smarter, better search appliance.
To achieve this, they employ, and continue to hire, the brightest individuals available.
Their management team is comprised of a very talented group of people.
Even though they have a brain trust of sorts, they are missing some simple concepts.
A smarter search engine would first ask for the user's intent behind the search, and then ask for the search query. Better yet, provide a graphical selection process for the user to navigate to their final destination.
Even with Google's selection of Image, Videos, News, Shopping, etc., they are mostly one-dimensional in their approach to delivering search data to the common user.
Google continues to modify its relatively flat algo with the goal of increasing user satisfaction, rather than better addressing users' intentions.
Also, when you really think about it, doesn't the "search box" seem rather passe?
|On sites that have noticed a drop in Google traffic, what is Googlebot activity like -- is it the same as it was before? Mine is unchanged; basically Googlebot caches our homepage at least 20 times a day. I rarely see a cache date older than 1 hour, most of the time less than 30 minutes. |
Today Gbot got to about 50% of my pages and is still going. It was slower (25% of pages) for the previous two days, and I think that getting to 50% of pages is faster than usual.
I never noticed more than a new cache a day, but I get a fresh tag at least once daily.
It's common for the Googlebot to crawl sites at an accelerated speed after an update. This has been happening to my site today... part of the day the crawl rate has been over 5x the normal rate.
I don't think this is outside of the norm. Actually, referrals from Google have been more consistent hour over hour than they have been in weeks.
On a different note, Analytics reports that traffic was down for my site on Saturday because of a reduction in visitors from London. I hope this means the change has now rolled out outside the U.S.; so far I haven't seen much of a difference.
I'd love to hear from others regarding what their Analytics Intelligence reports and if they have seen a change in the number of referrals from outside the U.S.