Without starting a conspiracy rant, what is the technical reason for Ehow going up when they have the same basic model as all the sites which took a spanking?
Ugh, that last quote wasn't so good, was it? :)
|what is the technical reason for Ehow going up |
That's a great research question. Wish I could see the answer right now. Clearly the algo is not looking at business models, but rather pages and/or websites. So what does Ehow publish that looks like a positive signal to a machine?
|what is the technical reason for Ehow going up |
Some might say eHow is a winner because they use AdSense. But many other sites that have been demoted use AdSense so that theory does not stand up.
Could it be their inbound links? The articles are written in a how-to format which may create better context for the inbound links. Just my guess.
The articles are not always written by experts; they are generally written by writers. But because they are how-to articles, they attract links anyway: those doing the linking may not be the best judges of authority, yet they link as a vote for the quality of the content.
Is this a flaw in the new algo if those citing the page aren't necessarily the best judges of quality?
So why did HubPages lose but eHow didn't? HubPages content is not generally how-to in nature. They're about topics. eHow pages are about how to do things, which potentially set up different kinds of backlink contexts. Could the context of the backlinks be playing a part in deciding winners and losers?
Why was hubpages targeted? I tried to get a feel for how community sites and pages link to each site. According to Yahoo, more phpbb sites link to it than to ehow. More wordpress sites link to hubpages than to ehow. More vBulletin sites link to hubpages, too. I ran different searches in Yahoo to find differences between the two sites but HubPages seems to beat out ehow in all the backlink searches, even for backlinks modified by the phrases "Great Article" or "Check this out".
So I circled back to article topics. One interesting difference where eHow did better is in a backlink search for sites linking to hubpages and ehow. Backlinks to ehow with the word "SEO" anywhere on the page were around 90,000. Backlinks to HubPages with the word "SEO" anywhere on the page were over 350,000 links.
[edited by: martinibuster at 9:32 pm (utc) on Mar 2, 2011]
|So why did HubPages lose but eHow didn't? |
Well, lots of well-written and well-backlinked hubpages are still ranking; it's just those which had no backlinks and were relying on domain strength which have fallen back.
There is another factor too - in recent years hubpages was a target for link-seekers, who posted machine spun stuff (to by-pass hubpages duplicate checker) in order to get a link. Now the active hubpages community does seek these pages out and report them, and hubpages closes them down - but the active community is much, much smaller than those who post and run, and they were and are fighting a losing battle.
The machine-spun stuff often doesn't read well; it's very badly spun. If you don't believe me, get a hubpages account and then do some "hub-hopping". If G are using linguistic indicators, then this hurt HP badly.
EHow doesn't have a problem with machine spun stuff, because they don't have people just freely submitting stuff to them in return for links. Squidoo should have had a problem, but because everyone seemed to think they were still slapped from 2007, the spinners left them alone. Ironically the reputation of having received a Google penalty is the best way of keeping your site clean as the fly-by-nighters leave you be.
I know this is going to make me look dumb.... I'm used to it.
eHow is doing some jinky stuff on their ads and related ads links on their site. 99% of the links on any page are jscript-nofollowed... including ALL of their own onsite links. But there is always a small subset of links, VERY keyword focused (2-word--golden-phrases), in the sidebar that are followed to an ad page on their site.
Obviously, I cannot link to a page... and I realize that all the followed links are scrubbed through a tracking mechanism... but "Why" would all but that small set of VERY keyword centric phrases, just 8-10 out of the 120 or more links on a page, be followed to a page with nothing but a single block of 10 or more google adsense links?
Obviously... it's relevant for some reason.
|So why did HubPages lose but eHow didn't? |
Maybe more complex page structure is a factor?
The content on eHow pages has more lists, pictures, different sections with different formatting, and comments. Hubpages are submitted by users, so they do vary, but in general have much less formatting.
Squidoo pages are generally more complex also, and didn't suffer the big hits.
Could be one of the signs of "quality" they look for.
One thing I've noticed is that while content farms are down, made for Adsense mini-sites are up. I guess that will be the next big algo shake up down the road.
It is too bad ehow didn't take a hit. Someone said Suite 101 took a big hit. I actually thought many of those articles were pretty good.
|So what does Ehow publish that looks like a positive signal to a machine? |
Nothing. Plain and simple, it's favoritism through manual placement. No analysis required.
It is interesting that top losers have a high percentage of loss, whereas a top winner has "only" gained just under 1/4 of the top loser. To me this indicates two things:
1) Aggregately, there are many more winners than losers
2) The new "Quality score" factor(s) may be more a "demotion" rather than a "promotion", i.e. it could be a negative factor for a bad quality site/page rather than a positive factor for a good quality site/page
I am thinking along the lines that there could be more than one "Quality score" factor in play, with one of them perhaps being weighed as a negative score, whilst the other(s) are weighed as positives, but with the negative factor having a bigger scoring impact.
Guys the one thing ehow has that all the others lack,
[edited by: scooterdude at 11:23 pm (utc) on Mar 2, 2011]
gizmag and gizmodo are way up too, at least in my niche.
Honestly I think this was a hand-tailored hit-- whatever new algorithm they've added/tweaked was trained on data for specific sites, and I think the characteristic for the primary targets is that their business model was harnessing average, non-expert people to write content for free (or very little).
HubPages CEO was on Techcrunch a few months ago blaring to the world how awesome they were at SEO, suggesting titles based on adwords search data, etc. YOU DO NOT DO THAT.
AssociatedContent, Mahalo, etc. all fall under this.
EHow doesn't get hit because EHow has paid writers. Their pool is actually pretty small. They are technically a newspaper in this approach. AnswerBag, another demand property, DOES get hit.
Squidoo avoided the pain here, although they too should have-- ideas above about them already being hit are interesting.
|Aggregately, there are many more winners than losers |
Yes, that jumped out at me, too. But I'm not sure how much analysis I can extend from that. Here's a try.
Apparently the new algo is essentially a page-by-page analysis above all, but with some site-wide scoring derived from the individual page scores.
So when one site loses a lot of separate rankings, a number of different sites will usually move up to fill in those gaps. Hence many sites won small, and they each show a lower net gain than the number of sites that lost big.
I'm thinking that scenario might indicate that sitewide promotion isn't a part of this algorithm, but sitewide demotion is. In fact, Amit Singhal verified today that some poor content can hurt the whole site. Page-wise, both promotion and demotion scores are in play, however.
Does that make sense?
I'm thinking that spring cleaning may be in order for the weakest pages. Either improve them or drop them.
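To make the idea above concrete, here is a minimal sketch of "page scores plus site-wide demotion only". Everything in it is an assumption for illustration: the threshold, the penalty size, and the function names are hypothetical, and nothing here is known to match what Google actually does.

```python
# Hypothetical values, chosen only to illustrate the idea.
LOW_QUALITY_THRESHOLD = 0.3   # page scores below this count as "weak"
MAX_SITE_PENALTY = 0.5        # worst case: every page loses half its score


def site_demotion(page_scores):
    """Return a multiplier in (0, 1]; 1.0 means no site-wide demotion.

    There is deliberately no way for this factor to exceed 1.0,
    which models "site-wide demotion, but no site-wide promotion".
    """
    if not page_scores:
        return 1.0
    weak = sum(1 for score in page_scores if score < LOW_QUALITY_THRESHOLD)
    weak_share = weak / len(page_scores)
    return 1.0 - MAX_SITE_PENALTY * weak_share


def effective_scores(page_scores):
    """Apply the site-wide factor to each individual page score."""
    factor = site_demotion(page_scores)
    return [round(score * factor, 3) for score in page_scores]


# A site where half the pages are weak drags its good pages down too.
before_cleanup = [0.9, 0.8, 0.2, 0.1]
# "Spring cleaning": the weakest pages are dropped (or improved).
after_cleanup = [0.9, 0.8]

print(effective_scores(before_cleanup))
print(effective_scores(after_cleanup))
```

In this toy model the strong pages recover their full scores once the weak pages are gone, which is the reasoning behind improving or dropping the weakest pages.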
Here is my observation:
(I feel I can give a decent insight as I work with some huge content sites and a few thousand smaller content-rich sites.)
1) Duplicate content (within a percentage uniqueness) is being demoted.
2) Thin content is being demoted.
3) Unique content is being promoted.
4) A different algo is being used to determine the attribution of the original content.
Note: Seems like links have nothing to do with this update.
The current problem that is being fixed:
Determining the source of unique content.
Put simply, you can see how, if they attribute content incorrectly or identify "thin" content incorrectly, the result is what has happened.
You can tell it is this way as eHow is still there due to the huge volume of unique content.
By the way I mean unique as Google not being able to spot similar articles - not it being actually unique.
Article spinners, used properly, will get past this, as the output is so far different from the original (because it is garbage) that it ranks.
Human written content could be a problem in this update as I think Google is matching common synonyms and maybe other factors at a simplistic level giving false positives.
It is as if they have increased the tolerance of their synonym and related content technology to try to recognise duplicates - as well as trying to better find the original.
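As a rough illustration of that kind of synonym-tolerant duplicate check, here is a small sketch using word shingles and Jaccard similarity. This is a standard near-duplicate technique swapped in for the example, not something Google has confirmed using, and the synonym map and function names are made up.

```python
import re

# Hypothetical synonym map; a real system would need a far larger resource.
SYNONYMS = {
    "fast": "quick",
    "rapid": "quick",
    "auto": "car",
    "automobile": "car",
}


def normalize(text):
    """Lowercase, split into words, and collapse synonyms to one form."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return [SYNONYMS.get(word, word) for word in words]


def shingles(words, k=3):
    """Return the set of all runs of k consecutive words."""
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def similarity(a, b, k=3):
    """Jaccard similarity of the two texts' shingle sets (0.0 to 1.0)."""
    sa = shingles(normalize(a), k)
    sb = shingles(normalize(b), k)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


original = "How to fix a fast car engine at home"
reworded = "How to fix a rapid automobile engine at home"
unrelated = "Ten recipes for a summer picnic with friends"

print(similarity(original, reworded))
print(similarity(original, unrelated))
```

With synonyms collapsed, the reworded sentence looks identical to the original. The same tolerance would also flag an honest human writer who happens to cover the same topic in similar words, which is the false-positive risk described above, while badly spun text that scrambles word order would slip under it.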
We all know that there is no algo that can work out the above.
At some point Google is going to have to put their hands up and get some help from human beings to be able to deal with language and content.
On my end - huge spam works better than ever, only good guys are getting hit from this, which means all those "don't break the unwritten rules because in the long term it is not a good business model" lines are a little bit of bad advice.
I have looked at a few threads about strategies and even building sites that link to you - and there are still the people that say "don't manipulate Google it is not a good long term strategy"
This update has made it clear that Google itself is not a good long term strategy.
My real point on this:
You could have been totally "black hat" and spammed the heck out of Google and made millions but you chose the "ethical" route.
You just lost all your traffic in this update. You thought you were doing the right thing as Google somehow "likes" your site.
Nonsense. Should have, could have.
And now, any "white hat" is a target - maybe not in this update, but maybe the next one, or the one after that.
I am clear - I sleep at night because my strategy outlives Google.
Anyone who depends on Google whether "ethical" or "black hat" is on a limited path - the only difference is the "black hat" earns a ton more cash and probably withstands a Google update better because they use automation to create more sites than the other guy.
|only good guys are getting hit from this |
Not true. I visit several blackhat forums and have seen many dozens of posts about getting wiped out.
tedster, yes that is totally a way to combat the problem.
Sites are now being scored in addition to pages - it is as if they take a percentage view of what they deem unique and then score the site/domain on that.
The whole problem is the scoring.
A site like eHow will always score well as they have such a huge amount of unique content (at least unique to an algo).
Any site that relies heavily on re-publishing even if they have a large amount of unique content could still be scored poorly using that algo.
I have seen a big uplift in traffic across all our sites (over 5k) and that is because the percentage of unique content is high - no re-publishing and no RSS etc.
trakkerguy, of course that was a generalisation.
But if you are listening to complaints from people on black hat forums then you are listening to small time guys that probably don't know what they are doing.
Which decent blackhat says "oh my god just lost rankings"?
Only a muppet does that - and that is to be fair what those forums are filled with.
For example I know a lot of black hats like to do "tiered linking" which means you create Web 2.0 sites on Wordpress, Blogger etc and then hit it with XRumer or stuff like that.
Well in this update - that has gone.
That is one of the things this update has tried to fix. Great stuff, however the amount of collateral damage is high - which is interesting as all those spammers get wiped out at the same time as the clean guys.
All the clever spammers just do something different and move on.
All the genuine guys get wiped out and struggle to pay the bills.
I don't like that.
Swanson - I agree it does seem to have hit more quality white hat sites than it did black hats.
And yes, there are a lot of "muppets" on the BH forums. But there are some private forums and chat groups where some sharp operators also got wiped out.
I just thought it was an important data point that some of them got hit also.
|In fact, Amit Singhal verified today that some poor content can hurt the whole site |
Interesting, but of course the big issue here is the definition of what poor content is.
And on the flip side, developing best practices around what great content is.
Are there particular characteristics of quality content pages themselves that accentuate and push better signals for Google? Citations, references, Date Modified?
Profiles attached to content that cite other authorities? Hubscores and Co-citation.
The difference being that black hatters are used to being wiped out every algo update or so, and their strategies incorporate this reality. Not so the good guys who toil for years on a site and invest in the long-term survival of that site.
Digging further into that is the job right now. Knowing that there are both page-specific scores and also site-wide influences from those scores, is a good start at getting a closer glimpse of the mechanics of it all.
My current guess is that there are many factors being used. I envision that they started with a machine learning algorithm. Give it two seed sets as targets, one containing examples of low quality that they don't want to rank, and a second set of excellent content that they consider vital to rank.
Then let the learning-machine range over all the scores that Caffeine maintains, weighting and re-weighting the 200-plus scoring numbers that they've already got on file. As weighting combinations come up that are more and more a statistically significant match to the seed sets, the actual weights of those factors can be noted. Eventually you get a beta version that can be tweaked further.
Might be total fantasy, but I'm currently thinking the original research work was something in that direction. If it was, then we'll have a devil of a time trying to reverse engineer it. But through trial and error testing, we may begin to see something of its outline.
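For anyone who wants to picture that guess, here is a toy version: two seed sets, a handful of signals per page, and a learner that re-weights the signals until the weights separate the sets. It uses plain logistic regression as a stand-in, since the actual method is unknown, and the three signals and all numbers are invented for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, labels, epochs=500, lr=0.5):
    """Fit logistic-regression weights with simple stochastic gradient steps."""
    weights = [0.0] * len(examples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            p = sigmoid(bias + sum(w * v for w, v in zip(weights, x)))
            err = y - p
            bias += lr * err
            weights = [w + lr * err * v for w, v in zip(weights, x)]
    return weights, bias


def predict(weights, bias, x):
    """Estimated probability that a page belongs to the high-quality set."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


# Invented signals per page:
# [duplicate_ratio, word_count / 1000, ad_blocks / 10]
low_quality_seed = [[0.9, 0.2, 0.8], [0.8, 0.3, 0.9], [0.7, 0.1, 0.7]]
high_quality_seed = [[0.1, 0.9, 0.2], [0.2, 1.2, 0.1], [0.0, 0.8, 0.3]]

examples = low_quality_seed + high_quality_seed
labels = [0] * len(low_quality_seed) + [1] * len(high_quality_seed)

weights, bias = train(examples, labels)
print("learned weights:", weights)
print("unseen thin page:", predict(weights, bias, [0.85, 0.2, 0.8]))
print("unseen rich page:", predict(weights, bias, [0.1, 1.0, 0.2]))
```

Even in this toy, the learned weights come out of the seed sets rather than from any rule a person wrote down, which is why reverse engineering a real system built this way would be so hard.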
Then there's also a chance that something totally new is also in the mix - some form of linguistic analysis that was never included in the algorithm before. A newbie scoring factor wouldn't be allowed to carry the whole thing, but it could be in there, too.
After all, measuring Quality is a new game, where measuring Relevance is relatively mature. A new factor or two might be needed for the new game. And we've already had public comments that Quality and Relevance measures tend to conflict with each other.
After a relatively brief (2-hr) look at the sites noted... I feel that apart from the quality and uniqueness of the content, the top winners are clearly much better structured and conceived overall than the losers are. It absolutely jumps out at me.
< moved from another location >
Here is a great article from Search Engine Land regarding the winners and losers from Google's content farm algo update:
More Farmer Update Winners, Losers: Wikihow, Blippr & Yahoo Answers [searchengineland.com]
Search Metrics published their Organic Performance Index (OPI) report recently and they have a list of winners and losers based on the update. Question: why are low quality content sites like Yahoo Answers and WikiHow gaining because of the update? Aren't these sites considered content farms?
[edited by: tedster at 5:47 pm (utc) on Mar 3, 2011]
Thanks for the link. This is what happens with a computer-based algorithm. It just can't match a human idea perfectly. Heck, many humans don't even have the same idea of what a content farm is. That's probably why Google was careful not to use the words "content farm" when they announced the update.
If yahoo answers is one low quality UGC, youtube is another one. All it can rank is for videos and definitely not for its content. Youtube does seem to have gained heavily by this update. The only thing in common between these two is "lots of related content".
If we consider this to be the farmers update, then I don't see a reason why these real farms, with huge internal links should gain while ezinearticles and others fall.
People talk about some quality factors being introduced in this update, but what is QUALITY?
When youtube and yahoo answers can rank better, I really wonder when people talk about grammar, incomplete sentences etc. being a quality factor used in this algo.
Can someone pls. throw more light on this.
There are three sets of serps I am seeing now. The farmers algo (is it?) runs the US SERPS and there is another one that runs elsewhere (country specific TLDs). There is this third one that I see on google.com and this still retains the SERPS before the update.
Does anyone else see this?
Here is one interesting test I did in the past few days. I tweaked the title for an affected page and then reversed it. The cache is showing the reversed title, while the SERPS still show the tweaked title. It has been like this for two days now. What is happening here? Can anyone guess?
ps: Someone mentioned that gizmodo gained in this update and I noticed it too.
I was just going through the list and another noticeable common thing about the top winners is they are mostly "How to" sites: wikihow, ehow, instructables, howstuffworks, etc.
Does a larger percentage of "How to" titles help in hoodwinking this algo? Or is the algo having a soft corner for sites with lots of pages that answer to "How to" questions.
I would say that yahoo answers too has a higher percentage of "How to" titled pages.
|WikiHow is ranking No.1 on Google with articles about “how to eat a banana,” “how to eat a sandwich,” and “how to make toast.” |
Good to know.
Now THAT is quality.
Add Craigslist to the list of winners with low quality UGC like youtube and yahoo answers. However, I think there is more to looking at sentence structure and perhaps "usefulness" factor gets into play or the "social" factor.