|Matt Cutts and Amit Singhal Share Insider Detail on Panda Update|
Senior member g1smd pointed out this link in another thread - and it's a juicy one. The Panda That Hates Farms [wired.com]
Wired Magazine interviewed both Matt Cutts and Amit Singhal and in the process got some helpful insight into the Farm Update. I note that some of the speculation we've had at WebmasterWorld is confirmed:
Outside quality raters were involved at the beginning
|...we used our standard evaluation system that we've developed, where we basically sent out documents to outside testers. Then we asked the raters questions like: "Would you be comfortable giving this site your credit card? Would you be comfortable giving medicine prescribed by this site to your kids?" |
Excessive ads were part of the early definition
|There was an engineer who came up with a rigorous set of questions, everything from: "Do you consider this site to be authoritative? Would it be okay if this was in a magazine? Does this site have excessive ads?" |
The update is algorithmic, not manual
|...we actually came up with a classifier to say, okay, IRS or Wikipedia or New York Times is over on this side, and the low-quality sites are over on this side. And you can really see mathematical reasons. |
|Sounds like they are looking at the word count to ad ratio. |
It could be screen usage, too. Google has been using visual page simulation for a while - their "reasonable surfer" model leans on it, and it has modified the way PageRank is calculated.
Last year someone had a penalty reversed because an iframe generated a false positive for their "too much white space" metric. It was documented on Google's own forum, and JohnMu got involved to place a flag on the site, in case it ever triggered that penalty again.
So it could well be more than just word count.
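Nobody outside Google knows what inputs the classifier actually uses, but the "word count to ad ratio" idea above is easy to sketch. The ad markers below are purely hypothetical stand-ins for illustration; real ad detection (and Google's real signals) would be far more involved:

```python
import re

# Toy sketch of a "word count to ad ratio" feature.
# AD_MARKERS is an invented list of strings assumed to indicate ad
# blocks; it is NOT how Google identifies ads.
AD_MARKERS = ("adsbygoogle", "doubleclick", "ad-slot")

def word_to_ad_ratio(html: str) -> float:
    """Rough words-per-ad-block ratio for a page's HTML."""
    ad_blocks = sum(html.count(m) for m in AD_MARKERS)
    text = re.sub(r"<[^>]+>", " ", html)          # crude tag strip
    words = len(re.findall(r"[A-Za-z']+", text))  # count word tokens
    return words / max(1, ad_blocks)              # avoid divide-by-zero
```

A page with lots of article text and few ad blocks scores high; a thin page stuffed with ad containers scores low. That one number could then feed a larger classifier alongside layout signals like the white-space metric mentioned above.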
I'm not so sure this is about the quality of writing; it seems more about how a website is perceived by the masses (although it looks like they got a lot of that wrong).
|In My Opinion: If you're looking for 'the one thing' that's the cause of the drop, it's the web page ... Don't worry too much about the source code, look at the page in the browser window ... The answer is right in front of you... |
I wish I could see it.
As an example, one of the biggest hits I took was for a low-competition keyword. My page was topped by three entries from two keyword-domain-name websites that don't answer the query with any authority or completeness (in other words, they're rewritten, vague, wordy meanderings around the keyword).
Since mine was based on personal experience and also had 100 commenters offering theirs, mine does answer the query, as does the other result (call it site 3) above mine.
Admittedly, the two keyword-domain websites had floated to the top BEFORE the update and now came in stronger, showing for more keywords, and with extra results. They have a cleaner aesthetic than site 3 or mine, but far less information and absolutely no authority - in fact, the top site removed content after I'd filed a DMCA against them for plagiarism. That same website changed all its content - rewrote it all - and it only helped them move up more. They are the "low quality but not spun" kind of content that I think Google was trying to target. My page has flaws; it could be made more useful; eventually other pages will be more useful than it, but right now, it and the result above it are the best out there on the query, at least that I've ever found.
I look at all the pages in question and have a few clues, but it's nowhere near obvious. The two domains and site 3 that overtook me have a more classy aesthetic, fewer ads, more white space. That's it. That doesn't seem enough by any means. Those would be awful signals of quality ON THEIR OWN for the Wild West Internet we're in right now - like expecting the general store to have a fancy facade. Not practical if they're busy getting the job done and don't have extra funds.
The source code IS revealing some patterns, but I need to think about it some more.
|and also had 100 commenters offering theirs |
They have a cleaner aesthetic than site 3 or mine
My page has flaws; it could be made more useful;
The two domains and site 3 that overtook me have a more classy aesthetic, fewer ads, more white space.
You saw it and said it yourself, imo.
buckworks - I saw you mention pubcon - is it really $599 to attend?
Yes, that's the price if you register now. It will go up if you don't register until you get there.
Full info here: [pubcon.com...]
I haven't missed a Pubcon for several years.
You really think that it was simply the cleaner aesthetic? That's it? That's the secret, enough to top my 100 comments, brand poll, links to authoritative sites, and personal anecdotes? That's not consistent with a couple of the sites I looked at on the Google report-your-unfairly-demoted-site-here page. Many looked horrendous, but some looked actually quite nice and were not over-complicated.
As for flaws, I guess I wasn't clear. Naturally, any page can be made more useful - I have yet to see a page anywhere that can't. But right now it's probably the most useful single page on the topic. Site 3 was older, though, and offers value, so I bow to their seniority. But another site for that query that IS an authority .org site, predates mine, and touches briefly on the topic, and that used to be on page one, has moved to, like, page three, which seems crazy to me.
So what I'm saying is that my page isn't perfect, but Google isn't looking for perfect; it's looking for the best results for the query. And if site cleanliness ALONE outweighs content to that degree, then that would be more naive than I think Google is.
I think they're looking at the document footprint and what you describe sounds 'spammy' even if it's not ... 100 comments could easily change the type and level of writing on the page, or create a 'fuzzy subject' to an algorithm ... A brand poll could algorithmically look like an ad block ... Don't look at the other sites and try to figure it out, because it's going to be page and then site specific ... Look at your page and its footprint as an algorithm could see it, and then look at your site as a whole...
You know what's on it and what it's about, but an algorithm has to use a pattern match, so you have to find a way to make your page not fit the pattern of 'lower algorithmically perceived quality' that it fits now, and yes, I think design is part of it. (Layout more specifically) They've been rendering a 'machine view' of pages for quite a while and I would be more surprised if they didn't use that in this system than if they did.
[edited by: TheMadScientist at 6:06 am (utc) on Mar 4, 2011]
|It doesn't matter if you like Yelp or not, that's not the point. |
Yup. The point is that Yelp could block Googlebot if they wanted to. In fact, they could make their reviews available only to people with a subscription (a la Consumer Reports) if they wanted to.
(Out of curiosity, would anybody pay anything to read a yelp review? I thought not...)
The thing is, as far as I know, Google isn't doing anything illegal. If Yelp doesn't like it, then the brain trust at Yelp should build a better search engine than Google.
According to Matt Cutts:
|I feel pretty confident about the algorithm on Suite 101. |
I guess the secret to rank well after the Farmer update is to just do the opposite of what Suite 101 does...
Their word count is probably ignoring intra-site link text, alt text, footer text, comment text, etc., and trying to narrow it down to just the article vs the ads.
And, time spent on page.
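The speculation above — that the word count ignores navigation, footer, and comment text and tries to isolate the article itself — can be sketched roughly. The region tags treated as boilerplate here are assumptions for the example; real boilerplate extraction is a much harder problem:

```python
import re

# Sketch of "article-only" word counting: strip regions assumed to be
# boilerplate (nav, footer, aside) before counting words. These tag
# names are illustrative assumptions, not Google's actual method.
BOILERPLATE = re.compile(r"<(nav|footer|aside)[^>]*>.*?</\1>", re.S | re.I)
TAGS = re.compile(r"<[^>]+>")

def article_word_count(html: str) -> int:
    core = BOILERPLATE.sub(" ", html)  # drop assumed boilerplate regions
    text = TAGS.sub(" ", core)         # strip remaining markup
    return len(re.findall(r"[A-Za-z']+", text))
```

On a page where most of the markup is navigation and footer links, this count comes out far lower than a naive whole-page word count, which is presumably the point of narrowing it down to article vs. ads.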
As Incredibill said, as many ads as they have on their pages, they're hypocrites.
I do think they use aesthetics, and that they should - it makes a difference to user experience, particularly to the ability to navigate - but I can't see it being everything. And I don't WANT it to be the only factor, darn it all, because I don't have control of that. ;)
I'm pretty sure it's not the comments - they were moderated by me, and I don't allow spam links, and they're all spot on topic and decently written. The brand poll might be a problem, but it has thousands of responses and users seem to like it. Users spend an average of over 4 minutes on the page and over 40% return to visit again. The main reason I know the page is useful isn't my own ego, bloated as that often is - it's that I've received a number of "I'm so thrilled I found this site!" comments. (They think of it as a website, when it's actually a web page on a bigger site.) Whereas the top-ranking, two-entry site has a link to an ostensible discussion forum, where there's a header that says something like "35 messages" or whatever, but there are only two.
But again, those sites were already going strong before the update and only boosted slightly - the big hit I took was site 3 overtaking mine, and site 3 WAS a more simply designed, though not attractive, site. (It had the appearance of a basic, old-fashioned HTML website.)
NONE of the general-interest content sites are clean looking, including, bewilderingly, eHow, an algo "winner" that is as crowded as a mess hall at noon, with ads peppered all over, top to bottom.
Would be very interesting if design quality were really at the root of this...
Lapizuli - An algorithm has no idea what kind of comments you've had. You're going to have to change the definition of what quality is to make an improvement, imo.
How are the other pages on the site?
Remember they can impact it too now...
It's easy to defend a site we put time and effort into, especially one other people like, and you can totally defend it to me or people here, but the algo can't hear you and doesn't get the explanations.
IDK what to tell you, because it sounds like you want to hear some complicated answer, but the picture imo is simple: It's a look at a page and the site as a whole to determine quality ... It's right there in front of us.
If you want to compare it to a site, compare it to WebmasterWorld ... How's the quality of yours compare with here?
[edited by: TheMadScientist at 6:52 am (utc) on Mar 4, 2011]
" where we basically sent out documents to outside testers. Then we asked the raters questions like: “Would you be comfortable giving this site your credit card? Would you be comfortable giving medicine prescribed by this site to your kids?”
Cutts: There was an engineer who came up with a rigorous set of questions, everything from. “Do you consider this site to be authoritative? Would it be okay if this was in a magazine? Does this site have excessive ads?” Questions along those lines."
Can someone please get us the full version of THIS QUESTIONNAIRE?
How many of the questions above are NOT visually oriented?
Sorry, maybe because it's late I'm not clear on what you're saying...and I don't want to hijack this thread any longer with my stuff...I did think from your mention of "fuzzy subject" that Google uses its latent semantic indexing or whatever their techno-patent-voodoo is to read comments. But right now, I'm not really prepared to get rid of the comments or the poll, which are not what I'd call spammy. Other pages that suffered have neither comments nor poll. This happened suddenly, to a wide number of articles, to a wide number of authors, and so I don't believe it's something that's wildly different for each page - it's just too widespread. I think of it like I think of human beings - we all like to think we're very different and individual and unique, but we happen to be unique in a lot of the same ways as our brethren. ;)
Sorry, just saw your longer response. Okay, I see I wasn't clear. This is an article I have published on a "content farm" that got hit. I'm a writer, not a webmaster. It's not one of my own sites. My own websites are tiny and didn't get hit by the algo update.
I don't know your site ... I haven't ever seen it ... All I can do is speculate based on what you're telling me ... I think it's the visual representation ... As far as I knew until now we were talking about one page ... If your page(s) aren't ranking like they should then something has to change, either your page(s) or the algorithm ... They will be adjusting for a while imo, so maybe you should wait, but my guess is: The answer is on your site, and imo you won't find the answer for your site on anyone else's; any answer you find on a different site will be the answer to their issues...
(We were posting at the same time, and I think it's a good discussion we're having for everyone, even if it's a bit more site specific.)
Huh? It's not even on your site?
Hmmm... IDK how you could possibly fix that.
I could fix it if Google were considering factors that I CAN change. I can change quite a bit - the layout of the text, the modules, certain SEO elements, any of the article text and the title, plus possibly the approximate number of ads. But not site design, ad placement, coding or the other pages on the site.
So I'm VERY interested in figuring out if this is something that's out of my control - in which case, I and other writers should flock to our own websites like gulls - or in our control.
Edit: And I'm sorry to y'all for not explaining well. I've mentioned before that I publish both on my own sites and on third party sites, but I sure would love a place in our profiles where we can give basic background, too...
Two things to consider, based on a couple of comments in your earlier posts.
1) You've been discussing a page within a site, not the whole site. IMO it is much harder to rank a single internal page for a competitive term than it is to rank the whole site. Single pages are more likely to be surrounded by pages that cover another topic, and therefore the surrounding environment is not fully relevant. I've often wondered if that is a factor that counts against a single page, no matter how well it stacks up against the competition in the eyes of a viewer.
2) You mentioned 100 commenters. That is potentially adding a lot of content to the page and possibly fragmenting its focus on the primary search term. Lots of good content and discussion is great for human consumption, but IMO it can dilute the signals an algo picks up and uses to determine relevance for a given search term.
Just my 2c worth based on personal observations.
Algorithmic human testers, interesting. I'm sure glad no REAL human had any say in what was quality and what wasn't.
That's interesting about ranking a page. The biggest content farms do have the ability to rank standalone pages - or did, anyway, presumably because they can recruit article webs on lonely topic areas. Apparently that is fading, which means they'll change or die. I'm okay with change or death, as long as the little guy without the ability to build the store, so to speak, can still set up shop somewhere.
(I'm always pushing the flaky notion that we collectively need everyone - and I mean everyone, not just big corporations or brands, but everybody's grandma - to start making a living digitally as fast as possible, because I believe it's the switchover from old economic model (car-based) to new economic model (digital) that's the underlying reason for the worldwide economic downturn. )
Anyway, I do hear you and TheMadScientist about the comments and I appreciate both of your advice. I'll look back over them to see if there are any outliers and handle them. But truly, it'd be madness to dump one of the few, reasonable, non-argumentative, on-topic comment threads the web has going for it, that doesn't make anyone want to bang their head against a wall Charlie Brown style.
|The Shower Scene|
I concur incredibill, well said.
The more I think about this, the more I find myself a little irked. Let us assume for a second that there is NO WAY the humans answering the questions reviewed every site on the internet. That would mean Google took signals from the highly rated and signals from the lowly rated and applied blanket treatment to all sites that "fit those profiles".
If that's the case, the single most important SEO factor isn't anything we try to do well; instead, it's just avoiding sharing factors with low quality sites. Some of that we can't control, like how many people follow you on Twitter, but some we can, like using a custom CMS and having custom CSS.
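That "blanket treatment" idea — learn a quality boundary from a handful of human-rated sites, then apply it to everything sharing the same measurable features — can be illustrated with a toy nearest-centroid classifier. The features and numbers below are invented for the example and are not Google's actual signals:

```python
# Toy "fit the profile" classifier: a site is labeled by whichever
# centroid (average of human-rated examples) its features sit closer to.
# Feature vector here is assumed to be [words-per-ad, design score].

def centroid(rows):
    """Component-wise mean of a list of feature vectors."""
    n = len(rows)
    return [sum(r[i] for r in rows) / n for i in range(len(rows[0]))]

def classify(features, good_rows, bad_rows):
    g, b = centroid(good_rows), centroid(bad_rows)
    dist = lambda c: sum((f - x) ** 2 for f, x in zip(features, c))
    return "high-quality-profile" if dist(g) < dist(b) else "low-quality-profile"

# Invented training data from hypothetical human raters:
good = [[400.0, 0.9], [350.0, 0.8]]
bad = [[40.0, 0.2], [60.0, 0.1]]
print(classify([300.0, 0.7], good, bad))  # a site resembling the rated "good" sites
```

The irksome part follows directly from the mechanics: a genuinely useful site whose features happen to land near the "bad" centroid gets the low-quality label anyway, because the classifier never saw the site itself, only its profile.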
I am a bit surprised that people don't realise that big companies have units that compete or have different agendas.
When I worked at BT, the individual engineering centres in the same division competed against each other. This was quite friendly, as opposed to those grade-inflated %%^&* in Cellnet :-)
At one time our dev team ran a worldwide intranet for BT Sales, because the division that was supposed to do it could not or would not.
|The thing is, as far as I know, Google isn't doing anything illegal. If Yelp doesn't like it, then the brain trust at Yelp should build a better search engine than Google. |
You missed my point, forget Yelp itself.
WE ARE YELP!
We are just much smaller and even more helpless.
Google has proven they are predatory with our data. Yelp and other review sites are just a typical example of how sites are having their data used and abused, with no way to stop it if they want to stay in Google.
For instance, with this update Google is stomping down sites that could resurface if they comply with the will of Google, and compliance is getting to the point that if you don't design a site they like, or you even use their own AdSense product TOO MUCH (with the max coverage they insist on), POOF! you're penalized.
Sounds like bullying at a minimum.
I disagree incrediBILL,
I remember the days before AdSense, and believe many webmasters were getting peanuts. It has brought a lot of money to small-time publishers, so although you can disagree with what they are doing, do remember that Google has done webmasters many favors.
The problem, as I see it, is the sheer domination by group projects; years ago, content was produced by one, two, or small teams of people. When you have people with ideas multiplied ten to a hundred times, you can certainly see abuse, and it is a clear abuse. The question I always find interesting is how one site can rank for every type of topic the way these sites do. If you're an everything site, you should be considered less important than a niche one.
Going back to my first point, it's not bullying; the fact remains you don't have to rely on Google for traffic. This was always going to be a highly volatile business.
|It has brought a lot of money to small-time publishers, so although you can disagree with what they are doing, do remember that Google has done webmasters many favors. |
Google made many webmasters fat and lazy, plumped us right up for the holiday dinner. Google gave us enough rope to hang ourselves.
The part that roasts my rump is Google itself is aggressive with ads.
By the nature and intent of this update, shouldn't they just go offline?
Google presented me with a SERP just now that featured more paid-for results (11) than it did organic results (10), as well as a big, ugly 'Places for x near y' suggestion list above the organic results.
The top third of the results page consisted of ads and suggestions. I had to scroll down quite a distance before arriving at the organic results.
I don't treat my users with anywhere near that level of contempt, yet Google's new algorithm saw fit to penalise me such that my traffic has fallen by 26%!
Google, I don't like you anymore. Innocent livelihoods are being ruined by this abortive update. You could at least have the decency to roll it back while you work on a remedy.
[edited by: Richie0x at 2:15 pm (utc) on Mar 4, 2011]
oh, and they're not done yet...