I look forward to this every year. To be honest I don't really understand all the correlation stuff but the "influence" charts and the comments are always interesting to read if only to make me think more about things I had never considered. Like this:
The one comment that stands out so far:
|For domains expiring within the year, the crawl and index rate dramatically drops off. |
Anyone else share that view?
Correlation works like this. If you find a doctor, there will be lots of sick people around them. Doctors are correlated with sick people. But you can't conclude that doctors cause people to be sick; that would be causation, and correlation alone doesn't establish it.
They also have to do some math on this because with correlation you can be measuring the wrong thing. For example, they may be measuring social signals, when in fact the real signal is backlinks, and social signals are showing strong in some cases because of strong backlinks (not saying that's the case, merely an example). And correlation can be strong or weak, or inverse.
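That confounded-signal scenario is easy to demonstrate with a quick simulation. In this hypothetical sketch (my own numbers, nothing from any actual study), "backlinks" is the only signal that drives the ranking score, and "social" merely tracks backlinks, yet social still shows a strong correlation with rank:

```python
import random

random.seed(42)

# Hypothetical model: backlinks are the real ranking factor;
# social shares are driven by backlinks but have no effect on rank.
n = 500
backlinks = [random.gauss(100, 30) for _ in range(n)]
social = [b * 0.8 + random.gauss(0, 10) for b in backlinks]      # confounded signal
rank_score = [b * 1.5 + random.gauss(0, 15) for b in backlinks]  # caused by backlinks only

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Social correlates strongly with rank despite having zero causal role.
print(pearson(social, rank_score))
print(pearson(backlinks, rank_score))
```

Measured naively, the social signal looks like a ranking factor; only the way the data was generated tells you it isn't.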
The list would seem to change every year. Chase it this year, what are you going to do next year?
Second post because this is a distinct point.
You think your ranking involves adding up numbers that total 100%? Pie chart analogies are extremely misleading for this stuff.
Agreed but what I find interesting is that it pools opinions in one place that's easy to follow. We all know that (most) SEO is guesswork, albeit sometimes educated guesswork. We also know a lot of it is common sense.
So I find the aggregated opinions interesting reading and often inspirational in that they make me re-evaluate how I work. That doesn't mean I make changes but at least it makes me consider things I hadn't previously given much thought to.
PS. Thanks for the correlation example although I'm still not sure how useful it is in this context.
|"For domains expiring within the year, the crawl and index rate dramatically drops off." |
Anyone else share that view?
The data may be accurate, but that doesn't mean it identifies a direct ranking or even crawling factor. A good example of correlation not being causation.
Many domains that are being allowed to expire are therefore not being updated or actively marketed. That inactivity would CAUSE a drop in indexing and crawling, not the expiration date itself. Renew a domain name but continue to neglect it, and the new expiry date won't help indexing or crawling one bit.
I took a quick look and then gave up rapidly: grey text on light grey backgrounds in micro-sized fonts, on pastel pages. Their designer appears to be suffering from conjunctivitis and can't stand contrast. The result is that the pages are barely readable if you're over 45.
Usability fail!
Several of the questions were apparently not understood by the panelists - and the wording was quite inscrutable at times. Rand even made note of several likely misunderstandings during his presentation.
Another flaw I noticed with this year's study was that some questions were of the "when did you stop beating your wife" variety: hidden assumptions that you had to accept if you were going to choose one of the multiple choice answers.
Because of those questions, the results sound like the panelists agree that these elements are indeed ranking factors (domain expiration, for example). Now that I've seen the final presentation, I realize that those questions were included to provide a platform for the SEOmoz correlation data. In other words, the correlation data was already completed, and the questionnaire then set up those topics as potential causative ranking factors.
|PS. Thanks for the correlation example although I'm still not sure how useful it is in this context. |
The backlinks example I gave was very illustrative. Say backlinks are a strong ranking factor, and social media signals are not used at all. If you measured social media signals, you might conclude that they were important in ranking. Yet all the strong social signals in the world won't help a bit with ranking under those circumstances.
They're picking stuff that they 'think' should matter for ranking. I'm assuming SEOmoz and probably others are using stats to measure these ranking signals. It's fairly basic statistics to grab a bunch of signals and test them to see if they matter.
I think the problem though is likely that it's pretty near impossible to measure some signals - how do we measure tweets without access to the data?
And if we miss some signals, we might end up with a poorly fitting model. A poorly fitting model simply doesn't give us a whole lot of usable information. Too much noise in the signal.
Which is why, as we all know, it's relatively pointless to try to deconstruct Google's algo other than in terms of general principles. If you think domain registration expiration matters, go register for 10 years and then forget about it. I doubt it's some big secret sauce, though.
Matt Cutts said that the only way they might use domain registration time is as a reinforcing signal - in play only if the page already smells like spam. It's not a direct, positive factor.
[edited by: tedster at 10:00 pm (utc) on Jun 11, 2011]
Spearman's rank correlation - I remember it well. It brings back memories of working with a really clever professor at the LSE, the only guy who ever got me to understand multivariate analysis.
Wheel is correct. There are significant flaws in assuming cause/effect from data; even trying to guess at these effects without null hypothesis testing is walking on dangerous ground (if my memory serves me right).
Spearman's rank correlation itself has flaws and should only be used as a broad indicator (I forget the theory, so don't ask - something to do with confidence levels).
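For anyone who wants to play with it: Spearman's coefficient is just Pearson correlation computed on ranks, and with no tied values it reduces to the classic 1 - 6Σd²/(n(n²-1)) formula. A minimal sketch (my own illustration, not anything from the SEOmoz study):

```python
def spearman(xs, ys):
    """Spearman's rank correlation via the rank-difference formula.
    Minimal version: assumes no tied values."""
    def ranks(vs):
        # rank 1 = smallest value
        order = sorted(range(len(vs)), key=lambda i: vs[i])
        r = [0] * len(vs)
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r

    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Any perfectly monotonic relationship scores 1.0, even a non-linear one:
print(spearman([1, 2, 3, 4, 5], [1, 8, 27, 64, 125]))  # 1.0
```

That last line hints at why it's only a broad indicator: Spearman measures monotonic association, so very differently shaped relationships can all produce the same coefficient.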
Back to my analysis to prove google is throttling our sites...
|Back to my analysis to prove google is throttling our sites... |
Depends how much machine and how much human we have in the algo. If Panda was told here are 5000 good sites and 5000 bad ones, now go rank the rest, I'm not sure if correlation /causation matters. The algo looks for alike signals and that's it, you're dumped in the basket with the sites that send the same signals as you.
How much human editing went into it the first time, the second, the third, and how much there will be in the months ahead, we don't know. But what was OK yesterday might not be today if spammers adopt it. Just ask a cop about cars with D.A.R.E. stickers :)
If you are driving a nice car in a so-so neighborhood, with tinted windows, a D.A.R.E. sticker, loud music and girls in the car, you could be a drug dealer or a teenager with a rich dad. The cop will almost certainly stop you and determine the truth, but Google doesn't have that luxury due to the size of the web. Sometimes they might stop and/or arrest all of them and clear the innocent later.
I guess there are signals you can send to say you're legit, but even the British Medical Journal was apparently caught by Panda, so who knows.
I found this gem buried in the comments:
|Todd Malicoat |
Through all this analysis of search optimization - we will always conclude that a site needs "more links, more quality links, more content, and higher quality content"
That pretty much sums it up. I've been thinking about this for a while -- I follow the latest SEO news, analyze the latest algo changes, etc., but in the end I always come back to the same basic conclusion. More and better content and links have always been good for SEO and probably always will be, and any improvements in this respect will probably continue to benefit you regardless of the details of algo changes.
I think it helps to return to this thought when you get too lost in the weeds or you get distracted with chasing the algo. The odds are that whatever you're doing, it cannot possibly benefit you as much as adding more and better content and links.
|we will always conclude that a site needs "more links, more quality links, more content, and higher quality content" |
The Wal-Mart principle of www site design: collect all content in one convenient location, so all searches will lead to the same page.
|collect all content in one convenient location, so all searches will lead to the same page |
I don't think that's what he meant. He didn't say "page" he said "site." You can have lots of quality content and still have each individual page be relatively small and tightly focussed.
"Remember - correlation is not causation!" But then SEOmoz shows Facebook data having a positive correlation and presents it as if it reflects a causal effect, so the overall statement seems a bit contradictory!