In my view yes, PR has a direct influence on ranking for a given search phrase, but the direct influence is very small indeed; it is not worth worrying about, it is not important. (Just to avoid confusion, I'm using "important" as defined in Cambridge/American Heritage/etc.)
There are secondary factors (e.g. people like to exchange links with high PR pages/sites) and there are related factors (e.g. sites that have high PageRank tend to have plenty of links from different sites with supporting anchor text).
Also, PageRank is useful for other things (e.g. crawl depth and frequency) but the direct influence on rankings of having more PageRank is, as Macro put it, "small enough a consideration to be ignored".
Interesting post. However, I don't think PR in the Google directory has anything to do with a website's content or position anymore.
Currently our own site is a PR5 with loads of PR5 pages on it, PR4s, etc. The site is one of 71 in its category within the Google directory, about halfway down in relation to PR compared to the others.
Logic would dictate that the site should be somewhere in the Google index, yet it doesn't feature anywhere. Well, unless you count position 380 or so for certain keywords.
Meanwhile it's #1 in MSN for almost all related keywords and about #8 in Yahoo, depending on the keyword.
Conclusion: PR counts for nothing at all, and the sites in the Google directory are not one bit important to Google, because if they were you would think they would list under the category keyword.
There is definitely (as proven once again by Allegra) a sort of site rank that is site-wide, not just page-specific like PR. I think site-rank factors can include everything from IP ranges, bad neighborhoods, and a high occurrence of duplicate content to whois record history and manual penalties. The result can be anything from de-indexing and sandboxing to generally lower rankings in the SERPs for your domain.
I think there is not enough recognition of the fact that Google is very good at categorizing web sites and web pages based on content, and this is a factor that is outside of PR. I think this started out as Bayesian categorization used for search results, and because the method worked so well, it turned into AdSense. Now I think it has come full circle and is being used to detect web spam. I also see evidence that G uses a sort of Q-value, or a breadth-of-topic rating, that indicates how narrow a topic a site covers, whether it covers multiple topics, or whether it has no specific topic at all.
The end result, as I see it, is that SEO includes so many factors that it becomes more of an organic process instead of a formula, and the organic process is getting closer and closer to the steps necessary to create a good, user-centric web site. And that's what Google has said they want from us webmasters since the beginning.
The only way to answer this question is to create pages for this purpose, i.e. pages with identical on-page factors as well as identical anchor text (and other off-page factors) but different PR. (By different PR I mean different real PR, coming from the linking structure used.)
The reason is the correlation between PR and the number of incoming links (as already mentioned by ciml and claus). This correlation leads to a correlation between PR and ranking even if PR isn't used in the ranking algorithm. This can be seen, for example, by looking at the SERPs of a search engine which doesn't use PR or a similar system but does use anchor text. Even there one will find a correlation between PR and position in the SERPs, although PR isn't a factor.
Of course, this doesn't mean that PR doesn't affect ranking, but it shows that drawing conclusions just from examining the SERPs for particular keywords isn't helpful.
Drawing conclusions from the fact that a PR0 page beats pages with higher PR, or that high-PR pages are buried under low-PR pages, is also not helpful. A PR0 page might be new and handled differently. Also, the toolbar might not show the correct value; there can be a delay or even wrong data.
One of my pages has been showing PR6 for months although it's a PR0, as can be seen, for example, from the directory.
Finally, there are even more complex ways in which PR could influence the SERPs without being a direct factor. For example, one could weight the anchor text by the transferred PR (as already mentioned by egomaniac). Distinguishing such a system from a purely anchor-based or purely PR-based one would need even more detailed analysis, but it can be done in a way similar to the one mentioned at the beginning.
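To make that idea concrete, here is a hypothetical sketch (invented page names and numbers, not Google's actual formula) of anchor text weighted by transferred PR: identical anchors count for more when they arrive from stronger pages, so PR shapes the ranking without ever appearing as a separate, direct factor.

```python
# Hypothetical sketch: anchor text weighted by the PR each link transfers.
# The same anchor counts for more when it arrives from a stronger page.

def anchor_score(query, inbound_links):
    """inbound_links: list of (anchor_text, transferred_pr) tuples."""
    score = 0.0
    for anchor, transferred_pr in inbound_links:
        if query.lower() in anchor.lower():
            score += transferred_pr  # PR acts as the weight, not as a direct factor
    return score

# Two pages with identical anchors but different transferring PR:
page_a = [("blue widgets", 0.8), ("blue widgets", 0.1)]
page_b = [("blue widgets", 0.1), ("blue widgets", 0.1)]
print(anchor_score("blue widgets", page_a) > anchor_score("blue widgets", page_b))  # True
```

A purely anchor-based engine would score both pages identically here, which is exactly the kind of difference the detailed analysis above would have to look for.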
>> .. from creeping data corruption. It's almost inevitable from such large computational projects.
This shouldn't be a problem, because the PageRank calculation isn't a chaotic dynamical system (-> Lyapunov exponent). Therefore, yes, you can beat the error creep!
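A quick way to see why error creep dies out: the damped PageRank iteration is a contraction, so two very different starting vectors converge to the same values. A minimal sketch on an invented four-page link graph (the damping factor 0.85 is the commonly cited value, not a confirmed Google parameter):

```python
# Minimal PageRank power iteration on a hypothetical 4-page graph.
# Because the damped iteration is a contraction, any perturbation of the
# starting vector shrinks by roughly the damping factor each step.

def pagerank(links, damping=0.85, iters=100, start=None):
    n = len(links)
    pr = list(start) if start else [1.0 / n] * n
    for _ in range(iters):
        new = [(1 - damping) / n] * n
        for page, outs in enumerate(links):
            share = pr[page] / len(outs)
            for target in outs:
                new[target] += damping * share
        pr = new
    return pr

# Hypothetical graph: page i links to the pages listed at links[i].
links = [[1, 2], [2], [0], [0, 2]]

a = pagerank(links, start=[1.0, 0.0, 0.0, 0.0])
b = pagerank(links, start=[0.0, 0.0, 0.0, 1.0])
print(max(abs(x - y) for x, y in zip(a, b)))  # tiny: the gap shrinks ~0.85x per step
```

Small numerical errors behave just like a perturbed starting vector, so they wash out rather than compound.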
Well, you could always model it mathematically by fitting a straight line with PR on one axis and rank/position on the other, i.e. regression. That would give statistical evidence.
As explained above, that analysis doesn't answer ciml's question!
You can easily find correlations between factors that have no causal link when both depend on a third parameter. For example, ABC and XYZ will show a correlation, even though neither influences the other, when both are correlated with time.
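A toy demonstration of that point (invented data, pure stdlib): two series that share nothing except a common time trend still show a near-perfect Pearson correlation, which is exactly the trap a naive PR-vs-rank regression falls into.

```python
# Two quantities that depend only on time (plus independent noise)
# still come out strongly correlated with each other.
import random

random.seed(42)
t = list(range(200))
abc = [x + random.gauss(0, 10) for x in t]       # time trend + independent noise
xyz = [2 * x + random.gauss(0, 10) for x in t]   # time trend + independent noise

def pearson(u, v):
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return cov / (su * sv)

print(pearson(abc, xyz))  # close to 1.0, although the noise terms are independent
```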
A follow up thread here:
Google Search Engine Optimization 101 My list. (Aug 19, 2003) [webmasterworld.com]
>> Bell curve
Kudos Grelmar, that's a really nice way of thinking about this stuff. You might want to use the Poisson distribution as a thought model instead of the Normal distribution, as that way you wouldn't have to consider the difference between lower/higher levels of B, C, D... IOW, the bell curve doesn't lend itself to an intuitive understanding of the relative importance of low/high levels, as they sit on either side of the mean. While the Normal distribution is the mirror of your traditional S-curve, the Poisson distribution is closer to what you're actually trying to communicate, and there's a clear hierarchy between the lower and higher levels. (Not sure if I've made myself clear here *lol*)
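For anyone who wants to see the asymmetry claus is describing, here is a small stdlib-only sketch of the Poisson pmf (the mean lam=3 is an arbitrary choice for illustration): the probability mass below the mean is not a mirror image of the mass above it, unlike the symmetric bell curve.

```python
# Poisson pmf: P(k) = e^-lam * lam^k / k!  -- skewed, not symmetric.
from math import exp, factorial

def poisson_pmf(k, lam):
    return exp(-lam) * lam ** k / factorial(k)

lam = 3  # arbitrary mean for the illustration
for k in range(8):
    print(k, round(poisson_pmf(k, lam), 3))
# The distribution rises quickly to its peak and then tails off slowly,
# so "low" and "high" levels are not interchangeable the way they are
# on either side of a Normal curve's mean.
```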
>> you get almost the feeling that Google is giving out an "estimated" PR
I fully agree that the Toolbar PR is an estimate, just like the total page count is.
>> There is definitely (as proven once again by Allegra) a sort of site rank that is site-wide and not just page specific like PR
IMHO, you might like to go beyond the concept of "a site" and look at "neighborhood" instead. This particular system of two weighted rankings (traditional "real" PR being the base rank) is called LocalRank.
>> I also see evidence that G uses a sort of Q-value, or a breadth-of-topic type of rating that
>> indicates how narrow of a topic a site covers, if it covers multiple topics, or if it has no
>> specific topic at all.
dataguy, that sounds very interesting, both as a concept and as observation. Would you care to elaborate a bit about that?
>> You can easily find correlations between factors which are not correlated
>> when both depend on a third parameter
You're 100% right. I suggested the method because the exact question was "importance of PR relative to ranking" and the way to figure that out is to test the two against each other. If both depend heavily on a third parameter you will see a high correlation but this will be a false image.
Didn't Brett (or somebody), a long time ago, take a crack at putting together a list of factors that affect rankings? Maybe even ordered by importance? Couldn't find it, if it even exists.
Here's a discussion of Brett's list, updated, with opinions of the importance of various factors....
Brett's quick rank (good)
Here's ciml's list, with excellent discussion and elaboration... one of the classic threads in the Supporters Forum...
Google Ranking Parameters
What are the factors available to Google in ranking Web pages?
>> While the Normal distribution is the mirror of your traditional S-curve the Poisson distribution is closer to what you're actually trying to communicate, and there's a clear hierarchy between the lower and higher levels.
I had to read that whole paragraph twice, but yes, I think I get your meaning.
After re-reading my own post yesterday, I also started to try to think of clearer ways to graph the relative importance of various factors, given that we're all really just taking pot-shots at what we think the key factors are.
I'm assuming the Google algo is probably more complex than any one of us here could reverse engineer, and that any simplified system would invariably be somewhat misleading.
I'm starting to think along the lines of a logarithmic diminishing-returns curve (as opposed to a straight-line diminishing-returns graph) as a way of mapping effective use of time (from a webmaster's perspective) against the importance of factors in the Google algo.
At the high end, you have the factors that seem to make the biggest difference to your placement, and where you should concentrate your time.
The further down the curve you go, the more you head into nebulous areas that are hard to pin down exactly, that have a much smaller effect on placement, and that you should therefore spend the least time designing for.
If it were accurate, I don't think you'd end up with a smooth curve, but one with several plateaus on a descending line...
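Purely as a sketch of that shape (the tier sizes and decay rate here are invented for illustration), a descending log-style curve with plateaus could be modeled like this:

```python
# Hypothetical importance curve: factors grouped into tiers of 3, with a
# logarithmic decay between tiers. Each tier sits on a plateau, giving
# the "descending line with plateaus" shape rather than a smooth curve.
import math

def importance(factor_rank):
    tier = factor_rank // 3          # invented: 3 factors per tier
    return 1.0 / math.log(tier + math.e)

for rank in range(9):
    print(rank, round(importance(rank), 3))
```

Factors within a tier score equally, so time spent distinguishing between them is wasted; the payoff is in moving your effort to a higher tier.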
Hmm, I could spend a week doing nothing but messing with the concept, then another week formalizing it. But it wouldn't be valuable enough to devote the time to. As a one-man band, in the end I just need to know where I should be spending my time, and the original curve seems to be working OK for now (it was basically the approach I'd internalized a while back, and I only formalized the thought process as I was writing that post).
I think it would be a valuable exercise if I (or anyone else, for that matter) could come up with a chart or graph that was visually simple (so you don't have to be a pure-math major to get it; even a newbie should be able to grasp it), and that made it easy to shift the factors up and down the scale as new evidence arises and as Google itself tweaks what it's doing.
Might be a fun part-time project over the coming months.
The site www.yahoo.com has PR 10. With a keyword that correlates well with the website's content (let's say "yahoo"), the site ranks #1 in the SERPs. With a keyword that is on the home page but relates poorly to the site's content (let's say "copyright"), yahoo.com gets position #24 (with its copyright page, which is related to the keyword). Try a keyword that is on the Yahoo home page but has no relation to the content (e.g. "rights"), and yahoo.com doesn't rank at all.
My conclusion: PageRank alone is useless. PageRank related to content is king.
If a competitor's website gets a better position in the SERPs with a lower PR, you should ask yourself why Google thinks your competitor's website is a better answer for the Internet user.
Yo. I wish I had your ability to formulate things so precisely. So first of all, and I mean that honestly, a big thanks and compliment to Google's programmers. Similarly:
> I'm assuming the Google Algo is probably more complex than any 1 of us here could reverse engineer, and that any simplified system would invariably be somewhat misleading,
> HOWEVER ...
Yes, a big however! Should we, as a consequence,
a) give up SEO as a science and pray that Larry and Sergey will stick to their high moral approach as long as possible, or
b) expect and prepare for the time when monopolist power will corrupt them?
I also like your idea of a sort of logarithmic scale for the relevance of factors, as an elaboration of the list calum compiled last year in the thread cited by robert charlton, but:
None of us can really judge the impact of PageRank, because none of us has access to that data. Google's founders have proven that mathematical abstractions over the internet's link structure yield unexpected informational power and mark a qualitative step in the development of the internet. But the physical capacity you would need even to mirror that data is immense, to say nothing of bandwidth, crawling and calculation.
The very personal conclusion I draw from ciml's initial question is that I have become aware of how few things I really know about PageRank. I have read about the formula, I even understood it roughly, I might even simulate it with a self-written algo, but that's all. And I have become aware of how important this not-knowing is.
Without discounting the intelligent discussion here, presumably based on experience with real sites in competitive industries: how else can one explain low-PR sites which consistently outperform higher-PR sites across search phrases and keyword sets? (Over eight months and counting, in my experience.)
Few inbound links, tPR 3 or less across the site, and dozens of top-5 positions for dozens of popular search phrases/keywords. PR appears relatively irrelevant for these sites (I have half a dozen of them). Sure, there are strategic factors in play, unmentioned here, but none of those commonly cited for earning PR. In one case I compete with a tPR7 site optimized for specific terms, with tons of backlinks. I estimate I have beaten her to #1 half the time over the past year.
Interesting thread based on ciml's initial post, but has it devolved into a debate of the same old PR factors and their likely influences? I agree PR *can* be irrelevant. I can't accept otherwise given my direct experience.
It is possible to isolate variables in a complex system. However, when doing so it is important to take into account that a discovered relationship under one set of conditions might not apply under another set of conditions.
So, while I stand by my statement that PageRank is not an important direct ranking factor in Google, it can be a very powerful tool for understanding aspects of how Google works.
Note: As in the first post in this thread, I'm using the word "important" as defined in Cambridge/American Heritage/etc.