Forum Moderators: open
Has anyone done this and encountered problems?
All the PR papers say more pages = more PageRank (since every page gets a tiny bit of PR to start with). If this is so, why don't we see sites filled with well-linked random text dumps? How does Google guard against this?
Say his 500 pages were mostly "PR6" and he was hitting #1 for some terms that bring in good traffic but unfocused.
There may be some niche terms that need less PageRank to hit #1; i.e., he could chop up his 500 pages to target more specific terms and still hit #1, but with a better "quality" of traffic from the 4,000 more focused pages.
That's how Brett's theme pyramid would seem to work IMO.
There is probably an "ideal" number of pages with an "ideal" amount of PageRank to hit #1, given the keywords/phrases a page targets and the on-page factors that are taken into account.
Perhaps it would only take a PR5 and standard on-page stuff to hit number 1 for "widgets", but it would be better to split the PR between two pages and focus on "fuzzy" and "red" widgets, sort of thing.
That's how I think it works anyway :) I wouldn't be too sure that multiplying the number of pages he has by 8 in this instance would help him, PR or otherwise.
I disagree. Adding pages to your site does boost site pagerank. If all pages in the internet started out at zero there would be no pagerank on any page in the whole internet.
Google needs to give a basic score to pages in its index to get things started, and then the pagerank is redistributed to more deserving pages.
This means more pages = more pagerank, even if they are internal to your own site.
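The "every page starts with a base score" idea can be sketched with a toy power-iteration version of the PageRank formula from the published Stanford paper (this is a simplification for illustration, not Google's actual code, and the three-page site layout is made up):

```python
# Minimal PageRank sketch: with damping factor d, every one of N pages
# receives a (1 - d) / N baseline each round, then rank flows along links.

def pagerank(links, d=0.85, iterations=50):
    """links: dict mapping page -> list of pages it links to."""
    pages = list(links)
    n = len(pages)
    pr = {p: 1.0 / n for p in pages}           # every page starts equal
    for _ in range(iterations):
        new = {p: (1 - d) / n for p in pages}  # the baseline every page gets
        for p, outs in links.items():
            if outs:
                share = d * pr[p] / len(outs)  # rank split over outlinks
                for q in outs:
                    new[q] += share
        pr = new
    return pr

# Hypothetical three-page site where every page links back to "home":
site = {"home": ["a", "b"], "a": ["home"], "b": ["home"]}
ranks = pagerank(site)
print(ranks)  # "home" accumulates the internal votes
```

In this toy model, adding another internal page does add another baseline share to the pool, which is the intuition behind "more pages = more pagerank"; whether Google's deployed algorithm behaves this way for internal pages is exactly what this thread disputes.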
Although internal linkage helps transfer PageRank around it is very unlikely that any single page will make PageRank out of nothing.
Sites that have limited PageRank to start with show the drain down the link hierarchy quite effectively: many pages are PR0 even though the main page is PR1, 2, or 3.
The quality of the link is more important than the number of links and quality comes from outside your site.
Amazon is a very good example of this: millions of internal, interlinked pages, 24 million inbound links (50,000 showing as Google backlinks), on-theme content, and lots of good link anchor text, yet the best PageRank they achieve is PR4, and most pages are 3 and below.
You would think they would be an easy PR10, wouldn't you?
Although adding *some* more pages and linking them optimally will increase the pagerank of the targeted pages, Google applies a limit on the number of internal linking pages that will be calculated.
This threshold was dramatically lowered in the last Google algo change - your internal pages count less. This change was sensible because it's an easy and obvious manipulation, and a lot of people were doing it.
Also, be aware that it's not just about PageRank, other link factors are more significant than PR score.
I manage several sites with tens of thousands of pages, some with hundreds of thousands, with various carefully crafted linking strategies. The internal links don't take you that far, and they count for nothing in HITS and the similar algos used by most other major SEs.
Concentrate on getting other quality sites to link to you from a prominent page (lots of legit page views) with good descriptive text.
In addition to getting more legit traffic from these referrers, it's the best SEO you can do these days, and it won't go out of style any time soon (even if it does as part of ranking algos, your effort is still worthwhile).
This is not just the usual "build a good site and your ranking will take care of itself" glad-handing (I don't believe that hooey. Build a good site and then use legitimate SEO techniques).
I've verified my theory on the limits of internal links with a PR calculating spider I wrote, cross-referencing the results with actual rankings. I have also conducted numerous rigorous tests "in the wild" with sites using various configurations.
Go for high-quality external links.
It seems like there is a sort of cliff that you fall off of. A 100,000 page site gets little extra benefit from those internal links, but a 1000 page site gets a lot of benefit compared to a 100 page one.
Steveb, do you have a current example for which 1000 internal pages give a significant boost in ranking?
My reference sites show internal pages hitting a ceiling at about 500, although it used to be much higher (thousands).
Sticky me if posting the reference would be disallowed. I'll run an analysis on the site and any ranking above and a few below, and send you the results.
I have sites for which google has spidered tens of thousands of pages (verified by logs, and the presence of these pages in the index) yet get no increase in rankings (again, these are linked with verified good linking techniques).
Note that current results are different than previous versions of Google algo.
The "pruning" may be done with PR rather than a fixed number, but it would be very hard to generate thousands of high-PR pages without drawing unwanted attention to yourself.
I have triggered Google penalties for sites in the past, so I can tell you: don't experiment with "inflating" ranking using internal (or incestuous) links on an important site. There are few faster ways to get excommunicated (except maybe, uh, selling pagerank?)
After months of self-flagellation, sackcloth and ashes, and lighting a constellation of votive candles, the prodigal son returned.
The famous (infamous?) talk show host Dr. Laura S. says "do something noble."
That works wonders for getting links, as well as accolades.
It seems like there is a sort of cliff that you fall off of. A 100,000 page site gets little extra benefit from those internal links, but a 1000 page site gets a lot of benefit compared to a 100 page one.
Cool! I had 18 pages make the September update, and I had over 1500 different pages crawled for October. Now I'm sure to get that PR10 by Christmas! I will have such high ranking that I will be the irrelevant page that every one complains about in the SERPs! Bwahahahahaha!
Then again, it might only help me get up to PR6, which would still be cool. At least I'm getting deep crawled now, which is all I was really hoping for.
You probably are right. I just waved my hands to come up with 1000.
My specific experience though is with four sites that rank above me in the results where I have virtually all the same meaningful external links they do, plus some more they don't (good ones too). But they have more pages... and two to 15 times the internal links I do. (If you look at my profile and search for the obvious keyword you'll know what I'm talking about.)
Couldn't your observation simply be explained by the logarithmic property of pagerank?
There are a couple of reasons why additional pages may not help your pagerank much:
1) If you put them in lower level subdirectories they lose the automatic pagerank boost Google gives pages in higher level subdirectories. This effect is very visible in ODP.
2) If your site is pagerank 6, assuming a log 10 scale for pagerank, you would need 100 internal pagerank 5 links, or 1,000 pagerank 4 links, or 10,000 pagerank 3 links to the home page to gain pagerank 7. This assumes that the link to the homepage is the only link on each page, which is clearly ridiculous. It's more likely you would need 10 times these numbers to see a discernible effect.
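The arithmetic behind that claim is easy to check, bearing in mind that the base-10 scale is itself an unconfirmed assumption (Google has never published the mapping from raw rank to toolbar PR):

```python
# Back-of-envelope check of the log-scale reasoning, assuming toolbar
# PR n corresponds to roughly 10**n raw rank units.

def raw(toolbar_pr):
    return 10 ** toolbar_pr  # hypothetical base-10 mapping

needed = raw(7) - raw(6)     # 9,000,000 raw units to climb from PR6 to PR7
print(needed // raw(5))      # 90  -> on the order of 100 PR5 links
print(needed // raw(4))      # 900 -> on the order of 1,000 PR4 links
```

So the "100 PR5 links or 1,000 PR4 links" figure falls straight out of the assumed base; a different base (some posters argue for 5 or 6) would shrink those numbers considerably.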
90% of everything people here "know" about PageRank is total speculation. And much of it is even downright wrong, if you are to read GoogleGuy's comments.
Here are the facts. A PR of a higher number is better than a PR of a lower number. PR is gained by having incoming links. The PR of the page linking to you will influence how important that link is to you.
That is it.
The logarithmic scale is a theory. The division of PR going to linked pages is a theory. Whether PR is calculated or guessed on a page is a theory. That each page has a base amount of PR to share out is only a theory. What point in the process your PR for a given month is calculated (or guessed), and which month's PR is used to vote for your site, is all just speculation.
Now I have varying amounts of respect for various theories. And I have a lot of respect for some of the people who have come up with some of them. My problem is with the people who start quoting theories as fact.
Every one of the above theories, and many of the others, *might* be correct. But I could easily come up with a few options for each that would provide similarly accurate, or possibly more accurate, results.
Even if you test these things, there is still only one person on these lists who has the possibility to "know", and the odds are pretty good that GoogleGuy is not the owner of the PR code, since he seems much more familiar with content issues, so I doubt even he "knows" exactly what goes into it.
You can certainly do well by making some assumptions that work out well for you. But there is a huge difference between "it works" and "it works, therefore my assumptions are correct."
Question the "common knowledge" and try to come up with alternatives, and you might just be surprised with what you come up with.
Here is an easy one for you. I have seen the assertion a few times that PR is not really an integer between 0 and 10, but is continuous in that range. There is definitely a difference between how much green is showing on a couple of PR9 sites, so I would tend to agree that it is not an integer in that range. But I doubt very much that it is a floating point number, or anything close to continuous. There is absolutely no reason for that sort of accuracy, or to waste that much processing time. Dealing with that level of precision will cause you all sorts of problems when you do not need the accuracy.
You really aren't the worst offender, and certainly some of the proponents of some of the more questionable theories go around proclaiming them as fact. At least your post was putting some thought into it.
When I said
They must get a lot of entertainment at googleplex reading these threads where people try and figure out exactly how PR is calculated.
I really did not mean to aim it just at you; I really do think they get some good laughs over all the bickering, squabbling, posturing and postulating that goes on over such a small portion of the algo.
Think about it. If you worked at Google and you knew how it actually worked, wouldn't you occasionally enjoy sitting there, reading these threads, smugly giggling to yourself when someone is certain they have it right? Then, if they were actually getting close, you could change one value ever so slightly to screw them up?
Amazon is a very good example of this: millions of internal, interlinked pages, 24 million inbound links (50,000 showing as Google backlinks), on-theme content, and lots of good link anchor text, yet the best PageRank they achieve is PR4, and most pages are 3 and below.
You do realize that amazon uses a bunch of redirects, don't you?
There are a couple of reasons why additional pages may not help your pagerank much: 1) If you put them in lower level subdirectories they lose the automatic pagerank boost Google gives pages in higher level subdirectories. This effect is very visible in ODP.
SOD - You're right about the dissipation of PR to pages further down the link chain, but I'd say that this is due to the splitting of the PageRank "vote" as you link to multiple pages downstream rather than to the directory structure. The pages could all be in the same directory, and you'd still observe the same dissipation.
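The vote-splitting effect is visible directly in the published formula, where a page passes d * PR / outdegree to each page it links to. A quick illustration (the raw PR value here is invented purely for the arithmetic):

```python
# Each extra downstream link dilutes the share every target receives,
# per the d * PR / outdegree term in the original PageRank paper.

d = 0.85          # damping factor from the original paper
page_pr = 0.004   # hypothetical raw PageRank of the linking page

for outlinks in (1, 5, 20):
    share = d * page_pr / outlinks
    print(f"{outlinks:>2} outlinks -> each target gets {share:.6f}")
```

Twenty outlinks means each target receives a twentieth of what a single outlink would pass, regardless of what directory the target sits in, which is the point above: the dissipation follows the link structure, not the directory structure.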
A couple of threads on subdirectories and subdomains that relate to this...
[webmasterworld.com...]
[webmasterworld.com...]
I think fathom, WG, and iconoclast have it nailed about internal pages boosting the site. That would be a little like pulling yourself up by your own bootstraps. You need external incoming links.
False dichotomy. Needing incoming links doesn't reflect at all on the value of more internal links. And the bootstraps example is a good one, if unintentional. It doesn't matter how much "good stuff" (external links) is thrown at you if you don't use your own personal resources wisely (more, sensible internal links raise the value of the site as a whole). A rising tide lifts all boats. If the internal linking is good, more pages means more page rank until the precipitous dropoff when more pages mean nothing.
I did find a site in Google that has two inbound links (one was PR4, shared with 8 other outbound links; the other link PR1), 384 web pages on the site, and most have a link back to the home page.
This site's main page is PR1, and the 20 or so internal pages I checked were PR0. (The PR power of internal links is nothing compared to external links.)
Although I do believe each page has some small amount of PR (because of the internal "VOTE") if you are the only one "VOTING" for pages I really can't believe Google will ever deem you as an authority on any topic.
Even with the most exceptional internal link structure your vote to yourself matters little.
[edited by: fathom at 8:42 am (utc) on Oct. 13, 2002]
As steveb says, you can pull yourself up that little bit higher than your "natural pagerank".
SOD - You're right about the dissipation of PR to pages further down the link chain, but I'd say that this is due to the splitting of the PageRank "vote" as you link to multiple pages downstream rather than to the directory structure. The pages could all be in the same directory, and you'd still observe the same dissipation.
Yes and no. What if a page is indexed which is not linked to the rest of the site? I have several pages standing in their own right. They are not linked to from the site, but they do link back. Still, Google gives them pagerank 5 because the index page is a 6. They have no incoming links at all.
However, IMO the link back to the homepage will not increase the homepage PR beyond where it is so that the other page can get a little bit more PR, so that the homepage can increase a wee little bit, so that the other can do the same.
If a page has no "natural" PageRank of its own, it has nothing to pass back.
My point is, however, mere speculation, but the above does not make sense.
SlyOldDog, you have a good point, which is what I alluded to when I said the "pruning" may be done with PR rather than a fixed number.
In either case, my theory, which can be easily proven by careful (if laborious) testing, is useful, whether or not the pruning is done with a fixed number or PR. I've already mentioned that manufacturing thousands of high-PR pages is difficult, and could have undesired consequences.
Sasquatch, although I agree that there is quite a bit of misunderstanding about pagerank (and more importantly, Google's ranking algo) on this board, I believe that the point of the forum is to share ideas, theories, and in rare cases, hard-won knowledge.
Pagerank can be quite easily understood by reading the many papers Page and Brin wrote while at Stanford, and by reading the original Stanford Patent application.
An actual page's pagerank is a decimal number between 0 and 1. The pagerank sum of all pages on the Internet is 1.
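Under that 0-to-1 model, the familiar 0-10 toolbar number would presumably be a logarithmic bucketing of the raw value. Here is a toy mapping; the index size, the base-10 scale, and the "average page sits around PR3" anchor are all my own assumptions, purely to show how such a bucketing could work:

```python
import math

# Hypothetical mapping from a raw (0, 1] PageRank (the normalized model
# from the Stanford papers, summing to 1 across the web) to a 0-10
# toolbar-style bucket. All constants below are illustrative guesses.

WEB_PAGES = 3_000_000_000    # assumed index size
AVG_RAW = 1.0 / WEB_PAGES    # the average page's share of the total 1.0

def toolbar_pr(raw, base=10):
    """Map a raw (0, 1] PageRank to a 0-10 toolbar-style bucket."""
    if raw <= 0:
        return 0
    notches = math.log(raw / AVG_RAW, base)   # notches above/below average
    return max(0, min(10, round(notches) + 3))

print(toolbar_pr(AVG_RAW))        # an exactly-average page lands at 3
print(toolbar_pr(AVG_RAW * 100))  # 100x the average rank is two notches up
```

Nothing about the real toolbar mapping is public; the point is only that a normalized raw score and a small-integer toolbar score are perfectly compatible once you assume some log bucketing.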
What remains is to apply that authoritative information to the current state of affairs on Google, and demonstrate your competence through your ability to rank well consistently on competitive keywords.
Got any theories?
Got any theories?
Eliminate link pages. A links page depreciates link value.
Outbound links should be part of your site/page content on the most appropriate page, and where possible every inbound link should be reciprocated without the normal feedback loop.
That is: your outgoing "Link B" points to the page that has the incoming "Link A", so that the PR transferred to "Link A" from "Link B" is appreciated and returned to "Link B", now with a higher value than it had before.
In this case, many more web pages of your site have "natural PR" and not just PR from internally transferred PR with a dilution factor.
More pages with "natural PR" provide more points of internal PR transfer.
IMO hoarding PR (e.g. using JavaScript) may be bad for those that link to you, but using link pages is worse (for both), as you are not maximizing the PR transfers across your site (and their site) that can induce higher SERPs all over. If that main page (with all the PR) is too far away, many site pages are not as high in the SERPs as they could be.