So what happens when you have a page with “ten PageRank points” and ten outgoing links, and five of those links are nofollowed? Let’s leave aside the decay factor to focus on the core part of the question. Originally, the five links without nofollow would have flowed two points of PageRank each (in essence, the nofollowed links didn’t count toward the denominator when dividing PageRank by the outdegree of the page). More than a year ago, Google changed how the PageRank flows so that the five links without nofollow would flow one point of PageRank each.
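To make that arithmetic concrete, here's a minimal sketch of the before-and-after division - my own illustration using the numbers from the quote, not anything from Google:

```python
# Hypothetical numbers from the example above: 10 PR points to spend,
# 10 outgoing links, 5 of them nofollowed.

def per_followed_link_old(pr_points, total_links, nofollowed):
    # Pre-change: nofollowed links were excluded from the denominator,
    # so the followed links split all of the page's PR.
    return pr_points / (total_links - nofollowed)

def per_followed_link_new(pr_points, total_links, nofollowed):
    # Post-change: every link counts in the denominator, and the shares
    # assigned to the nofollowed links simply evaporate.
    return pr_points / total_links

print(per_followed_link_old(10, 10, 5))  # 2.0 points per followed link
print(per_followed_link_new(10, 10, 5))  # 1.0 point per followed link
```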
You already are. The evidence has always been that double (or triple or octuple...) links don't pass extra PR. The PR evaporates.
---
(Of course, as usual, Google may experiment with a dozen different ways to handle something like this.)
The evidence has always been that double (or triple or octuple...) links don't pass extra PR. The PR evaporates.
By that I assume you mean they count as one link for purposes of passing PR, but they count as two links for purposes of distributing PR among links, thus destroying one link's worth of PR?
Odd, seems like it would make much more sense to consolidate the duplicate links to one virtual link, as they claim to do with canonicalization of pages. Then duplicate links would neither pass additional PR, nor would PR evaporate - they would just count as a single link. I would hope they would experiment with doing it that way.
I'm not disputing your evidence; I expect you're right, but it's a weird thing for Google to do. In a sense it penalizes multiple internal navigational links, even though there are plenty of strong usability reasons to have them.
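For what it's worth, here's a toy sketch of the two treatments being debated - "evaporation" versus "consolidation" - with made-up urls and numbers. Nobody outside Google knows which reading, if either, matches their actual handling:

```python
links = ["/a", "/b", "/c", "/c", "/d"]  # five outlinks, "/c" duplicated
pr_points = 10.0

# Evaporation reading: duplicates count in the denominator, but a
# duplicate passes nothing extra - its share is simply destroyed.
share = pr_points / len(links)               # 2.0 per link slot
passed = {url: share for url in set(links)}  # each unique target gets one share
lost = pr_points - sum(passed.values())      # 2.0 points evaporate

# Consolidation reading: duplicates collapse into one virtual link
# before dividing, so nothing is lost.
unique = sorted(set(links))
consolidated = {url: pr_points / len(unique) for url in unique}  # 2.5 each

print(lost, consolidated)
```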
<added>
freejung - your earlier posts indicate that you already appreciate this. But I thought it bore some emphasis here to warn people about getting drastic with their websites based on a very simplified model of what Google is doing. We need to simplify no matter what, because some of the essential information is never shared.
the math involved is more complex than basic arithmetic... you will need to start studying vector calculus
Well, for finite discrete systems such as this one, that's pretty much just arithmetic taken to extremes, but I get your point and agree of course. Anything we say here is a massive oversimplification and has to be, because even if you understand the math, there are crucial details Google isn't going to tell you. The theory still gives search engines lots of choices, and we don't know exactly which ones Google makes. We should post that as a standard disclaimer above any discussion of algorithms.
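For anyone curious what "arithmetic taken to extremes" looks like, this is the power-iteration form of PageRank from the published papers - the public textbook model, not whatever Google actually runs today:

```python
def pagerank(graph, damping=0.85, iterations=50):
    """graph maps each page to the list of pages it links to.
    Repeated averaging over the link graph - nothing fancier than
    arithmetic, just a great deal of it."""
    pages = list(graph)
    pr = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new = {p: (1 - damping) / len(pages) for p in pages}
        for page, outlinks in graph.items():
            if outlinks:
                share = damping * pr[page] / len(outlinks)
                for target in outlinks:
                    new[target] += share
        pr = new
    return pr

# Three pages linking in a small loop:
print(pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]}))
```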
freejung - your earlier posts indicate that you already appreciate this.
Thanks, tedster, it's kind of you to say that. It's been a while since I used vector calculus in any serious way, but I used to be pretty good at it. However, that doesn't actually help much in this case, for the reasons stated above. I think buckworks' approach makes a lot more sense than trying to reverse-engineer the actual algo. Reading the papers and understanding the math may sound daunting, but even if you did all that, it wouldn't be enough - if it were that easy, someone would have "solved" SEO by now.
warn people about getting drastic with their websites
That is very good advice. I for one have not touched my site since this nofollow thing started. Now I'm thinking about removing nofollow from all internal links, but I haven't done it yet.
Anyway, setting aside the standard disclaimers and caveats, here's a simple question that it ought to be possible to give a straight answer to:
Do you believe that optimal SEO design is to have only one internal link per page to each other page?
Because that is what steveb's comment seems to imply. It seems a bit odd to me, and I was hoping he would clarify.
I prefer a home page to have in the area of 40-50 links max, with only 6 links or fewer in the main menu - which I try to make the first links in the source code.
There's a tendency for many webmasters to want to "link to everything" from the Home Page. Maybe they are thinking about PR flow (the supplemental index frenzy a while back really fed this craze), or maybe they bought into the now disproven "3-click rule". Whatever their motivation, they put too many links, and too much anchor text, right at the domain root.
I've seen sites with over 1,000 internal links on the Home Page - and I don't mean amateur sites, either, I mean big corporation, big money sites. Think multi-level hover menus and you get the picture.
The first SEO problem here is that anchor text is not just a backlink factor for the link's target page, it's also an on-page factor. Talk about overwhelming a semantic algorithm with too much information!
A healthy Information Architecture, with well-chosen menu labels (anchor text), is the root of good SEO and also the root of good "pagerank sculpting." There is a relatively famous thread from our early days (2001) entitled "Themes [webmasterworld.com]". Some of the links there are now broken, but what's in the thread is still quite important.
There are also some 2004 threads in the HTML Forum about Information Architecture [webmasterworld.com] where I put out some ideas and some very good discussion follows.
I recently consulted on a site redesign. We spent 6 months on the Information Architecture and about 4 months building out the content and the code - which became rather easy after the IA ground work was complete. The site is now about 3 months past launch and exceeding expectations. It's also proven easily extensible because extensibility was baked into the recipe.
And yes, it uses just a touch of nofollow and iframe PR sculpting - mostly to encourage Google to find truly useful Sitelinks.
I prefer to use on-page links with appropriate anchor text to connect to other pages within my site rather than relying on the nav bars (right now I have the nav bars loaded via iframe, and have the nav bar links no-follow via robots.txt AND no follow on the iframe pages as well).
Ack! I'm getting rel=nofollow "a bit dizzy" :-)
The anchor text in a link does not only benefit the page that is linked to, but also the page that the link is on. So don't be too quick to remove links from your home (or other) pages: the anchor text in those links, if pointed at a "relevant" page, will help the page the link sits on to rank for the term in the anchor text. Sometimes this can have a very significant effect.
Cheers
Sid
minimizing the number of links on a page (most especially the Home Page) and creating a well structured information architecture -- creating silos for various themes and being VERY stingy with cross-links
Sure, that's basic SEO and some of the best advice you could possibly give. I'm just about to do an architectural redesign myself, and I'll definitely review the material you suggest before I do it!
Still, there are often good reasons to have duplicate links - for example, an article might crosslink to other articles in the same silo within the body text as well as in nav. Steveb seemed to be saying that would be non-optimal.
With that said, I'm wondering if Wikipedia will take another look at the way they do things. A lot of our pages (I mean a lot) are sourced within Wikipedia and I always noticed a small boost in rankings after they had linked to a page. When they introduced nofollow, many pages suffered.
Google has now made it public, if I understand correctly, that nofollow shoves PageRank into an empty hole. That in itself has turned the entire scheme of things upside down. Or, if I'm incorrect in my assumption, the PageRank just stays on the page.
If it *does* just stay on a page, then would it be fair to assume that if:
Page A has 10 outgoing links.
Page A has nofollow'd 5 links.
Page A would pass PR through the 5 followed links.
Page A would render itself incapable of sharing 5/10 of its points. Something like back in the day when phpBB's site was barred from passing PR because of link selling.
OK, what about crawling - do you guys know if the bot will still crawl those links with rel=nofollow?
The bot will crawl those urls if there is any reference to them elsewhere on your site or anywhere else on the web. But if ALL links to those urls are nofollow, and if there NEVER was a dofollowed link, then Google does not use nofollow links for url discovery.
Also, googlebot crawling is not exactly the same as a user clicking on links. There is a pre-programmed budget or set of urls for any given spidering session. That agenda is also influenced by other factors, including your server's record of response times, history of page changes, and many other things I'm sure.
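Restating that discovery rule as a toy filter - this is my reading of the behavior, not documented crawler internals: a url only enters the crawl frontier if at least one known link to it is not nofollowed.

```python
# rel values of every known link pointing at each url (hypothetical urls)
inlink_rels = {
    "/only-ever-nofollowed": ["nofollow", "nofollow"],
    "/also-linked-plainly": ["nofollow", ""],
}

frontier = [url for url, rels in inlink_rels.items()
            if any(rel != "nofollow" for rel in rels)]
print(frontier)  # ['/also-linked-plainly']
```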
-------
Earlier I gave a rather short reply about javascript links that I'd like to expand on. It's no longer true that Google "doesn't read javascript" or even that Google "doesn't execute javascript". In some cases today, they do.
If Google discovers that a given script creates a link-like effect for the visitor, they can establish a virtual link to represent that click-point -- and that virtual link becomes an element in the web graph. It will take a share of PR and it can vote that PR to its target page. This same principle goes for urls discovered through automated form submissions - you can get a virtual link added to the web graph.
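A toy sketch of that "virtual link" idea - my illustration only, since the real web-graph machinery isn't public. The script-driven click-point joins the page's outlinks like any anchor, so it both takes a share of the page's PR and dilutes the shares of the ordinary links:

```python
outlinks = ["/about", "/products"]  # ordinary <a href> links
outlinks.append("/ajax-target")     # virtual link discovered by executing JS

pr_points = 9.0
share = pr_points / len(outlinks)   # 3.0 - the virtual link takes a full share
votes = {url: share for url in outlinks}
print(votes)
```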
I remember early in 2008 one of Google's big boys mentioned that they'd made some changes to PR calculation, but we couldn't easily sort out exactly what they were here. The signal was kind of swamped by other changes. Now we know at least a little bit.
There was an even earlier shift right at the time the Big Daddy infrastructure was deployed. That change allowed Google to shift from monthly PR calculation to nearly continual calculation - a whole new mathematical model. I remember reading at the time that this continual calculation could accumulate small errors because it was no longer an iterative computation. So, if I'm remembering correctly, every once in a while a correction would be calculated and combined back into the data.
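If I'm reading that right, the shape of it would be something like this sketch - purely speculation on my part, not Big Daddy's actual design. A link change pushes PR forward cheaply on the spot; because nothing else gets re-balanced, the estimates drift, and a full iterative pass (like the power-iteration sketch earlier) is occasionally run and swapped in as the correction:

```python
def push_new_link(pr, graph, source, target, damping=0.85):
    """Incremental step: record the link and forward a share of the
    source's current PR immediately, without re-balancing anything else.
    Small errors accumulate - hence the periodic full recomputation."""
    graph.setdefault(source, []).append(target)
    share = damping * pr.get(source, 0.0) / len(graph[source])
    pr[target] = pr.get(target, 0.0) + share

pr = {"/home": 1.0, "/new-page": 0.0}
graph = {"/home": []}
push_new_link(pr, graph, "/home", "/new-page")
print(pr["/new-page"])  # 0.85 - a quick estimate, later overwritten by a full pass
```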
I'm pretty sure that this approach has also been refined quite a bit. Despite all the "PR doesn't matter" ranting we read, PR still does matter, and it's still the core insight that took Google a giant leap beyond the other engines of the day.
PR still matters, I agree.
For me, with every PR update I can see a significant change in traffic.
And about continual calculation: if you remember, a few months ago, in the PageRank update before the last one, I added a few static pages 48-72 hours before the PR update - totally new links and content - and they ranked 1 and 2. You replied to that post that day; I don't know if you remember.
I didn't say or imply any such thing. In fact I haven't said anything about overall optimal SEO design at all!
I can't imagine what you are thinking about here.
It's non-optimal from a PR perspective, but might be user friendly. What would be much more non-optimal (and anti user friendly) is if you had five, or fifty, duplicate links.
If you have 75 links as part of your basic site navigation, adding one more in the body text isn't going to make much difference. Adding 50 more links to the same page will make a huge difference.
Even if there are complexities - say it is overall better for you to duplicate a link in body text (negative in terms of PR, but positive in terms of other algo elements) - the basic point of a clean design without useless redundancy or clutter remains a generally good idea.
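The quick arithmetic behind that, with a toy PR value of 1.0 for the page:

```python
pr = 1.0
print(pr / 75)   # ~0.0133 per link with 75 links
print(pr / 76)   # ~0.0132 - one more link barely moves each share
print(pr / 125)  # 0.008   - 50 more links cut every share by about 40%
```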
Any notion that Google might look at links in comments differently (whether or not "nofollow" is used for links there)?
I didn't say or imply any such thing... I can't imagine what you are thinking about here.
Good, that's what I was hoping you'd say.
I inferred that from "the PR evaporates" when you have duplicate links. I'm glad you didn't mean to imply that duplicate links are necessarily bad design. It wouldn't make much sense.
NOT no following nav bar links that are loaded via iframes
If there is no other link to that iframed page, then you've got an equivalent situation to using nofollow on all the iframed links - nothing is passing PR. To the degree that you do have links to the iframed url (for instance in an html sitemap), then you are sending through some PR. I'm not clear what your goals are - and what you mean by "Am I safe".
NOT no following nav bar links that are already directed to be nofollowed in robots.txt?
I assume you mean disallowed in robots.txt - since nofollow doesn't exist in the robots.txt syntax. A url that is disallowed in robots.txt can still accrue PR; it just won't be spidered, so it can't pass any influence.
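For anyone unsure of the distinction, this is all robots.txt can express (the path here is hypothetical) - note there is no "nofollow" directive anywhere in the syntax:

```
# Blocks fetching, nothing more. A disallowed url can still accrue PR
# from links pointing at it; it just won't be spidered.
User-agent: *
Disallow: /navbar-frame.html
```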
Again, I'm not sure what you mean by "safe". But overall, loading main navigation through an iframe seems a bit tight on PR circulation to me. It sounds to me as though inner pages might have a struggle to rank - in other words, it's the home page or nothing.
If you also use significant linking to those iframed anchors in the content area, then you might be OK - but you are also making things a bit complex and touchy for future maintenance. It sounds to me like you will run the risk of accidentally cutting off important pages from important link juice with future content area changes.
If this iframed navbar is main navigation, why not just circulate PR instead of having concern about it? If easy maintenance of the navigation area is the point, using includes on the server-side is probably a better way to go.
Duplicate links are bad design, but that doesn't have anything to do with the other thing you mentioned.
===
(Oh, if you are saying that I said if you have 75 links on a page, the best structure is for each of them to go to 75 unique URLs, then yes of course. Duplicating links is self-destructive in general, even if in very rare instances you do it.)
Oh, if you are saying that I said if you have 75 links on a page, the best structure is for each of them to go to 75 unique URLs
Yes, that is what I meant, sorry if I didn't make that clear.
Duplicate links are bad design, but that doesn't have anything to do with the other thing you mentioned.
What does it have to do with, then?