Forum Moderators: open
Marcia said
The number of links on a page isn't relevant to the amount of PageRank the page has; that depends on the inbound links the page is receiving from other pages. How many outbound links there are can affect how much PR is conferred upon the pages being linked to from that page - including other pages on your own site. So from that aspect, if you decrease the number of outbound links on a page to increase the value of each, you could indirectly benefit by getting a little more PR fed back to that page from the other pages of your site.
But is it only PR you're looking at for its own sake, or are you also looking at other factors that can affect the scoring of a site overall?
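Marcia's point can be sketched with the usual simplified model, in which each link on a page passes d * PR / (total links on the page). The PR value and link counts below are purely illustrative, not observable figures:

```javascript
// Simplified model: each link on a page passes d * PR / totalLinks.
// pagePR is a purely illustrative number, not an observable value.
const d = 0.85;       // damping factor
const pagePR = 4.0;   // hypothetical PR of the linking page

function prPerLink(pr, totalLinks) {
  return d * pr / totalLinks;
}

// With 20 links on the page, each link passes ~0.17;
// trim the page to 10 links and each remaining link passes ~0.34.
const with20 = prPerLink(pagePR, 20);
const with10 = prPerLink(pagePR, 10);
```

Halving the number of links doubles the PR passed by each remaining link, which is the "increase the value of each" effect described above.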
href="javascript:gohome();"
With the function in an externally linked js page:
function gohome()
{
    // replace() redirects without adding a new entry to the browser history
    window.location.replace("../index.php");
}
I used to use this for a while to retain PR at the coal face and stop it leaking back to the index page. But that was in my wild wild youth and I'm a reformed character now. :)
Harry
The aspect I have always found interesting about this is the tension it creates. On the one hand, Google apparently has a penalty for 'cross-linking' (that is, attempting to gain a ranking benefit by interlinking a lot of different sites). But suppose you want to offer your users the ability to click through to related sites, and you try to hide those links from Google so it does not think you are using them to gain an advantage in the search engine. As GG points out, Google might still look for the very links you are trying to hide, and then use their existence to impose exactly that penalty.
It is in this way that google is determining website design.
You cannot link related sites together how you want to do it, you must do it in such a way that google does not think you are doing it to gain an advantage on the search engine rankings.
If you actually try to hide the links, and in that way make clear that you have no interest in using them to gain any advantage in the search engine and that your intention is simply for users to click on them, Google still says it might track them anyway, and you still might be subject to a penalty.
Google didn't say anything of the sort.
What was suggested was in fact JavaScript, and Googleguy simply reminded you that Google is known to scan through JavaScript looking for character strings with URL-looking bits in them.
And nothing was said about penalizing: the only penalty Google has ever mentioned for linking out is linking to a "bad neighborhood" -- in other words, "doorway spam mazes" or "quantum link black holes." Those aren't the kind of sites ANYONE could EVER link to by accident!
Finally, the "page rank leak factor" is not something you need to worry about. If you get more than 6 million outgoing links (like dmoz.org), you may have to worry about your home page rank dropping below 10. Otherwise, don't sweat it. (Assume a probability of, say, one billionth for each of your pages (almost surely several orders of magnitude too large), and calculate how much page rank they gain if they capture all the PRL out to infinity. For all practical purposes, it's page rank squared * number of pages, which is, on your typical 1000-page-doorway-maze site, zero to six significant figures. Find a high school junior who's taking the college prep math courses, and let them do the math, if it's over your head.)
This is not an issue!
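The back-of-envelope estimate above can be reproduced directly. The probability and page count are the ones assumed in the post, not measured values:

```javascript
// GoogleGuy-style estimate: treat each page's PR as a probability p,
// with p = one billionth (deliberately generous, per the post) and a
// typical 1000-page site. The PR a page could recapture from its own
// leak "out to infinity" comes to roughly p squared per page:
const p = 1e-9;
const pages = 1000;
const recaptured = p * p * pages;  // "page rank squared * number of pages"
// ~1e-15, i.e. zero to six significant figures.
```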
[edited by: Marcia at 9:20 am (utc) on Jan. 28, 2004]
But I still want to get it straight about index,nofollow because I always assumed Googlebot did just that if that's what was there and I was told it's not so.
Hoping to be useful,
Herenvardö
The way I see it - any page that is prohibited by robots.txt will be a "worthless black hole" - since it cannot pass PR.
That's why I don't like Googleguy's suggestion - you'll still lose PR by linking to your "worthless black hole" of a redirect page.
Marcia:
> But I still want to get it straight about index,nofollow because I always assumed Googlebot did just that if that's what was there and I was told it's not so.
The question is, what "nofollow" means:
The amount of PR leaked per iteration can even be calculated:
(PR of home) * (Damping factor, ~0.85) * (Number of outbound links from home) / (Total number of links from home, outbound + internal)
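Under the standard PageRank model, where each link passes d * PR / (total links on the page), the per-iteration leak from the homepage can be sketched like this; all the numbers are illustrative:

```javascript
// Per-iteration PR leak from the homepage in the standard model:
// each link passes d * PR / (total links), and the share going to
// outbound links leaves the site.
const d = 0.85;
function homeLeak(prHome, outbound, internal) {
  return d * prHome * outbound / (outbound + internal);
}
// e.g. a home page with PR 6, 5 outbound and 20 internal links
// leaks 0.85 * 6 * 5 / 25 ~= 1.02 per iteration.
```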
Taking the current PR algorithm and
- having i external links on the homepage
- j internal links on the homepage
- all external incoming links are going to the homepage
- there are no reciprocal links
- every page of the site linking to every other page of the site
you'll get:
PR_Home = PR_X / ( 1 - d^2 / ((1-d*(j-1)/j)*(i+j)))
where PR_X denotes the PR transferred from external pages and d is the damping factor. You can easily see that increasing i reduces PR_Home.
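The closed form can be checked numerically. This is a sketch under the post's assumptions only: external PR (prX) arrives solely at the homepage, the site is the homepage plus j fully interlinked internal pages, the homepage also carries i external links, and the (1-d) base term is ignored:

```javascript
// Numerical check of PR_Home = PR_X / (1 - d^2 / ((1 - d*(j-1)/j) * (i+j))).
function homePRIterative(prX, i, j, d) {
  let prHome = prX;
  let prInt = 0;  // PR of each (symmetric) internal page
  for (let k = 0; k < 5000; k++) {
    // internal page: gets d*prHome/(i+j) from home, plus links from
    // the other j-1 internal pages (each internal page has j outlinks)
    prInt = d * prHome / (i + j) + d * (j - 1) * prInt / j;
    // home: gets prX from outside, plus d*prInt/j from each of j pages
    prHome = prX + d * prInt;
  }
  return prHome;
}

function homePRClosedForm(prX, i, j, d) {
  return prX / (1 - d * d / ((1 - d * (j - 1) / j) * (i + j)));
}
```

Running both with increasing i confirms the post's conclusion: more external links on the homepage means a lower stable PR_Home.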
You can block the target page with robots.txt
This would be a waste of PR.
If a site has a page (at any depth) with x links to other normal site pages and y links to pages which have meta noindex, noarchive, nofollow, my understanding is that the PR passed to the normal pages = page PR * damping factor / (x + y). The PR lost is page PR * y/(x + y). And that this happens at every iteration.
In my case I have a high proportion of print friendly duplicate pages which are currently banned, so the loss may be significant. They have 1 incoming link from the original page and 1 outgoing link back. Would setting meta to follow allow some of this PR to be passed back into the system?
If nothing can be done, then would I be risking a penalty for duplicate content if I allowed Google full access?
The same logic applies to links to copyright, terms, and other such pages.
Harry
The PR lost is page PR * y/(x + y).
d * PR * y / (x + y) is indeed the PR transferred from that page to the noindex,nofollow pages, but it isn't the PR which is lost for the linking page.
And that this happens at every iteration.
The iterations themselves don't have any meaning; iteration is just a technique for solving a set of linear equations. You have to look at the final (stable) values and compare the results for different linking structures, e.g. no links to noindex,nofollow pages (y=0) compared to n links.
By the way, the PR values can also be calculated in one step, without any iterations.
d * PR * y / (x + y) is indeed the PR transferred from that page to the noindex,nofollow pages, but it isn't the PR which is lost for the linking page.
Yes, I realize the whole thing is just a mathematical model and the final results can be calculated in various ways. But my point is that PR arriving at a non-indexed page is lost to the site, and this can be significant. For example, if a typical page has, say, 10 outgoing links to indexed pages but 5 to non-indexed pages (e.g. terms, privacy, about, contact, copyright), then on each iteration the PR lost to the site is PR * d * 1/3.
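A quick check of the arithmetic above, assuming the usual model where each link passes d * PR / (total links on the page):

```javascript
// The example above: a page with 10 links to indexed pages and 5 links
// to non-indexed pages (terms, privacy, about, contact, copyright).
const d = 0.85;
function leakPerIteration(pr, indexedLinks, nonIndexedLinks) {
  // each link passes d * pr / totalLinks; the non-indexed share is lost
  return d * pr * nonIndexedLinks / (indexedLinks + nonIndexedLinks);
}
// leakPerIteration(pr, 10, 5) = d * pr * 5/15 = PR * d * 1/3, as stated.
```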
However I am much more interested in advice as to what to do about this leakage, rather than worrying about how it is calculated.
Harry