Forum Moderators: open


I need to block the page rank leak from my homepage

How do I restrict specific pages from being crawled?

         

illusionist

1:02 am on Jan 27, 2004 (gmt 0)

10+ Year Member



I tried this formerly: <a href="javascript:goToPage('123/123.htm')">Disclaimer</a> (this doesn't work; Google crawls through this too). I need specific outgoing links to be blocked. How do I do this?

doc_z

2:23 pm on Jan 29, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Yes, I realize the whole thing is just a mathematical model and the final results can be calculated in various ways. But my point is that PR arriving at a non-indexed page is lost to the site, and this can be significant. For example, if a typical page has, say, 10 outgoing links to indexed pages but 5 to non-indexed pages (e.g., terms, privacy, about, contact, copyright), then on each iteration the PR lost to the site is PR * d * 1/3.

1. The original question was about the change for the homepage and not the whole site. Therefore, msg #24 gave the result for the homepage.

2. Even the whole site is not only losing a PR of

d * PR_Home * y / ( x + y)

(the PR which would be transferred to the noindex, nofollow pages), but

d / (1-d) * PR_Home * y / ( x + y)

which is significantly higher. Therefore, the result in your example is PR_Home * d/(1-d)/3. (You have neglected higher-order effects, i.e. the PR which is lost due to the fact that these pages can't distribute PR.)
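One way to see where the d/(1-d) factor comes from: PR lost in one round would otherwise have been partly fed back into the site and lost again in the next round, so the first-order loss d compounds as a geometric series. A quick numeric check (a sketch; d = 0.85 is the commonly quoted damping value, assumed here rather than stated in the thread):

```python
# Damping factor; 0.85 is the commonly quoted value for PageRank
# (an assumption for illustration, not stated in this thread).
d = 0.85

# Per unit of PR sent toward a leaking link, d is lost immediately;
# recirculating the rest leaks d^2, then d^3, ... in later rounds.
# Total: d + d^2 + d^3 + ... = d / (1 - d).
total_loss_factor = sum(d**k for k in range(1, 500))

print(total_loss_factor)   # ~5.667
print(d / (1 - d))         # closed form of the same series
```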

3. To say it again: iterations are related to the scheme which is used to solve the set of linear equations. This is only a technique to speed up the calculation process and doesn't have any deeper meaning. There are several ways to solve these equations, and the Jacobi algorithm (which is mentioned in the original papers) is just one method. You can even get the exact (final) result in one step, without any initial guess or any iteration. However, this would be computationally much more expensive.
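This point can be illustrated on a toy site: a Jacobi-style iteration and a direct (closed-form) solution of the same linear system give identical PageRank values; the iteration is just one way of getting there. A sketch (the three-page graph and d = 0.85 are my assumptions for illustration):

```python
# Toy 3-page site: A links to B and C; B and C each link back to A.
# Sum-form PageRank equations, as in the original paper:
#   PR(p) = (1 - d) + d * sum(PR(q) / outdegree(q) for q linking to p)
d = 0.85  # damping factor; assumed, commonly quoted value

links = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}

def jacobi(links, d, iterations=100):
    """Jacobi-style iteration: each new value computed from the old vector."""
    pr = {p: 1.0 for p in links}          # any initial guess converges
    for _ in range(iterations):
        new = {}
        for p in links:
            incoming = sum(pr[q] / len(links[q])
                           for q in links if p in links[q])
            new[p] = (1 - d) + d * incoming
        pr = new
    return pr

pr = jacobi(links, d)

# Solving the same 3x3 linear system by hand (no iteration at all):
#   a = (1-d) + 2*d*b  and  b = (1-d) + d*a/2  give:
a_exact = (1 + 2 * d) / (1 + d)           # PR(A)
b_exact = (1 - d) + d * a_exact / 2       # PR(B) = PR(C)

print(pr["A"], a_exact)   # both ~1.4595
print(pr["B"], b_exact)   # both ~0.7703
```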

HarryM

2:54 pm on Jan 29, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



doc_z,

I did not "neglect higher order effects". For the sake of simplicity I excluded them by stating that my formula applied only to "each iteration".

However, it doesn't really matter whether my calculation is accurate or not; it was just an example to point out that PR leakage can be significant. The title of this thread is about the need to block PageRank leakage, not about how to calculate it.

My question is related to the original although not precisely the same, so perhaps I should start a different thread.

Harry

tombola

2:59 pm on Jan 29, 2004 (gmt 0)

10+ Year Member



I don't see how a redirect would be beneficial. Google would still see it as a link.

Well, instead of an external link (a link to another domain/site), it would become an internal link.
Instead of linking to www.yourdomain.com:

<a href="http://www.yourdomain.com">Link</a>

I could use this link to my own domain to get the same result:

<a href="http://www.mydomain.com/redirect.cgi?www.yourdomain.com">Link</a>

This way PR will not leak to www.yourdomain.com.

Jessica

3:22 pm on Jan 29, 2004 (gmt 0)

10+ Year Member



I could use this link to my own domain to get the same result:

<a href="http://www.mydomain.com/redirect.cgi?www.yourdomain.com">Link</a>

And how do I set up such a redirect?
Do I make a new file? What do I put in there?

thanks.

HarryM

3:30 pm on Jan 29, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This way PR will not leak to www.yourdomain.com.

True, but PR will still be lost because the link will still be included in the PR calculations.

tombola

3:46 pm on Jan 29, 2004 (gmt 0)

10+ Year Member



You can write a small script ("redirect.cgi" in this example) that redirects the user to the URL given after the question mark ("www.yourdomain.com").

<a href="http://www.mydomain.com/redirect.cgi?www.yourdomain.com">Link</a>

In Perl, this redirect.cgi script needs only a few lines:

#!/usr/bin/perl
# Redirect the browser to the URL passed in the query string.
$url = $ENV{'QUERY_STRING'};
print "Location: http://$url\n\n";

(The Location header needs an absolute URL, hence the http:// prefix, since the link only passes the bare hostname.)

tombola

3:49 pm on Jan 29, 2004 (gmt 0)

10+ Year Member



True, but PR will still be lost because the link will still be included in the PR calculations.

Yes, but I'd rather lose PR to a page within my own domain than to a page on another domain :-)

rogerd

4:03 pm on Jan 29, 2004 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



I wouldn't assume that you won't transfer PR to the remote page with that kind of redirect, particularly if there's a plain text URL for the remote site in the script link.

tombola

4:39 pm on Jan 29, 2004 (gmt 0)

10+ Year Member



If you don't see the benefits of redirect scripts, I suggest asking Brett why he uses these scripts on WebmasterWorld ;-)

Also, you don't have to include a full URL as part of the redirect string. There are other ways too.

seofreak

6:05 pm on Jan 29, 2004 (gmt 0)

10+ Year Member



<a href="http://www.mydomain.com/redirect.cgi?www.yourdomain.com">Link</a>

This way PR will not leak to www.yourdomain.com.

Well, that's true in some cases, but that's not how it always works. I have seen PR leak from such links if one of two conditions is met. I am surprised no one else has noticed it yet; it's been there for a long time.

Herenvardo

9:28 am on Jan 30, 2004 (gmt 0)

10+ Year Member



<a href="http://www.mydomain.com/redirect.cgi?www.yourdomain.com">Link</a>
This way PR will not leak to www.yourdomain.com.

Well, that's true in some cases, but that's not how it always works. I have seen PR leak from such links if one of two conditions is met. I am surprised no one else has noticed it yet; it's been there for a long time.

I'm not sure, but I think that with such a method PR doesn't leak to yourdomain, but it does leak to the file redirect.cgi, which I don't believe has many links.
Maybe I'm wrong; it's only the interpretation that seems most logical to me.

Greetings,
Herenvardö

HarryM

11:29 am on Jan 30, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The original post concerned the need to block page rank leakage. Where it leaks to is a side issue.

As far as I can see, the only way this can be done is if a search engine can be fooled into not recognizing the link as a link, and therefore not including it in its PR calculations. But that means you can't use <a href= or 'onclick', etc.

Not only would that be immoral in the Great God Google's eyes, no doubt resulting in a blast of divine lightning, but I also suspect it's impossible. The only things that can be done are to minimise the number of unhelpful links, or to minimise their effects by putting them on deep, low-PR pages.

seofreak

6:28 pm on Jan 30, 2004 (gmt 0)

10+ Year Member



I'm not sure, but I think that with such a method PR doesn't leak to yourdomain, but it does leak to the file redirect.cgi, which I don't believe has many links. Maybe I'm wrong; it's only the interpretation that seems most logical to me.

Of course, I check the backlinks of the site that is launched by the redirect script to confirm.

bull

9:36 am on Jan 31, 2004 (gmt 0)

10+ Year Member



Do links to
http*//validator.w3.org/check/referer
and
http*//jigsaw.w3.org/css-validator
pass PageRank and cause leaks, or are there exceptions?

Herenvardo

9:44 am on Jan 31, 2004 (gmt 0)

10+ Year Member



But that means you can't use <a href= or 'onclick', etc.

Not only would that be immoral in the Great God Google's eyes, no doubt resulting in a blast of divine lightning, but I also suspect it's impossible.

Impossible? Nothing is impossible! Maybe we haven't found a way yet, but that doesn't make it impossible.
What about using a Java applet to put the link in? Is G able to spider bytecode?
I hope some time will pass until G can do that... time enough to search for a new solution ;) That's SEO!

Greetings,
Herenvardö

HarryM

6:24 pm on Jan 31, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



What 'bout using a java-applet to put the link in?

Not everyone will see it. I have my Norton security set to 'high' which automatically bans Java applets. Even with 'medium' where the user is given the option, the recommendation is to ban Java applets which probably 90% of users obey.

dirkz

7:41 pm on Jan 31, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



T_Rex, interesting script.

Am I right in concluding that they probably pass PR to a page of their own (www.theirURL.com/links.htm), but will "redirect" on click (OK, it's not an official redirect) to www.MySuckerSite.com?

This is very bizarre and also very misleading for bots. I personally would drop them.

But technically this answers the original question as well as HarryM's.

HarryM

1:03 am on Feb 1, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



T_Rex's script certainly looks as if it might be the answer. But I don't think I want to go down any route that Google might consider cloaking.

I have been considering if it is possible to recover some of the PR wastage to non-indexed pages. As an experiment I have set up a new page with meta set to 'noindex', from which hangs a new indexable page. If the page obtains PR then it can only be coming via the 'noindex' page. Just have to wait and see.

Harry

Patrick Taylor

4:58 am on Feb 1, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Redirect PHP scripts (as separate files) are common enough as a way to build links, and I don't see how Google's robot can possibly follow the link to the URL being linked to, especially if the actual URLs are held in a separate text file.

The question is whether PR is nonetheless leaked from the page containing the link, because if it is, where is it being leaked to?

Herenvardo

9:56 am on Feb 2, 2004 (gmt 0)

10+ Year Member



Not everyone will see it. I have my Norton security set to 'high' which automatically bans Java applets. Even with 'medium' where the user is given the option, the recommendation is to ban Java applets which probably 90% of users obey.

I don't understand why Java applets should be banned. They are as secure as plain HTML! If they try to do something that could be risky, the user is asked.
OK, so Java is discarded... you could try a server-side scripting language like ASP or PHP. It would take a bit more effort, but it's possible.

Hoping this is useful,
Herenvardö

jaffstar

10:31 am on Feb 2, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I discovered a site at the top of the SERPs using the following code. They provide 10 links from their homepage to 10 pages using the meta refresh command.

Could this help with PR drain?

<html>
<head>

<meta http-equiv="Refresh" content="0; URL=http://www.abc.com/affiliate123">
</head>

<body>
</body>

</html>
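Whatever its effect on PR, the target of a meta refresh is plain text in the markup, so any bot that parses HTML can extract it. A sketch using Python's standard html.parser, run against the snippet from this post:

```python
from html.parser import HTMLParser

# The meta-refresh page quoted in the post above.
html = '''<html><head>
<meta http-equiv="Refresh" content="0; URL=http://www.abc.com/affiliate123">
</head><body></body></html>'''

class MetaRefreshFinder(HTMLParser):
    """Collect target URLs from <meta http-equiv="refresh"> tags."""
    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)  # HTMLParser lowercases tag and attribute names
        if tag == "meta" and a.get("http-equiv", "").lower() == "refresh":
            content = a.get("content", "")
            # content looks like "0; URL=http://..."
            if "url=" in content.lower():
                self.targets.append(content.split("=", 1)[1])

parser = MetaRefreshFinder()
parser.feed(html)
print(parser.targets)   # ['http://www.abc.com/affiliate123']
```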

dirkz

7:32 pm on Feb 2, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> Redirect php scripts (as separate files) are common enough, as a way to build links, and I don't see how Google's robot can possibly follow the link to the URL being linked to, especially if the actual URLs are held in a separate text file.

Doesn't matter where they are kept, in the end they are communicated (e.g., via 302), so every bot can see it.

One exception is of course JavaScript.
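To illustrate the point about server-side redirects: a bot doesn't even need to "follow" anything, because the 302 response itself announces the destination in its Location header. A self-contained sketch using Python's standard library, where a local test server stands in for the redirect.cgi discussed above (the target URL is the thread's placeholder, not a real site):

```python
import http.client
import http.server
import threading

# Stand-in for redirect.cgi: answer every request with a 302 whose
# Location header names the external target.
class Redirector(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(302)
        self.send_header("Location", "http://www.yourdomain.com/")
        self.end_headers()

    def log_message(self, *args):   # keep output quiet
        pass

server = http.server.HTTPServer(("127.0.0.1", 0), Redirector)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

# A "crawler" that refuses to follow redirects still sees the target,
# because it is spelled out in the response headers.
conn = http.client.HTTPConnection("127.0.0.1", port)
conn.request("GET", "/redirect.cgi?www.yourdomain.com")
resp = conn.getresponse()
print(resp.status, resp.getheader("Location"))
# 302 http://www.yourdomain.com/

server.shutdown()
```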

dirkz

7:34 pm on Feb 2, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> Could this Help with PR drain?

How so? A bot can easily spot the redirect. Or did I miss anything?

illusionist

12:18 pm on Feb 3, 2004 (gmt 0)

10+ Year Member



So, a four-page discussion and still no answer to my question? How do I stop the leak?

Chelsea

12:19 pm on Feb 3, 2004 (gmt 0)



Don't stop the leak.

steveb

12:40 pm on Feb 3, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"how do i stop the leak?"

Don't link.

Chelsea

12:49 pm on Feb 3, 2004 (gmt 0)



Or if you want to be absolutely certain, take your site down :)

Herenvardo

9:10 am on Feb 4, 2004 (gmt 0)

10+ Year Member



illusionist, this is the best solution I've found. It can be considered cloaking, but it might work.
Make a PHP page. Since it's server-side, you don't have to worry about which users can or cannot view it. At the beginning of the script, identify your visitor (by its IP, its browser name and version, or any other trick) and determine whether it is a bot.
Then, inside your code, you can put something like:

if (!$is_bot) {
?><p><a href="linkURL">link text</a> surrounding text</p><?php
}

If you have a variable called $is_bot that is true when a robot is detected, you will be serving that link only to users, not to bots.
Technically, cloaking is giving bots something different from what users get, so this is pure cloaking. But it's also difficult to detect. It's risky, but it's a possible solution. If anybody has a better idea, I'd like to hear it.
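The "determine whether it is a bot" step usually amounts to a User-Agent check (reverse DNS lookups are more reliable). A naive sketch; the signature list is illustrative only, and, as noted above, serving bots different content is cloaking and against search-engine guidelines:

```python
# Naive user-agent sniffing of the kind described above. The signature
# list is illustrative, not exhaustive; real crawlers are verified more
# reliably via reverse DNS. Cloaking risks a search-engine penalty.
BOT_SIGNATURES = ("googlebot", "slurp", "msnbot", "crawler", "spider")

def is_bot(user_agent: str) -> bool:
    """Return True if the User-Agent string looks like a known crawler."""
    ua = user_agent.lower()
    return any(sig in ua for sig in BOT_SIGNATURES)

print(is_bot("Googlebot/2.1 (+http://www.googlebot.com/bot.html)"))  # True
print(is_bot("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"))  # False
```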

Greetings,
Herenvardö

taxpod

10:34 am on Feb 4, 2004 (gmt 0)

10+ Year Member



>>Finally, the "page rank leak factor" is not something you need to worry about.

>If it's not an issue, then why does my homepage drop more than 20 positions every time I link to an external page from my homepage?

First ask yourself what PR has to do with SERP position. My answer: not a whole heck of a lot. It hasn't had much to do with SERP position for quite some time.

There are reasons to do the things that earn PR, up to a point. But the idea that linking out causes so-called PR leak, which in turn causes a page to drop in the SERPs, is IMHO absurd.

Linking out to topically unrelated material watering down a page's theme, and so causing a site to drop in the SERPs for a particular keyword combination, makes more sense to me.

Patrick Taylor

12:58 pm on Feb 4, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



dirkz: Doesn't matter where they are kept, in the end they are communicated (e.g., via 302), so every bot can see it.

Please explain. Where is the connection between the link on the web page (e.g. "view.php?string", which is a link to an external PHP redirect script that contains no URLs) and a specific URL in a list of URLs in a separate text file (a file referred to in the PHP script, but not linked to)?

This 77-message thread spans 3 pages.