Pagerank Secret Revealed, again

Forum Moderators: open

Message Too Old, No Replies

Pagerank Secret Revealed, again

cgi links count, but don't show as backlinks

steveb

1:57 am on Jan 10, 2003 (gmt 0)

... unless it is just a glitch, but I don't think so.

There is a site, which i can't give any details about, which has 200 backlinks showing, including two PR6 and mostly PR4's. It has no PR7 or higher links.

But it has a PR7.

The only possible explanation besides a glitch (unlikely) is that this site has about 30 (could be many more) inbound PR6, 1500 inbound PR5 (could be twice that) and several thousand inbound PR4 links that come in via cgi redirect, ad servers or other referal method.

Taking a step back, this site *should* be PR7 with all these links going to it.

Google has found a way to give this site the PR it deserves, but it doesn't show us the backlink reasons "why". I believe this is an enormous bit of information with lots of repercussions for sites that offer a lot of cgi links to other sites.

heini

11:51 pm on Jan 11, 2003 (gmt 0)

Not really sure if I understand you correctly, but redirect links are known to count to a certain extent.

Discussing the PR directories pass on which use jump urls the conesensus seemed to be, yes they do, if their own PR is high enough.
I can confirm to that from a large directory, whose cgi links sometimes even showed up as backlinks in Google.

steveb

2:05 am on Jan 12, 2003 (gmt 0)

I'm saying cgi and adserver links pass on PR, even if they don't show.

I'm saying the backlinks shown via the toolbar may only be a very small fraction of the backlinks being actually counted by Google.

I'm saying looking at the backlinks for some particular types of sites is completely useless in analyzing that site.

I'm saying that if you are a site that offers cgi advertising type links to some sort of business entity that you should include in what you charge them some calculation for PR that you are giving them.

And finally, I'm saying that all the concerns people have about getting links to them via cgi "masked" redirects should not concern people at all, because even if you don't see the links listed as backlinks they are still being counted by Google.

jomaxx

2:48 am on Jan 12, 2003 (gmt 0)

Not if the /cgi-bin/ directory (or wherever the redirect script resides) is blocked by the robots.txt file.

steveb

2:58 am on Jan 12, 2003 (gmt 0)

"Not if the /cgi-bin/ directory (or wherever the redirect script resides) is blocked by the robots.txt file."

Not true. My cgi bin is blocked by robots.txt but on at least one site I link to via a cgi redirect all my backlinks are shown.

Dante_Maure

3:01 am on Jan 12, 2003 (gmt 0)

The only possible explanation besides a glitch...

"There are more things in heaven and earth, Horatio,
Than are dreamt of in your philosophy."

Lots0

3:21 am on Jan 12, 2003 (gmt 0)

I go along with heini, redirect links (cgi) do carry some PR weight - the PR weight that these links carry has varied almost every month for at least the last 9 months. I believe that currently the PR weight that cgi links carrying is at an all time high - but will most likely fall again next month, if it follows the usual pattern.

jomaxx

6:06 am on Jan 12, 2003 (gmt 0)

steveb, Googlebot respects the robots.txt file and won't execute anything in the the cgi-bin if it's blocked. So how can it know where the link leads?

The only *possible* way this could happen, I don't know if it would show up as a backlink or not, is if the redirect URL included a parameter such as "?redirect=www.domain.com".

steveb

6:32 am on Jan 12, 2003 (gmt 0)

site.com/cgi-bin/redirect.cgi?link=http://www.thiersite.com/index.html?1000

This shows as a backlink at theirsite.com even though I disallow my cgi-bin. Sticky me if you want the URL of theirsite.com

tristan

3:21 pm on Jan 12, 2003 (gmt 0)

cgi links with a blocked cgi-bin do count as links if the
url is in the script parameter,
for example:
/cgi-bin/link.cgi?http://www.somesite.com/

/cgi-bin/link.cgi?site=http://www.somesite.com/

will all be picked up by google as links, even if your cgi-bin is blocked by a robots.txt

chiyo

3:28 pm on Jan 12, 2003 (gmt 0)

tristan. are you absolutely sure? It seems to go against common understanding of robot.txt functionality and specification. We have robotted txtsed out our cgi-bin for years and dont see any evidence that this directory was being followed by Google or any robots that respect robots.txt.

Robots.txt is only advisory. Its still open for any spider to spider it, and you know what some people are like! Are you sure that this was just the case with a rogue spider from an email crawler or the like?

hurlimann

3:30 pm on Jan 12, 2003 (gmt 0)

Not sure if this is the same but a competitor has pos 1 and 2 for a lot of keywords from pages that are are www.DOMAIN.com/cgi-bin/profile.cgi?s=XY .

All the pages are greybar and are not optimised and his static homepage is an unoptimised PR3.

nyehouse

5:35 pm on Jan 12, 2003 (gmt 0)

i also think that a target=blank link does not show up in Google in the backlinks or in the pagerank.

I have a client who has no backlinks on his site, yet another site links to him via an clickeable image. I know it cannot be the image that google does not support since I have other images that show up in backlinks on other sites. Both sites have been indexed by google, and both pages have been live for a long time. The incoming link is on the homepage of the other site, and the site does not have many links. The site which hosts the incoming link is only a PR1, so maybe google doe snot track links from PR1 or lower sites? The only thing I can think of is that the target=blank (opening a new window) is not followed by google.

Anyone else have any experience with this. Maybe this should be a new thread.

Tim

jomaxx

6:14 pm on Jan 12, 2003 (gmt 0)

Google won't display backlinks from PR1 sites (although the pagerank should be counted OK). It only displays as low as PR2 or PR3. I doubt very much the "TARGET=" affects anything.

jomaxx

6:45 pm on Jan 12, 2003 (gmt 0)

P.S. Chiyo, I think what they're saying is that Google won't execute anything it's asked not to, but it can INFER that the script is a redirect to another page based on the presence of a full URL.

That makes some sense and is what I was trying to clarify with my earlier post. Thinking about it now, I remember that Yahoo links out in exactly this way, and we know that Google can spider Yahoo with no problems.

tristan

9:32 pm on Jan 12, 2003 (gmt 0)

>tristan. are you absolutely sure? It seems to go against common understanding of robot.txt functionality and
>specification. We have robotted txtsed out our cgi-bin for years and dont see any evidence that this directory was
>being followed by Google or any robots that respect robots.txt.

I'm not saying googlebot actually downloads/follows the cgi links, just that
googlebot makes an assumption that it is a link to a certain page
based on the fact that the script parameter is an url
I've found that the only way to hide a link for googlebot for 100%
is to use a code in the link (a database key or smth similar), so
that your links look like
/cgi-bin/link.cgi?p=123

(together with blocking your cgi-bin, or at least your link script
in your robots.txt)
I've read other threads here saying you can also do this with
javascript, but then I think it's just a matter of time before
googlebot can follow these links too, so better safe then sorry

gollyme

10:03 pm on Jan 12, 2003 (gmt 0)

This is related to the "problem" I posted a while ago. Restricting restricted pages from the bots.

Hmmm. Using a code in the link huh? I think its also being followed by the bot or spiders... Can they login virtually too *grin*