|Outgoing Links through CGI|
Do they count?
| 6:22 pm on Feb 11, 2004 (gmt 0)|
I've been searching and reading through numerous posts, but couldn't determine what the situation is for a directory like ours, which has thousands of outgoing links that run through a Perl script so we can keep track of the number of clicks on each outgoing link. In other words, these links look something like:
This then sends the user to the correct website and is basically an outgoing link that is first processed through the hit counting script.
My question is whether Google gives credit to a site we link to in this way. In other words, does a link like this affect their site's PR in the same way that it would if we used a standard link that doesn't run through the Perl hit counting script?
And for our site PR are these types of links that run through a hit counting script really considered internal links by Google, and not outgoing?
It seems to me that there are many sites that use hit counting scripts and that Google would be smart enough to determine that these are in fact outgoing links and not internal ones, or am I wrong about that?
| 9:44 pm on Feb 11, 2004 (gmt 0)|
These show up as backlinks: [domain.com...]
even though Googlebot is banned from the cgi-bin. When the destination URL is more complex, like [site.com...] these will usually not show as backlinks, but do sometimes.
That doesn't really answer your question though...
| 10:43 pm on Feb 11, 2004 (gmt 0)|
Welcome to Webmaster World.
In your example...
...there is no way for anybody to figure out the link, bots or visitors, because it's stored in a script.
On the other hand, in steveb's example...
...the link is readily readable by anybody, bots and people included, and can easily be extracted.
General rule of thumb is that if you can read the destination in the link, a bot can too (unless you pack too many variables into the back end).
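To make the rule of thumb above concrete, here is a minimal Python sketch of how a destination can be pulled straight out of a link when it is carried in the query string. The parameter name `url`, the helper `visible_destination`, and both example links are assumptions for illustration only; they are not the links from the posts above, and nothing here claims to be how Googlebot actually works.

```python
from urllib.parse import urlsplit, parse_qs

def visible_destination(href, param="url"):
    """Return the destination carried in the query string, or None if it is hidden."""
    query = parse_qs(urlsplit(href).query)
    values = query.get(param)
    return values[0] if values else None

# Hypothetical link formats, made up for this sketch.
opaque = "http://www.example.com/cgi-bin/out.cgi?id=123"
readable = "http://www.example.com/cgi-bin/out.cgi?url=http://www.example.org/"

print(visible_destination(opaque))    # None: the target only exists inside the script
print(visible_destination(readable))  # http://www.example.org/
```

With the `id=123` form, the only way to learn the target is to request the script and look at what it sends back.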
| 11:24 pm on Feb 11, 2004 (gmt 0)|
Thanks very much, but does this mean that if you use the format:
It is then considered an internal link, even though if Googlebot follows it, it winds up at an external site?
And if you use:
Google would then recognize the url variable and consider this an outgoing link?
So, if we changed to the second option, it would end up diluting our PR whereas the first way doesn't?
|...there is no way for anybody to figure out the link, bots or visitors, because it's stored in a script. |
I understand that you can't determine where the link leads by simply reading the code, but it seems that it would be more than possible to figure out the link (either a visitor or a bot) by simply following it. So I guess you're saying that google can't/doesn't go back after following the link to note that it was an external link? Sorry if I'm way off here...
| 1:28 am on Feb 12, 2004 (gmt 0)|
I just checked a few things, and I found some backlinks showing for links like:
I checked a few .cgi?id=123
links, and they do not show up, but perhaps it is because the variable is "id" which I know Google doesn't like. Didn't look for other .cgi redirects - perhaps someone else has an example.
| 1:58 am on Feb 12, 2004 (gmt 0)|
Rather than trying to guess how smart Googlebot is at deciphering "tracking links", you could just do what Google.com does to track links.
It uses "onmousedown" for anchors. The link itself is perfectly standard. Its syntax looks like:
<a href=http://www.example.com/ onmousedown="return clk(1,this)..........
I think anyone who submits their link to a directory, should first check the format of the link back to them. Personally, I want a direct link, with no funny stuff. Also, when I look at my web log, I want to be able to input the exact referer url into my web browser, and see the same page my site is listed on. I get annoyed when I put in a "referer" url, and automatically get re-directed to my site.
| 4:34 am on Feb 12, 2004 (gmt 0)|
I just found that Google will follow the redirection and count it;
Googlebot will keep following it until it gets a 200 HTTP status.
| 4:24 pm on Feb 12, 2004 (gmt 0)|
Exactly! When googlebot hits www.domain.com/bounce.cgi?ID=25 , the script is going to return a 301 or 302 redirect header, thus google will definitely have no problem determining what the link is actually to.
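For anyone who hasn't looked inside one of these scripts, here is a rough Python sketch of what a bounce script along those lines might do: count the click, then answer with a 302 and a Location header. The `DESTINATIONS` table, the in-memory `CLICKS` counter, and the `bounce` function are all hypothetical; the directory in this thread uses a Perl script whose code was not posted.

```python
# Hypothetical lookup table; a real directory would read this from its database.
DESTINATIONS = {"25": "http://www.example.org/"}
# Hypothetical in-memory click counter; a real script would persist this.
CLICKS = {}

def bounce(link_id):
    """Count the click and return the raw CGI response text for a redirect."""
    target = DESTINATIONS.get(link_id)
    if target is None:
        return "Status: 404 Not Found\r\nContent-Type: text/plain\r\n\r\nUnknown link\n"
    CLICKS[link_id] = CLICKS.get(link_id, 0) + 1
    # The Location header is the only place the real destination appears.
    return f"Status: 302 Found\r\nLocation: {target}\r\n\r\n"

print(bounce("25"))
```

Whether a crawler credits the destination then depends on whether it requests the script and reads that Location header.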
| 5:06 pm on Feb 12, 2004 (gmt 0)|
Thanks very much. I should have thought about checking the server headers for the bounce url. After reading your post I just did check some of mine and those of other similar sites, and you're right it shows as a 302 code with the location of the destination link.
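If you want to run the same header check yourself, something like the Python sketch below should do it. It asks for the headers only, does not follow the redirect automatically, and records each hop until it reaches a non-redirect status. The function names `fetch_head` and `resolve` and the five-hop limit are assumptions of this sketch, and some CGI scripts may not answer HEAD requests the same way they answer GET.

```python
import http.client
from urllib.parse import urlsplit, urljoin

def fetch_head(url):
    """Issue one HEAD request without following redirects; return (status, Location)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()

def resolve(url, fetch=fetch_head, max_hops=5):
    """Follow 301/302 responses until a non-redirect status; return the hops seen."""
    hops = []
    for _ in range(max_hops):
        status, location = fetch(url)
        hops.append((url, status))
        if status not in (301, 302) or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    return hops
```

Calling `resolve()` on one of your own bounce URLs returns a list of `(url, status)` pairs, so a 302 followed by a 200 shows up as two entries.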
| 5:31 pm on Feb 12, 2004 (gmt 0)|
On a closely related topic, does anyone know how Alltheweb/Fast SERPs get into the Google index? I've just searched for the relevant URL with inurl: and there are 83,000 of them listed. My own site has 43 pointing to it. I've checked Alltheweb and it has a robots.txt file which disallows the relevant parts of their site.
The URLs are all very long redirecting ones with a long obfuscated bit and then the referred to site URL in human readable text.
I guess that someone might be harvesting Alltheweb SERPs and made pages out of them but the URL in the index still has to be redirected through their site so Googlebot is ignoring the disallow or their syntax is misunderstood by Googlebot.
Do these redirects count or should I get rid of them?
| 10:26 pm on Feb 12, 2004 (gmt 0)|
usually googlebot will follow all cgi links if they are not blocked by a robots.txt
what steveb probably means is that even if you block your cgi-bin with a robots.txt, googlebot will still
consider www.site.com the destination for the example link:
but *while being blocked by robots.txt* a link like this:
will not show, since googlebot may not fetch this script, and thus is not able to check the Location header to see where this links to.
That being said, I don't think you should rely on this, because there can always be times that googlebot *will* follow a blocked link (if your robots.txt has just been created, etc.)
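As a quick way to see what a rule-following crawler would be allowed to request, here is a small Python sketch using the standard library's robots.txt parser. The two rules and the test URLs are made up for the example, and the sketch only shows what the rules permit; it says nothing about what Googlebot does with a blocked URL it has merely seen in a link.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks the cgi-bin for every crawler.
rules = RobotFileParser()
rules.parse([
    "User-agent: *",
    "Disallow: /cgi-bin/",
])

def may_fetch(url, agent="Googlebot"):
    """True if the parsed rules allow `agent` to request `url`."""
    return rules.can_fetch(agent, url)

print(may_fetch("http://www.site.com/cgi-bin/out.cgi?id=123"))  # False
print(may_fetch("http://www.site.com/links.html"))              # True
```

A crawler that honours the first result never requests the script, so it never sees the Location header that names the real destination.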
| 8:52 pm on Feb 13, 2004 (gmt 0)|
interesting issue I've been thinking about recently. does anybody know if gossamer threads link.cgi is working with googlebot? AFAIK it isn't (checked some backlinks)
| 11:43 pm on Feb 14, 2004 (gmt 0)|
Okay, I'm confused now. Let's try an example. Suppose that there are three pages: a.html, b.html, and c.html:
a.html has two links on it, which we'll call B and C. We know that if B and C look like: href="b.html" and href="c.html" - then 1/2 of the page rank of a.html will be passed on to b.html and 1/2 to c.html.
The first question becomes: What if C actually looks like: href="cgi-bin/out.cgi?ID=2" - and it redirects to c.html?
Does c.html get the benefit of having 1/2 of a.html's page rank passed on to it? Yes, or no? And, most importantly, what is the third party objectively verifiable explanation for your answer?
Next, if c.html does NOT receive any page rank from a.html due to the redirect, then:
1) Does a.html pass on ALL of its page rank to b.html? Or,
2) Does a.html still only pass on 1/2 of its page rank to b.html, since there were actually two links on a.html, even though one isn't receiving any ranking?
And if #2 is the case, then what happens to the other 1/2 of the page rank that would've been passed on to c.html if it hadn't been linked through a redirection script?
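To pin down the arithmetic behind these questions, here is a small Python sketch of the three outcomes being asked about. It uses the simplified even-split model from this post (no damping factor), the function `split_rank` and its flag are hypothetical, and it does not say which outcome Google actually implements.

```python
def split_rank(page_rank, links, recognized, drop_unrecognized_from_count):
    """Split page_rank evenly over outgoing links, per the simplified model above.

    Links not in `recognized` receive nothing. The flag chooses whether they
    still count toward the divisor (case 2) or are ignored entirely (case 1).
    """
    counted = [link for link in links if link in recognized]
    divisor = len(counted) if drop_unrecognized_from_count else len(links)
    share = page_rank / divisor if divisor else 0.0
    return {link: (share if link in recognized else 0.0) for link in links}

links = ["b.html", "c.html"]

# Redirect is credited: both links get half.
print(split_rank(1.0, links, {"b.html", "c.html"}, False))
# Case 1: redirect not credited and not counted, so b.html gets everything.
print(split_rank(1.0, links, {"b.html"}, True))
# Case 2: redirect not credited but still counted, so half goes nowhere.
print(split_rank(1.0, links, {"b.html"}, False))
```

In case 2 the totals no longer add up to the page rank of a.html, which is exactly the "what happens to the other 1/2" question.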
And, as indicated above, giving an answer about what happens is only 1% of what's valuable - the other 99% is stating specifically, in a third party objectively verifiable way, what leads you to your conclusions. We all know people who are willing to tell us all that they know, but unfortunately most of the people on message boards are willing to tell us much much MORE than they actually know, which in the end isn't all that helpful.