I am kinda stumped as to how to track which links are the most popular on any given page on my site and would appreciate any advice or insight as to how to do this.
I can easily track how many times a given resource (such as a web page) is requested from my site: I pre-process every request by running it through a Perl tracking script that writes the request and other info to a database. But these requests can come from inside my site or from outside, for example through a search engine. So this tells me nothing about how many times a given link on a page is clicked relative to the other links on the same page.
I need some way to tell whether a resource request (URL) is coming from inside my site or from outside. The HTTP_REFERER environment variable is very unreliable and I do not want to depend on it.
About the only thing I could think of was to code all my site URLs and then strip the code off to get back a normal URL after recording the click. So for example, let's say I have a web page with a link on it. I could code the link like this...
<a href="/internal/new_page.html">Click here to go to the new page</a>
then my .htaccess file would direct to my tracking script like so...
RewriteRule ^internal/.*$ /cgi-bin/tracking.pl [L]
I would record the link from inside the tracking.pl script and then redirect to a regular URL without the "/internal".
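The record-then-redirect step described above could be sketched roughly as follows. This is written in Python for illustration, not Perl, and is only a sketch under stated assumptions: the function name `handle_tracked_click`, the in-memory `clicks` list standing in for the database, and the choice of a 302 status are all hypothetical and not taken from the actual tracking.pl. The request path is passed in as an argument; a CGI script under Apache would typically read it from REQUEST_URI.

```python
from urllib.parse import urlsplit

PREFIX = "/internal"  # marker prefix on coded on-site links (from the post)

clicks = []  # hypothetical stand-in for the tracking database


def handle_tracked_click(request_uri):
    """Record a click on an /internal/... link.

    Returns (status, location) for the redirect, or None if the
    request was not for a coded link.
    """
    parts = urlsplit(request_uri)
    if not parts.path.startswith(PREFIX + "/"):
        # Not a coded link: nothing to record, nothing to redirect.
        return None

    # "/internal/new_page.html" -> "/new_page.html"
    target = parts.path[len(PREFIX):]
    if parts.query:
        # Carry any query string over to the regular URL.
        target += "?" + parts.query

    clicks.append(target)  # assumption: one record per click
    return 302, target
```

Calling `handle_tracked_click("/internal/new_page.html")` records the click and returns the regular URL to redirect to, while a request that arrives without the prefix (say, from a search engine) is left alone.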
It would seem that only those clicks that are initiated with "/internal" would be tracked (which is what I want in this case), and search engines and others would only see the URLs without the "/internal".
Any thoughts on the above or on a better way to do this?
You'd be barring spiders from /internal/ using robots.txt I assume? Otherwise you could run into duplicate content issues if the page can also be reached without the /internal prefix. Other than that, your solution seems to be perfectly okay...
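As a rough illustration of the robots.txt idea, the sketch below feeds a two-line robots.txt to Python's standard `urllib.robotparser` and checks which paths a compliant spider would be allowed to fetch. The rule text and the bot name `ExampleBot` are assumptions made up for the example, not something taken from the site in question.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt contents: keep all spiders out of /internal/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /internal/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The coded URL is off limits; the regular URL is still crawlable.
blocked = not parser.can_fetch("ExampleBot", "/internal/new_page.html")
crawlable = parser.can_fetch("ExampleBot", "/new_page.html")
```

With that rule in place, only the version of the page without the prefix is open to well-behaved spiders.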
Just curious though - what's your reasoning behind using mod_rewrite rather than coding your links directly through the tracking script?
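For comparison, coding the links directly through the tracking script might look something like the sketch below (in Python for illustration, not the Perl of the real script). The query parameter name `url` and the helper names `tracked_href` and `resolve_click` are hypothetical. The on-site check is an added precaution so the script cannot be used to redirect visitors to arbitrary outside addresses; it is a simple check, not a complete defense.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

TRACKER = "/cgi-bin/tracking.pl"  # script path from the post


def tracked_href(target):
    """Build the href to put in the page for an on-site target path."""
    return TRACKER + "?" + urlencode({"url": target})


def resolve_click(href):
    """Return the on-site path to redirect to, or None if it looks unsafe."""
    query = parse_qs(urlsplit(href).query)
    target = query.get("url", [""])[0]
    # Accept only local paths: must start with a single "/".
    if not target.startswith("/") or target.startswith("//"):
        return None
    return target
```

A page would then link to `tracked_href("/new_page.html")`, and the script would record the click and redirect to whatever `resolve_click` returns.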
I didn't think it would be a problem in terms of being flagged by the search engines for duplicate content, since the use was legitimate and since I had implemented something similar a couple of months ago to let a user change resolution. Now that I think about it, I guess search engines like Google might automatically assume that I am spamming. Good observation. I will have to alter my strategy.
I use mod_rewrite so much that I hadn't even thought of coding my links to go directly through the tracking script. I will have to evaluate that to see if it might be workable for me. It would certainly be faster than adding yet another mod_rewrite rule.
I do like the way everything goes through the .htaccess file, however. There I can easily redirect things to different scripts and locations as the underlying names of scripts, their locations, and other physical details change as my site grows. I tend to make a lot of changes from time to time, and most of my site links are virtual in the sense that they have no direct mapping to the underlying physical resource names. mod_rewrite, Redirect, and other Apache directives in the .htaccess file absorb those changes for me.