Show 301 not 302...

Apache & Perl Script

         

rover

12:11 am on Mar 21, 2005 (gmt 0)

10+ Year Member



When someone clicks on the following link on one of my sites:

[thisdomain.com...]

It runs the following perl script:

-------------------------

#!/usr/local/bin/perl5

use CGI qw/:standard/;
$track_url = "http://www.otherdomain.com/urltracker.cgi";
$track_url .= "?id=" . param('id');
print "Location: $track_url\n\n";

-------------------------

Basically, it is redirecting to run the urltracker.cgi script which is actually on another domain (on the same server). It tracks the clickthrough and then redirects the user to the proper web page.

Because of various constraints for this site, I need to do it this way, and it functions without problem.

But, when I checked the server headers, I see that Apache is showing 302 FOUND for the redirect headers when it redirects to the other domain. (The final response header is 200 OK after it redirects to the site).

Does anyone know if there is any way to modify the perl code above to force apache to show a 301 Moved Permanently (instead of 302 Found) for the redirect headers? Or would something like this not be possible?

I've been seeing a lot of the posts about page hijacking and 302 redirects, and I don't want anyone to think that I'm trying to hijack a page, or actually inadvertently do so.

sitz

1:10 am on Mar 21, 2005 (gmt 0)

10+ Year Member



I'm fairly certain that won't do what you want; a 301 is a 'permanent' redirect, which means (or at least implies) that a returning visitor to your site won't hit the tracker at all; it will go directly to the target of your redirect. I suspect this isn't what you're after.

Incidentally, if the above is /all/ your script does (and you're just parsing the logs for the tracking info) you could accomplish the same thing using mod_rewrite:


RewriteEngine on
RewriteCond %{QUERY_STRING} !^$
RewriteCond %{QUERY_STRING} ^id=([0-9]+)$
RewriteRule ^/urltracker.cgi$ http://www.otherdomain.com/urltracker.cgi?id=%1 [L,R]

(Note that this assumes that the 'id' parameter is the *only* parameter, and that its value will always be numeric. Adjust as needed.)

The advantage here is one of performance. mod_perl notwithstanding, Perl CGIs don't scale all that well, and it's fairly easy to overwhelm a server making use of them. I know this through testing, and I've seen multi-server farms melt under high traffic to Perl CGIs before. Of course, if the Perl script is doing more complex things than simply issuing a redirect, the mod_rewrite solution is useless to you. =)

*All* that being said, and in answer to your original question, check 'perldoc CGI' for information on how to issue a redirect with the code of your choice.

One other thing; CGI.pm is a /monster/ of a Perl module. I mean huge. As a general rule, not just for this problem, if all you're doing is parsing incoming query strings and spitting out basic html, there are far faster ways to do it in perl. Example:


time for i in $(seq 1 40); do echo foo=bar | perl -lwe 'use CGI qw/:standard/; $t = new CGI(\*STDIN); $t->param("foo");'; done

real 0m1.935s
user 0m1.764s
sys 0m0.192s


That's the bare minimum code needed to parse an incoming string. 2 seconds for 40 iterations. Now see this one:

time for i in $(seq 1 40); do echo foo=bar | perl -le 'read(STDIN, $t, 9); chomp($t); @f = split(/=/, $t);'; done

real 0m0.143s
user 0m0.071s
sys 0m0.095s

More than 10x faster! Of course, when splitting more complex strings, it has to be done in a couple of stages, but the overwhelming majority of that time isn't being used at runtime; it's being used when perl compiles CGI.pm into bytecode:


time for i in $(seq 1 40); do perl -e 'use CGI qw/:standard/'; done

real 0m1.394s
user 0m1.238s
sys 0m0.154s

time for i in $(seq 1 40); do perl -e ''; done

real 0m0.111s
user 0m0.056s
sys 0m0.055s

Just something to keep in mind. =)
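
For what it's worth, here is a rough sketch of what a CGI.pm-free version of the redirect script might look like. It assumes the 'id' parameter is numeric and arrives in QUERY_STRING; the helper names extract_id and redirect_headers are made up for illustration, and the 400 response for a missing id is just one possible choice. The status code is set with the CGI "Status:" header, which the web server turns into the real HTTP status line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: pull a numeric "id" out of a raw query string
# such as "id=42" or "foo=bar&id=42". Returns undef if none is found.
sub extract_id {
    my ($query) = @_;
    return unless defined $query;
    for my $pair (split /[&;]/, $query) {
        my ($key, $value) = split /=/, $pair, 2;
        next unless defined $key && defined $value;
        return $value if $key eq 'id' && $value =~ /^[0-9]+$/;
    }
    return;
}

# Hypothetical helper: build the CGI response headers for a redirect.
# The "Status:" line is what selects 301 rather than the default 302.
sub redirect_headers {
    my ($status, $url) = @_;
    return "Status: $status\nLocation: $url\n\n";
}

my $id = extract_id($ENV{QUERY_STRING});
if (defined $id) {
    print redirect_headers('301 Moved Permanently',
        "http://www.otherdomain.com/urltracker.cgi?id=$id");
} else {
    print "Status: 400 Bad Request\nContent-Type: text/plain\n\n";
    print "Missing or invalid id\n";
}
```

Swapping '301 Moved Permanently' for '302 Found' gives the temporary redirect instead, so the choice of status stays a one-line change.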

rover

2:05 am on Mar 21, 2005 (gmt 0)

10+ Year Member



Thanks very much. I tried out the mod_rewrite rules and they work fine, so maybe it will be less load on the server. I really appreciate it.

With mod_rewrite it is still returning the same headers as before:

Received redirect headers:

HTTP/1.1 302 Found
(mod_rewrite redirect to the other domain)

Response headers received:

HTTP/1.1 200 OK
(after the user has been sent to the external site)

So, it is redirecting to the tracking script on our other domain and responding with redirect headers of 302 FOUND.

Then it is taking the user to the external site, with the final response headers of 200 OK.

Maybe this isn't a problem though. I don't know if this is what leads to the 302 redirect page hijacking problems many people are encountering, or not.

It just worries me to see the 302 redirect code in there, but the final response headers are 200 OK, so maybe it's not a problem? If anyone knows if this could potentially cause a page hijacking situation for the search engines, I would be very interested to find out.

sitz

2:35 am on Mar 21, 2005 (gmt 0)

10+ Year Member



The initial site is returning a 302 because, well, you're issuing a redirect. =) (302 is the server saying "I don't have the content you want, but I was told to tell you to look over *there*", followed by the server pointing helpfully to the correct URL). Once you follow the redirect, you get a 200 because the machine you were redirected to said "ah! I have the content you want! Here it is!" and returns the 200. Or am I missing something? =)
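
If you want to watch that hop-by-hop sequence yourself, something along these lines might do. It is only a sketch: it assumes HTTP::Tiny is available (it ships with recent Perls), that each Location header holds an absolute URL, and the names is_redirect and walk_redirects are invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Tiny;

# Hypothetical helper: true for the status codes that redirect via Location.
sub is_redirect {
    my ($status) = @_;
    return $status =~ /^(?:301|302|303|307|308)$/ ? 1 : 0;
}

# Request a URL one hop at a time, printing each status line along the way.
sub walk_redirects {
    my ($url, $max_hops) = @_;
    $max_hops = 5 unless defined $max_hops;
    # max_redirect => 0 stops HTTP::Tiny from following redirects itself.
    my $http = HTTP::Tiny->new(max_redirect => 0);
    for my $hop (1 .. $max_hops) {
        my $res = $http->head($url);
        print "$res->{status} $res->{reason}  $url\n";
        last unless is_redirect($res->{status})
            && defined $res->{headers}{location};
        # Assumption: Location is an absolute URL, so it can be requested as-is.
        $url = $res->{headers}{location};
    }
    return;
}

# Only touch the network when a URL is given on the command line.
walk_redirects($ARGV[0]) if @ARGV;
```

Run it with the redirecting URL as the only argument; each line of output is one response in the chain, ending at the first response that is not a redirect.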

jdMorgan

3:23 am on Mar 21, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The mod_rewrite code was posted as an equivalent of your script code, and so it does a 302, too. Just change the [R] flag on the rule slightly:

RewriteRule ^/urltracker.cgi$ http://www.otherdomain.com/urltracker.cgi?id=%1 [L,R=301]

Jim

rover

6:44 pm on Mar 21, 2005 (gmt 0)

10+ Year Member



Thanks very much. I appreciate it.

claus

7:29 pm on Mar 21, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It seems you did not get an answer to the question you asked. AFAIK, the lines below will do it:

-------------------------
#!/usr/local/bin/perl5

use CGI qw/:standard/;

# Build the target URL first, then send the redirect with an explicit status.
$track_url = "http://www.otherdomain.com/urltracker.cgi";
$track_url .= "?id=" . param('id');

print redirect(-location=>$track_url, -status=>'301 Moved Permanently');
-------------------------

Here, the visitor will get redirected with a 301 status code which will prevent your redirect script URL from being listed as the target page URL in the search engines ("hijack").

The tracker URL is the redirect target itself, so it has to be built before the redirect is printed; the click is still counted by urltracker.cgi on the other domain once the visitor follows the redirect.

more info about custom perl headers with CGI.pm [perldoc.com]

[edited by: claus at 7:36 pm (utc) on Mar. 21, 2005]

rover

7:35 pm on Mar 21, 2005 (gmt 0)

10+ Year Member



Great, Thanks!