Forum Moderators: phranque

Tracking newsletter readers by redirect

But they are not listed in the referer log

         

jetteroheller

2:49 pm on May 17, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



There are several subdomains:

car.example.com
other-sub.example.com

There is a newsletter, and I want to track my newsletter readers discreetly.

So the URLs in the newsletter are coded like this:

http://www.example.com/news-car/2008/something.htm

On www.example.com, the Perl script that handles 404 errors contains this:

my $index_news = index( $error404::wanted, 'news-' );
if ( $index_news >= 0 )    # index() returns -1 when 'news-' is absent
{ # 2008-05-17 newsletter system
    # Everything after 'news-', e.g. 'car/2008/something.htm'
    my $news = substr( $error404::wanted, $index_news + 5 );
    # The first path segment is the subdomain, the remainder is the path
    my ( $sub, $rest ) = split( /\//, $news, 2 );
    my $news_url = "http://$sub.example.com/$rest";
    print "Status: 301 Moved Permanently\n";
    print "Location: $news_url\n\n";
    return;
} # 2008-05-17 newsletter system

This works fine and issues a permanent (301) redirect from

http://www.example.com/news-car/2008/something.htm

to

[car.example.com...]

But it does not show up in the referer log file of
car.example.com

Also, the access log entry on www.example.com is unexpected:

my.isp - - [17/May/2008:07:32:44 -0700] "GET /2008/something.htm HTTP/1.1" 404 5

Why is it not listed in the referer log of car.example.com?

jdMorgan

3:45 pm on May 17, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The whole concept is faulty. You are returning 404 errors to visitors (and more importantly, to search engines) for URLs that are (apparently) important enough that you need to track them.
Do not use a 404 handler for anything except pages that are missing for an unknown reason. To do so is to violate the HTTP protocol specification [w3.org], and you do so at your peril. Search engines expect your site to comply with that specification (and several others), and you will have major trouble if you don't.

In the mid-1990s, this "404-page as redirector" idea was invented for use on free hosting services such as GeoCities, which offered no other way to support dynamic content. Now that most hosts offer much better facilities, it is no longer needed, and because search engines know that, it is now also very dangerous to your search listings and rankings. This method is now (thankfully) quite dead. A 404 means a page is missing for unknown reasons, and nothing else.

HTTP referrers are NOT updated when a redirect occurs, on the (usually-good) assumption that you want to know where the traffic is coming from.

If you want to track these URLs, then do the following:

  • Delete your Perl 404 page.
  • Use mod_rewrite to rewrite the URL requests you want to track to a new Perl script.
  • Have the new Perl script write a new 'custom' logging file on the server, to log these requests directly.
  • After or while logging, have the Perl script #include the 'real' page, and output it directly to the browser -- No redirect is needed.

    You will need to observe file-locking semantics in this PERL script in order to avoid 'losing' log entries. Lock the output file, append the new entry, then unlock the output file. This will prevent multiple instances of the script from overwriting each other's entries. Do not allow the script to 'die' for any reason while the output file is locked. Pre-compute everything that you want to write to the new log file entry, so that the log file remains locked for the minimum possible amount of time.
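
The locking advice above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a finished tracking script: the helper names format_entry and append_entry and the tab-separated log layout are made up for this example. The entry is built before the file is opened, the append happens under an exclusive flock, and the helper returns 0 instead of dying if anything goes wrong.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Hypothetical helper: build the complete log line BEFORE taking the lock,
# so the file stays locked only for the append itself.
sub format_entry {
    my ( $remote, $uri, $referer ) = @_;
    my $stamp = scalar gmtime();
    return join( "\t", $stamp, $remote, $uri, $referer ) . "\n";
}

# Hypothetical helper: append one pre-computed line under an exclusive lock.
# Returns 1 on success and 0 on failure. It never dies, so a logging
# problem cannot abort the request while the file is locked.
sub append_entry {
    my ( $log_file, $entry ) = @_;
    open( my $fh, '>>', $log_file ) or return 0;
    if ( !flock( $fh, LOCK_EX ) ) {
        close($fh);
        return 0;
    }
    my $ok = print {$fh} $entry;
    # close() flushes the buffered line and releases the lock
    $ok = close($fh) && $ok;
    return $ok ? 1 : 0;
}
```

Inside the CGI script, one might call append_entry( $log_path, format_entry( $ENV{REMOTE_ADDR}, $ENV{REQUEST_URI}, $ENV{HTTP_REFERER} || '-' ) ) and then print the requested page, where $log_path is whatever writable location the host provides.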

    Jim

    jetteroheller

    8:33 pm on May 17, 2008 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



    Many thanks for this very useful answer.

    I have now nearly got it working with mod_rewrite:

    RewriteCond %{REQUEST_URI} news-car
    RewriteRule ^(.*) [car.example.org...] [R=301,L]

    But it still does not show up in the referer log file

    Also, I did something wrong with it.

    It should redirect

    www.example.com/news-car/message
    to
    car.example.com/message
    but instead it redirects to
    car.example.com/news-car/message

    jdMorgan

    8:40 pm on May 17, 2008 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member



    Delete the RewriteCond and just use:

    RewriteRule ^news-car/(.*)$ http://car.example.org/$1 [R=301,L]

    However, you missed the point of my previous post: Redirects will not be logged as coming from anything but the original referrer, regardless of whether you redirect the request.

    As described above, you need to internally rewrite these requests to a Perl script that opens a log file, logs the access, and then outputs the requested page.
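
As a rough sketch of such an internal rewrite, assuming a hypothetical script at /cgi-bin/track.pl and hypothetical query parameter names sub and path (neither comes from this thread):

```apache
RewriteEngine On
# Internal rewrite: there is no R flag, so the browser never sees a redirect.
# /cgi-bin/track.pl and the sub/path parameter names are assumptions.
RewriteRule ^news-([^/]+)/(.*)$ /cgi-bin/track.pl?sub=$1&path=$2 [L]
```

With this rule, a request for /news-car/2008/something.htm would reach the script with sub=car and path=2008/something.htm. The script would then log the hit and output the corresponding page itself.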

    Jim

    [edited by: jdMorgan at 8:41 pm (utc) on May 17, 2008]

    jetteroheller

    7:42 am on May 18, 2008 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



    However, you missed the point of my previous post: Redirects will not be logged as coming from anything but the original referrer, regardless of whether you redirect the request.

    But at least they show up in the access log file on www.example.com.

    jetteroheller

    9:51 am on May 18, 2008 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



    I am subscribed to a newsletter where this logging takes a few seconds.

    First I see a URL with my tracking number and the target; the page loads a few seconds later.

    So I have just decided against putting in so much effort when it can lengthen the page load time, especially when a simple redirect would cost only a few tenths of a second.