Forum Moderators: phranque

why doesn't this rewrite work?

trying to rewrite requests for nonexistent files/directories to homepage

         

mang

3:01 am on Sep 2, 2007 (gmt 0)

10+ Year Member



i'm setting up a site and am using a fairly large number of sources to promote it, so i thought the easiest thing to do would be to treat any request for a nonexistent file or directory as an indicator of the source of the click-through:

for instance, a request for '/yoursite' (where i have no such file or directory on my site as "yoursite") would be rewritten as '/index.php?src=yoursite' (index.php would then store the 'src' parameter in the user's session, and perhaps eventually in the customer database), but a request (subrequest?) for '/screen.css' (which is a file that exists on my site) would simply be fetched.

the rewrite rule (and conditions) that i thought would do that for me is:


RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^/(.*) /index.php?src=$1 [L]

but when i do that, i get an unstyled homepage with no images. i turned on the rewrite log, and here's what it says (in part):


init rewrite engine with requested uri /screen.css
rewrite '/screen.css' -> '/index.php?src=screen.css'
local path result: /index.php
prefixed with document_root to /path/to/docroot/index.php
go-ahead with /path/to/docroot/index.php [OK]
init rewrite engine with requested uri /images/buynow.gif
rewrite '/images/buynow.gif' -> '/index.php?src=images/buynow.gif'
local path result: /index.php
prefixed with document_root to /path/to/docroot/index.php
go-ahead with /path/to/docroot/index.php [OK]

this rewrite rule doesn't seem to work as i expect it to on either apache 1.3.37 (Unix) or 2.2.4 (Win32); can anyone help me with this?

btw: i'm using something like the following right now (which works on both platforms, but it's cumbersome, and will only get more so):


RewriteCond %{REQUEST_FILENAME} ^/yoursite [OR]
RewriteCond %{REQUEST_FILENAME} ^/hersite [OR]
RewriteCond %{REQUEST_FILENAME} ^/theirsite
RewriteRule ^/(.*) /index.php?src=$1 [L]

rocknbil

4:18 am on Sep 2, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



NVM, you're already doing what I suggested, sorry

jdMorgan

1:14 pm on Sep 2, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



mang wrote: "but when i do that, i get an unstyled homepage with no images. i turned on the rewrite log, and here's what it says (in part)"

Your rule is rewriting requests for images and scripts to index.php, as shown in the log. This is because it is the client (browser or robot) which resolves relative links.

Two choices:

1) Change the links for included objects on your pages to use server-relative or canonical URLs, rather than page-relative links.

-or-

2) Modify the rule so that requests for included-object URLs are not rewritten to index.php, but rather are stripped of their 'tracking' prefix directory if the requested URL does not resolve to an existing file.

In either case, you can select the included-object URLs based on various factors, such as exist/noexist, file extension, etc. Use whatever selection criteria you believe will be easiest to maintain over the long term.

Be aware that 'exists' checks are very expensive processing-wise, because each one requires a call to the operating system's filesystem, and they should be avoided whenever possible to minimize server performance impact. For example, you might qualify your existing rule by adding another RewriteCond preceding the file-exists checks, so that these checks are only done if the requested URL does not end in a slash or contain a period in the final path-part. You would therefore avoid doing the exists checks for most requests -- those for directories and included objects.

Jim

mang

3:14 pm on Sep 2, 2007 (gmt 0)

10+ Year Member



thanks for the help, jim, unfortunately, i'm still not getting it; i've spent the last hour banging my head against this, and i still can't get it to work.

the best regexes that i could think of to short-circuit the file and directory tests were

RewriteCond %{REQUEST_FILENAME} !\.
RewriteCond %{REQUEST_FILENAME} !/{2,}

to exclude paths with dots or paths with more than 1 slash (the other ones i tried still got hung up on the leading slash of the full path).

this combination works as expected with tracking codes, but it also turns existing directories and files into query strings that get passed to the index page:

init rewrite engine with requested uri /download/index.php
applying pattern '^/(.*)' to uri '/download/index.php'
rewrite '/download/index.php' -> '/index.php?src=download/index.php'
split uri=/index.php?src=download/index.php -> uri=/index.php, args=src=download/index.php
local path result: /index.php
prefixed with document_root to /path/to/docroot/index.php
go-ahead with /path/to/docroot/index.php [OK]

i also tried changing all the included file links to absolute links (i.e., 'http://mysiteurl/screen.css') and dropping the !\. and !/{2,} tests (i.e., just using the !-f and !-d tests), and it still rewrites existing directories as above, so the only thing i ever get is the homepage.

jdMorgan

5:29 pm on Sep 2, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member




# if the URL has neither a filetype in its final path-part nor a trailing slash
RewriteCond %{REQUEST_URI} !(\.[^/]+|/)$
# rewrite the request to index.php
RewriteRule ^/(.*)$ /index.php?src=$1 [L]

The alternation character in the RewriteCond must be a solid pipe "|". Posting on this forum modifies pipe characters, so it may be displayed as a broken pipe "¦"; replace it before use.

This just speeds up the code in comparison to doing a file-exists and directory-exists check. You will still have to address correction of the on-page relative links, or handle them by identifying and stripping the 'tracking' prefix path -- in your example, /yoursite, /hersite, or /theirsite -- from image and css requests.

That is, if you have an image link <img src="images/logo.gif"> on a page which is reached as /yoursite for tracking purposes, then the browser resolves that relative image URL against the URL of the page: it requests /images/logo.gif if the page URL was /yoursite, but /yoursite/images/logo.gif if the page URL was /yoursite/. So you should use server-relative image links such as <img src="/images/logo.gif"> or absolute (canonical) links such as <img src="http://www.example.com/images/logo.gif"> to avoid this problem.
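
If changing the links is not practical, the prefix-stripping alternative might look roughly like the sketch below. It assumes the rules live in the server or virtual-host configuration (as the leading slash in the patterns suggests), that a tracking prefix is always a single path segment, and that included objects can be recognized by file extension. The extension list is only an illustration, not something taken from the site in question.

```apache
# Sketch only: strip a one-segment tracking prefix from included-object requests.
# The extension list below is an assumed example; adjust it to the site's file types.
# Apply only when the URL as requested is not an existing file...
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-f
# ...but the same path without its first segment is an existing file.
RewriteCond %{DOCUMENT_ROOT}/$1 -f
RewriteRule ^/[^/]+/(.+\.(gif|jpe?g|png|css|js))$ /$1 [L]
```

A rule like this would need to come before the rule that rewrites to index.php, so that image and stylesheet requests are handled first.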

Jim

mang

6:44 pm on Sep 2, 2007 (gmt 0)

10+ Year Member



thanks again, jim; i appreciate the thoughtful responses.

this one works except if the directory in question (for instance, '/downloads') doesn't have a trailing slash. i tried removing the '¦/' at the end of your condition and adding the !-d condition back after it, but that didn't help, either.

maybe i should just settle for the clunky way, or use some other scheme entirely...
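
One possible explanation for the behavior seen throughout this thread: in server or virtual-host context (which the leading slash in these patterns suggests), mod_rewrite runs before the URL has been mapped to the filesystem, so %{REQUEST_FILENAME} still holds the URL-path. The -f and -d tests then look for something like '/screen.css' at the root of the filesystem, fail, and every request gets rewritten. A minimal sketch that builds the filesystem path explicitly, assuming the files are served directly from the DocumentRoot with no Alias or similar mapping involved:

```apache
# Sketch, assuming per-server context and content served straight from DocumentRoot.
# Test the real filesystem path instead of the not-yet-mapped REQUEST_FILENAME.
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-f
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-d
# Anything that is neither an existing file nor an existing directory is a tracking code.
RewriteRule ^/(.+)$ /index.php?src=$1 [L]
```

With these conditions, an existing directory requested without its trailing slash (for instance, '/downloads') should be left alone by the rule, and mod_dir would normally redirect it to the slash-terminated URL. The mod_rewrite documentation also describes a look-ahead form, %{LA-U:REQUEST_FILENAME}, for obtaining the mapped path in per-server context; it costs an internal subrequest for each check.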