mod_rewrite and images

         

Robber

9:57 am on Mar 18, 2004 (gmt 0)

10+ Year Member



Just a quickie: I have written a rewrite that looks at the path to an image and redirects it to a script so I can output a custom image. E.g., a request for:

ht*p://www.abc.co.uk/images/blah/blah/_logo_/image.gif

gets redirected to

ht*p://www.abc.co.uk/images/display.php?_logo_=image.gif

The rewrite I am using works exactly as I wanted it to (changed to be generic):

RewriteCond %{REQUEST_URI} /images/(.*)_logo_/(.*)
RewriteRule ^(.*)$ [abc.co.uk...] [R=301]

But what I'm not sure about is whether I should be using a 301 redirect. I can't quite convince myself either way! I suppose I should, as I am saying that the image is no longer at the original location. But then again, if someone were to paste the path to the image into their browser, I would want them to see the original URL, not the rewritten one.

Any help clarifying this would be much appreciated

Thanks

uncle_bob

10:17 am on Mar 18, 2004 (gmt 0)

10+ Year Member



My understanding is that mod_rewrite only does a 301 if the rewrite starts with http.
If you rewrite to /example/script?oldpage=this, for example, it provides the rewritten content directly with no 301 redirect.
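As a rough sketch of that distinction (the paths and the hostname www.example.com are placeholders, not taken from this thread, and the rules assume httpd.conf context): a local path with no R flag is rewritten internally, while the R flag forces an external redirect, which is a 302 unless a status such as 301 is given. An absolute URL that points at a different host also results in an external redirect.

```apache
RewriteEngine On

# Internal rewrite: the target is a local URL-path and there is no R flag,
# so the client never sees the script URL.
RewriteRule ^/oldpage/(.*)$ /example/script?oldpage=$1 [L]

# External redirect: the R flag makes mod_rewrite answer with a redirect.
# Without an explicit status code, R means 302.
RewriteRule ^/moved/(.*)$ /new/$1 [R,L]

# An absolute URL on a different host (assumed here) also produces
# an external redirect, even without the R flag.
RewriteRule ^/elsewhere/(.*)$ http://www.example.com/$1 [L]
```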

jdMorgan

4:21 pm on Mar 18, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It all depends on your purpose.

If you are simply trying to use the script to serve images, but preserve a static-looking URL to make your site easy to spider or easier to maintain, then there is no need to tell the client that the resource (the image) has moved. A URL is simply an "internet address" to be used to locate a resource. It need not have anything to do with the filename on the server, and there is no requirement that you "expose" the underlying mechanisms you use to return content when that URL is requested.

At the same time, external redirects are expensive in terms of server requests and load times. With your code as written above, for each request for the static URL, your server will reply with a 301 redirect, giving the dynamic (script-based) URL for the requested image. The client (browser) will then have to re-request the image from the given URL. This means that every image request will result in two HTTP transactions, and that is very inefficient.

Using a 301 redirect will also result in search engine spiders dropping the static URL and listing the dynamic URL instead. This won't be a concern if you don't allow image robots to index your images, but it can be a disaster in cases where a 301 redirect is done on an HTML-type page which is actually script-generated.

So, to summarize, use a 301 redirect only when a resource has moved and you want to tell search engines and other user-agents to drop the old URL and start using the new URL. Use a 302 redirect when the resource has moved but is expected to return later, and you don't want user-agents to drop the old URL. In cases where you simply wish to use a script as part of the process of serving requested resources, use an internal rewrite, not a redirect. Something like:


RewriteRule ^images[^_]*_logo_/(.*)$ /images/display.php?_logo_=$1 [L]
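For contrast with the internal rewrite, the two redirect cases from the summary might look roughly like this. The directory names are made up for illustration, and the patterns assume an .htaccess file in the document root (so there is no leading slash in the pattern).

```apache
# Resource has moved for good: clients and spiders should switch to the new URL.
RewriteRule ^old-images/(.*)$ /images/$1 [R=301,L]

# Resource has moved temporarily: clients should keep using the old URL.
RewriteRule ^images/seasonal/(.*)$ /images/holding/$1 [R=302,L]
```

Each of these costs the client a second HTTP request, which is why neither fits the case where a script is only serving the image behind the scenes.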

Jim

Robber

5:11 pm on Mar 18, 2004 (gmt 0)

10+ Year Member



Cheers Jim,

That's crystal clear now, makes a lot of sense.

That also explains how you would have dynamic pages that you want to make look static, e.g. Amazon and indeed WW. They just parse the requested URL and internally rewrite it. Since it's an internal rewrite, the client doesn't know anything has been rewritten, and hence there is no redirecting of the browser.

Is that the correct interpretation?

I'll have these rewrites licked in no time now!

Robber

5:38 pm on Mar 18, 2004 (gmt 0)

10+ Year Member



One last thing then: I tweaked my rules, as I noticed you had done it on a single line (no RewriteCond):

RewriteRule ^images[^_]*_logo_/(.*)$ /images/display.php?_logo_=$1

But to get it to work I had to change it to:

RewriteRule images[^_]*_logo_/(.*)$ /images/display.php?_logo_=$1

I.e., I took out the ^. Would I be right in thinking that I needed to take it out because I am putting my rewrite rules in httpd.conf rather than .htaccess? In .htaccess the rewrite pattern starts with the directory where .htaccess resides, so we use ^ to signify this; in httpd.conf it starts with the ht*p..., so we don't use ^.

Just trying to make sure I know what I am doing!

jdMorgan

6:02 pm on Mar 18, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"^" is a regular-expressions start anchor. If you have your rules in httpd.conf, and the full local path starts with "/images", then you'd use:

RewriteRule ^/images[^_]*_logo_/(.*)$ /images/display.php?_logo_=$1

But if the images subdirectory is below some other directory, then you'd use:

RewriteRule ^/some_other_directory/images[^_]*_logo_/(.*)$ /images/display.php?_logo_=$1

This code is the same as that for use in .htaccess -- The only difference is that the leading slash is stripped off URLs as seen by RewriteRule in an .htaccess context. However, the leading slash is not stripped off the %{REQUEST_URI} variable in either context, and this often leads to confusion. Declaring a RewriteBase of "/" is a work-around if you want to make the code more "portable" between httpd.conf and .htaccess.
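To make the leading-slash difference concrete, here is the same rule written for each context, plus a variant with an optional leading slash. The optional-slash form is a common idiom rather than something discussed above, so treat it as one possible approach. These are alternatives; only one of them would be used in a given file.

```apache
# httpd.conf (server or <VirtualHost> context): the pattern sees the leading slash.
RewriteRule ^/images[^_]*_logo_/(.*)$ /images/display.php?_logo_=$1 [L]

# .htaccess in the document root: the leading slash has already been stripped.
RewriteRule ^images[^_]*_logo_/(.*)$ /images/display.php?_logo_=$1 [L]

# Optional leading slash: the same pattern matches in either context.
RewriteRule ^/?images[^_]*_logo_/(.*)$ /images/display.php?_logo_=$1 [L]
```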

It is good to use start and end anchors where possible, because they can speed up regular-expressions parsing significantly. The same can be said for avoiding the use of ".*" in the middle of a regex pattern. That's why I replaced it with "[^_]*" above: it simplifies the parser's job of matching the pattern because it is more specific.
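One consequence of that more specific subpattern is worth keeping in mind: "[^_]*" cannot cross an underscore, so the rule only matches when none of the directories before _logo_ contain one. A sketch of the trade-off, in httpd.conf form; the directory name my_dir and the second rule are illustrative assumptions, not part of the original setup:

```apache
# Matches       /images/blah/blah/_logo_/image.gif
# Does not match /images/my_dir/_logo_/image.gif  (underscore before _logo_)
RewriteRule ^/images[^_]*_logo_/(.*)$ /images/display.php?_logo_=$1 [L]

# If intermediate directories may contain underscores, a looser pattern is
# needed, at the cost of putting ".*" back in the middle. Note the filename
# is now the second captured group.
RewriteRule ^/images/(.*/)?_logo_/(.*)$ /images/display.php?_logo_=$2 [L]
```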

Ref:
[etext.lib.virginia.edu...]
[httpd.apache.org...]

Jim

Robber

7:12 pm on Mar 18, 2004 (gmt 0)

10+ Year Member



Cheers Jim, now I gotcha!

I was happy with what the ^ anchor was doing, it was the leading slash that was getting me - one for the notebook!

All working great now

Thanks again