I want to redirect most of the files in http://www.oldsite.com/html/ to http://www.newsite.com/tutorials/html/. However, I want maybe five files in the old directory to redirect to http://www.newsite.com/tutorials/css/.
Right now I put this in oldsite's .htaccess file:
redirect /html http://www.newsite.com/tutorials/html
and this in newsite's .htaccess file:
RewriteEngine on
RewriteCond %{HTTP_HOST} ^www.newsite.com$
RewriteRule ^tutorials\/html\/1filename\.html$ "http\:\/\/www\.newsite\.com\/tutorials\/css\/1filename\.html" [R=301,L]
...etc...
RewriteRule ^tutorials\/html\/5filename\.html$ "http\:\/\/www\.newsite\.com\/tutorials\/css\/5filename\.html" [R=301,L]
Basically: the exceptions get redirected twice, and the second redirect is written out for each filename. This works, but I'm interested in learning the smart way to do it.
The reason I'm asking here (going as far as to create this account) is that I'm hoping you can not only write the code to do what I want, but explain the syntax afterwards: a little paragraph saying what each punctuation mark does in this case.
1. I now have one .htaccess file with
RewriteCond %{REQUEST_FILENAME} ^(file1.html or file2.html or file3.html)$
What symbol should I put in place of "or"?
2. How do I access that string variable (which may be either file1.html or file2.html...) in RewriteRule?
for ex, RewriteRule oldfolder/(FILENAME) [newsite...]
We don't provide a free code-writing service here. There are only a handful of contributors here, but hundreds of question-askers. Therefore, we will be happy to answer very-specific questions and to help you get your code working.
"Punctuation marks" -- See the regular-expressions tutorial and the mod_rewrite documentation cited in our Forum Charter [webmasterworld.com] for a good start.
Having done that, here are some additional hints:
Change the redirect on your old-domain server from a mod_alias Redirect directive to a mod_rewrite RewriteRule directive. Get that working (note that one or two additional directives may be needed to 'set up' mod_rewrite on that old-domain server), then add RewriteConds testing either %{REQUEST_URI} or a back-reference to the URL-path captured in the RewriteRule pattern and comparing it against a *negative* pattern to create exceptions to your rule. That is, "Rewrite /html to newsite if NOT this URL-path and NOT that URL-path" etc.
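As a rough, untested sketch of that approach: the two exception filenames below are the placeholders from your own rules, and the final rule sending the exceptions to the css directory is an assumption about where you want them to end up.

```apache
# May be needed on some hosts before mod_rewrite will run in .htaccess
Options +FollowSymLinks
RewriteEngine On

# Redirect /html/<anything> to the new site, unless <anything> is an exception.
# $1 is a back-reference to the part captured by the RewriteRule pattern.
# Multiple RewriteConds are ANDed together by default.
RewriteCond $1 !^1filename\.html$
RewriteCond $1 !^5filename\.html$
RewriteRule ^html/(.*)$ http://www.newsite.com/tutorials/html/$1 [R=301,L]

# The exceptions go to the css directory in a single redirect.
RewriteRule ^html/(1filename|5filename)\.html$ http://www.newsite.com/tutorials/css/$1.html [R=301,L]
```

The existing mod_alias Redirect line would have to be removed for this to behave as intended, because mod_alias and mod_rewrite process their directives independently of each other.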
For your reference, here is a cleaned-up version of one of your rules, which you can analyze using the resources cited above:
RewriteCond %{HTTP_HOST} ^www\.newsite\.com
RewriteRule ^tutorials/html/1filename\.html$ http://www.newsite.com/tutorials/css/1filename.html [R=301,L]
I also removed the end-anchor from your hostname pattern in the RewriteCond, so that hostnames requested in FQDN format and/or with port numbers appended will also match and get redirected. For example, http://www.example.com.:80 is a perfectly-valid value for %{HTTP_HOST}.
The end goal here is that for each unique URL requested from any of your sites, the result should be either the requested content with a 200-OK or 304-Not Modified response code, or a single 301-Moved Permanently redirect to a canonical URL that will return the originally-requested content with a 200-OK or 304-Not Modified response code. For each unique piece of 'content' -- whether it be a 'page' or an image or a media file, only one URL should be usable to access that content: Any change whatsoever in any of the characters seen in your browser's address bar constitutes an entirely-different URL, and any such URL variations should always result in a 301 redirect to the single correct/canonical URL for a given resource.
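For the hostname part of that goal, a common pattern looks something like the sketch below. It assumes www.newsite.com is the one canonical hostname and that the file sits in the document root of the new site; adjust to suit.

```apache
RewriteEngine On

# If the requested hostname is anything other than exactly www.newsite.com
# (no www, trailing dot, appended port number, and so on)...
RewriteCond %{HTTP_HOST} !^www\.newsite\.com$
# ...and a Host header was actually sent (old HTTP/1.0 clients may omit it)...
RewriteCond %{HTTP_HOST} .
# ...send one 301 redirect to the same URL-path on the canonical hostname.
RewriteRule ^(.*)$ http://www.newsite.com/$1 [R=301,L]
```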
Please do have a look at our Apache Forum Charter, our Apache Forum Library, and the site-search feature. These links are all at the top of this page. Oh, and...
Welcome to the lair.
Jim
2. %1 didn't seem to work for accessing one of the options, but you can achieve the same thing with
RewriteRule ^oldpath/?(.*)$ "http://www.newsite.com/tutorials/css/$1" [R=301,L]
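For anyone else puzzling over the two kinds of back-reference: %1 refers to a group captured by the last matched RewriteCond, while $1 refers to a group captured by the RewriteRule pattern itself. The vertical bar is the "or" symbol inside a group. A minimal sketch of both forms, using the placeholder names oldpath and file1 to file3 from above (either rule alone would do the job):

```apache
# %1 = first group captured by the last matched RewriteCond
RewriteCond %{REQUEST_URI} ^/oldpath/(file1|file2|file3)\.html$
RewriteRule ^oldpath/ http://www.newsite.com/tutorials/css/%1.html [R=301,L]

# $1 = first group captured by the RewriteRule pattern (no RewriteCond needed)
RewriteRule ^oldpath/(file1|file2|file3)\.html$ http://www.newsite.com/tutorials/css/$1.html [R=301,L]
```

One possible reason %1 appeared not to work: %{REQUEST_FILENAME} normally holds a full server filepath, so a pattern anchored as ^(file1.html|file2.html)$ would never match it, and %1 would stay empty.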
[edited by: monie at 8:09 am (utc) on Jan. 7, 2010]
The punctuation marks link is very helpful, thanks for pointing me that way!
And thanks g1smd!
Actually, your comment reminded me: for anyone searching for this on Google or elsewhere who is similarly new, my general lesson-learned advice is: think differently. Do you need RewriteCond? As someone with a Java background, 1. I was imagining the structure of my code as if it were object-oriented and 2. I was trying to use RewriteCond like the Java if-else statement.
I got it to work as such:
RewriteCond checks if filename = x [or]
RewriteCond checks if filename = y [or]
etc...
RewriteRule rewrites html/(.*) to http://www.example.com/directory/css/$1
RewriteCond checks if filename != x
RewriteCond checks if filename != y
etc...
RewriteRule rewrites (.*) to http://www.example.com/directory/$1
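Written out as real directives, that outline would look roughly like the sketch below. The names x.html and y.html stand in for the actual filenames, and it assumes the rules run on the old host only; on the target host itself the second rule would redirect in a loop.

```apache
RewriteEngine On

# First block: x.html OR y.html go to the css directory.
# The last condition in an [OR] chain must not carry the [OR] flag.
RewriteCond %{REQUEST_URI} ^/html/x\.html$ [OR]
RewriteCond %{REQUEST_URI} ^/html/y\.html$
RewriteRule ^html/(.*)$ http://www.example.com/directory/css/$1 [R=301,L]

# Second block: everything that is NOT x.html and NOT y.html.
# Conditions without [OR] are ANDed together.
RewriteCond %{REQUEST_URI} !^/html/x\.html$
RewriteCond %{REQUEST_URI} !^/html/y\.html$
RewriteRule ^(.*)$ http://www.example.com/directory/$1 [R=301,L]
```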
However, as I read the tutorial JD pointed me to, I used my .htaccess redirect to practice the regular-expressions syntax I was learning. My files happen to have numbers at the beginning according to what lesson they are: 3link.html, 5web.html, 16padmar.html.... so later files, which covered CSS and which were moved to a different directory, had larger numbers at the beginning of the filename.
Without using RewriteCond (except to check HTTP_HOST), I could say
RewriteRule html/(1[2-7]{1}.*) http://www.example.com/css/$1
I hadn't tried something sneaky like this before because, thinking in Java-mode, my gut-instinct was, "Any character any number of times? That must take FOREVER!"
But when I actually ran the second version, using fewer RewriteConds made pages redirect, and non-redirected pages load, MUCH faster. (Before, the load time was noticeable, about 2 seconds; now pages redirect immediately.) In retrospect, when I think about the hardware, that makes sense.
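Since I originally asked what each punctuation mark does, here is my reading of that numeric-prefix rule, piece by piece. This is a slightly tightened sketch rather than exactly what I ran: the anchors and the [R=301,L] flags are additions, and the {1} is dropped because a character class already matches exactly one character.

```apache
# ^        anchor: the match must start at the beginning of the URL-path
# html/    literal text
# (        start of capture group 1, later available as $1
# 1        a literal "1"
# [2-7]    exactly one character in the range 2 to 7
# .*       any character, zero or more times
# )        end of capture group 1
# $        anchor: the match must end at the end of the URL-path
RewriteRule ^html/(1[2-7].*)$ http://www.example.com/css/$1 [R=301,L]
```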
Reading through the archives, I've noticed that almost every time someone asks a question about .htaccess and their code includes RewriteCond, someone (usually JD or g1smd, ironically) swoops in and points out a way that they could have achieved the same thing using only RewriteRule.
So remember: any time you use a new programming language, you have to think according to and understand the STRUCTURE of that language.
So in essence, the order of processing is:
1) RewriteRule pattern
2) RewriteCond pattern(s) and/or condition(s)
3) RewriteRule substitution (if steps 1 & 2 match successfully)
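To make that order concrete, here is a small annotated sketch. The URL-paths and the target hostname are placeholders, not a recommendation for your site.

```apache
# (2) Tested second, and only if the pattern in (1) matched.
#     $1 here refers to what the RewriteRule pattern below captured.
RewriteCond $1 !^1[2-7]
# (1) The pattern ^html/(.*)$ is tested first, even though it appears
#     after the RewriteCond in the file.
# (3) The substitution is applied last, only if (1) and (2) both succeeded.
RewriteRule ^html/(.*)$ http://www.example.com/directory/$1 [R=301,L]
```

This is why a narrow RewriteRule pattern keeps the RewriteConds from being evaluated at all for most requests.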
It's unlikely that getting rid of one (or even ten) RewriteConds would speed up your code noticeably, so there may have been some other underlying problem there. But it never hurts to optimize the code from the start, so you don't have to do it "under duress" at a later time...
Be aware that you've used the "if filename =" terminology rather loosely above. To minimize confusion and errors, always keep in mind that RewriteRule examines requested URL-paths, and not the filepaths that those URLs may (later) resolve to. mod_rewrite effectively works at the time when requested URL-paths are being translated into server filepaths, so we don't yet really have 'filenames' to look at in this stage of the processing. RewriteConds, when configured to check %{REQUEST_FILENAME} or %{SCRIPT_FILENAME} (which are synonyms), are actually looking at the filepath/directory-path to which the requested URL-path would resolve by default without benefit of any rewriting that *may* occur as a result of running this rule.
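A short sketch of the difference, using placeholder paths: the first condition looks at the requested URL-path, the second at the server filepath that URL-path would map to.

```apache
# URL-path test: compares against what the client asked for
RewriteCond %{REQUEST_URI} ^/html/
# Filepath test: true if the filepath the URL maps to is not an existing file
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^html/(.*)$ http://www.example.com/directory/$1 [R=301,L]
```

Taken together, this would redirect only those /html/ requests for which no file exists at the mapped filepath on this server.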
Anyway, it's critical to maintaining sanity that URLs and filepaths be understood to be two completely-different and distinct things: URLs are used "out there on the Web" and filepaths are only used "inside this server." And mod_rewrite's job (*part of it) is to assist in the mapping of requested URLs to server filepaths.
* mod_rewrite can also do URL-to-URL redirects, invoke reverse-proxy through-puts, set server variables or client-side cookies, etc.
Jim
"any URL-path sub-string(s) matched by the RewriteRule pattern are available as a back-references to the RewriteConds (as $1 - $9)"
ooooh, that's helpful.
"It's unlikely that getting rid of one (or even ten) RewriteConds would speed up your code noticeably, so there may have been some other underlying problem there"
I wouldn't be surprised :) Perhaps the problem was in that the RewriteRule pattern always matched (even if it didn't substitute b/c of the RewriteConds)? Or would that not slow it down noticeably on this scale either?
"Anyway, it's critical to maintaining sanity that URLs and filepaths be understood to be two completely-different and distinct things: URLs are used "out there on the Web" and filepaths are only used "inside this server." And mod_rewrite's job (*part of it) is to assist in the mapping of requested URLs to server filepaths."
I actually didn't know that before; as I said, mostly a browser-side person. Thanks for clearing it up.
I do think in my case the RewriteConds were always being evaluated--if the pattern was .*, it would always match, correct?
Yes, so one thought process to add to the 'requirements' phase is to ask whether you really need to match *all* URL requests, or whether a more restrictive pattern (such as matching only a particular extension, only a certain folder, or only requests without an extension) might be in order.
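A few illustrative patterns, from widest to narrower. These are alternatives, so only one is left active here, and the target URLs are placeholders.

```apache
# Matches every request, so any attached RewriteConds run every time
# RewriteRule ^(.*)$ http://www.example.com/directory/$1 [R=301,L]

# Only URL-paths inside one folder
RewriteRule ^html/(.*)$ http://www.example.com/directory/html/$1 [R=301,L]

# Only URL-paths ending in a particular extension
# RewriteRule ^(.+\.html)$ http://www.example.com/directory/$1 [R=301,L]

# Only URL-paths without an extension (here: no dot anywhere in the path)
# RewriteRule ^([^.]+)$ http://www.example.com/directory/$1 [R=301,L]
```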