Forum Moderators: phranque
I've decided to move all my stuff from /public_html/ to a subdirectory on the same server, /public_html/subdirectory/
I just want everything for
[mysite.com...]
to be served out of
[mysite.com...]
which is actually in
/public_html/subdirectoryname/
and because it's on the same server, I just want to use a rewrite, not a redirect.
Additionally, I would like the address in the browser address bar to remain [mysite.com...]. Will the above achieve this?
Here is my current best effort .htaccess:
RewriteEngine on
RewriteRule ^/$ /subdirectoryname/ [NC,R,L]
Do I need to add a condition? Shouldn't it work without one?
Not sure why I'm having trouble with something so simple, thanks for any assistance!
Mark
I think you'll need something akin to this:
# Set up
RewriteEngine on
# Redirect all direct requests for URLs including the foldername to non-folder URLs at www.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /subdirectoryname.*\ HTTP/
RewriteRule ^subdirectoryname/(.*)$ http://www.example.com/$1 [R=301,L]
# Redirect all non-www to www and preserve folder and file path.
RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
# Rewrite non-folder URLs (which are all www by now) to folder filepath.
RewriteCond %{THE_REQUEST} !(subdirectoryname)
RewriteRule ^(.*)$ /subdirectoryname/$1 [L]
There are a lot more steps in this than you might have first imagined. :-)
There are several more that you might later consider, including the stripping of unnecessary parameters, and catering for index file filename duplicate content issues too.
My brain is almost, but not quite, about to explode. If you have time I'd really appreciate it if you could confirm my understanding. Sorry if this is so detailed, but I really never anticipated this level of new stuff I had to learn to get this working!
1. The Apache docs state the syntax for RewriteCond as follows:
(quoting from [httpd.apache.org...])
Syntax: RewriteCond TestString CondPattern
But your example first line
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /subdirectoryname.*\ HTTP/
Looks to my untrained eyes like it's one test string, %{THE_REQUEST}, followed by three cond patterns. How does this work? Is it just one long condition (with a "\" preceding each space), or three space-delimited conditions? So spaces are escaped?
Is the pattern attempting to isolate "/index.html" from a THE_REQUEST thingo that looks like "GET /index.html HTTP/1.1"?
So for example one of these that it matches would be
RewriteCond %{THE_REQUEST} ^GET\ /shortcut.*\ HTTP/
But we need to use [A-Z]{3,9} to match between 3 and 9 capitalised characters, because we also need to match GET, POST, HEAD, PUT, etc, as per the protocols described at
[w3.org...]
That sounds right?
2. My understanding is that the CondPattern is supposed to be Perl compatible, so it should have delimiters. However, almost everywhere I've seen mod_rewrite rules, I've never seen delimiters. Is that because the delimiters are:
- not actually required
- usually "/" is used as a delimiter, so they're avoided because they would make the pattern look confusing because "/" appears so commonly in URLs it would need to be escaped constantly
- But as any delimiter can actually be used for a Perl Compatible regex, would this also work:
RewriteCond %{THE_REQUEST} |^[A-Z]{3,9}\ /subdirectoryname.*\ HTTP/|
3. RewriteRule ^subdirectory/(.*)$ http://www.example.com/$1 [R=301,L]
^subdirectory/(.*)$
So anything at all after "subdirectory/" is stored in variable $1 and the browser is given a "moved temporarily" instruction to go to www.example.com/$1
Does (.*) match "", ie an empty string too? (As . requires a single character but * matches zero or more, which takes precedence?)
Thanks very much for your example; I literally would never have got close without this. I'll hack away at the other rules for now...
Thank you
Mark
http://example.com/x
gives an internal server error, 500.
I'm on a shared host (I'm modifying .htaccess), and I get the feeling this is going to complicate things, as they're probably doing things in the background without my knowledge. One of the redirections is looking sick:
http://example.com/subdirectory is redirecting to
http://www.example.com//home/myuserid/public_html/subdirectory/
which then gives 404
<IfModule mod_rewrite.c>
RewriteEngine on
# Redirect all direct requests for URLs including the foldername to non-folder URLs at www.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /subdirectoryname.*\ HTTP/
RewriteRule ^subdirectoryname/(.*)$ http://www.example.com/$1 [R=301,L]
# Redirect all non-www to www and preserve folder and file path.
RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
# Rewrite non-folder URLs (which are all www by now) to folder filepath.
RewriteCond %{THE_REQUEST} !(subdirectoryname)
RewriteRule ^(.*)$ /subdirectoryname/$1 [L]
</IfModule>
2) The CondPattern must be mod_rewrite compatible. Forget everything else -- mod_rewrite is a stripped-down car chassis with a powerful engine and no safety equipment; it will pass no government-required safety inspections, it just goes fast. It is totally unforgiving of driver errors; most will be fatal. Do not feel at liberty to deviate in the smallest way from the examples given in the mod_rewrite documentation or the URL Rewriting Guide.
3) The browser (or more to the point, the search engine robot) is given a 301-Moved Permanently redirect, telling it to discard the URL including "/subdirectory" in the path, and list your page by its "short" URL.
Yes, the "." token means "any single character," and the "*" is a quantifier denoting "zero or more of the preceding character or parenthesized characters and/or patterns."
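If you want to convince yourself that (.*) also matches an empty string, here is a minimal sketch in Python rather than Apache. It assumes Python's re module behaves like mod_rewrite's regex engine for a pattern this simple, and the helper name captured_path is made up for illustration.

```python
import re

# RewriteRule pattern from the thread. In .htaccess the pattern is tested
# against the path with the leading slash already stripped.
RULE = re.compile(r"^subdirectory/(.*)$")

def captured_path(path):
    """Return what $1 would hold, or None if the pattern does not match."""
    m = RULE.match(path)
    return m.group(1) if m else None

if __name__ == "__main__":
    for path in ("subdirectory/", "subdirectory/a/b.html", "subdirectory", "other/"):
        print(repr(path), "->", repr(captured_path(path)))
```

A request for "subdirectory/" matches with $1 empty, while "subdirectory" without the trailing slash does not match at all, because the slash is a required literal in the pattern.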
The purpose of the first rule (301 redirect) is to tell the search engines to drop the /subdirectory path, so that it won't appear "in public" in their search results. Then the third rule does an internal rewrite to add the /subdirectory path back in when needed, but only inside the server -- where this action is not visible to clients.
The Charter for this forum contains links to several resources which you may find useful. Have fun! :)
Jim
The first two rules work, but the third one gives me an internal server error 500. Checked syntax, looks OK. I've even tried it by itself and pointed the browser to http://www.example.com/ only; it still gives a 500 error. Here is an example of it:
# Rewrite non-folder URLs (which are all www by now) to folder filepath.
RewriteCond %{THE_REQUEST} !(subdirectory)
RewriteRule ^(.*)$ /subdirectory/$1 [L] << this gives 500 error
Phew, almost there... well for now...
(Note jdmorgan, [httpd.apache.org...] states
"Remember: CondPattern is a perl compatible regular expression with some additions:")
RewriteCond %{THE_REQUEST} !(subdir)
RewriteRule ^(m.*)$ /subdir/$1 [L] < putting a character here works
so I can do
http://www.example.com/myfiles
and it's redirected to the stuff in /subdir/myfiles/ perfectly
As soon as I remove the "m" and make the rule
RewriteCond %{THE_REQUEST} !(subdir)
RewriteRule ^(.*)$ /subdir/$1 [L] < remove the character
http://www.example.com/myfiles gives a 500 error.
Help! Now I'm stuck. There is so little difference between
^(.*)$ and ^(m.*)$ that I've really got no idea what to do...
I've also tried
RewriteCond %{THE_REQUEST} !(subdir)
RewriteRule ^ /subdir/$1 [L]
gives the same 500 error.
You're right. This:
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /subdirectoryname.*\ HTTP/
is to match THE_REQUEST like this:
"GET /index.html HTTP/1.1"
The [A-Z]{3,9} matches the GET (or whatever) and the "\" escapes a literal space in the HTTP Request, etc.
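To see the pattern at work outside the server, here is a rough Python sketch. It is only an approximation: it assumes Python's re treats this pattern the same way mod_rewrite does, the sample request lines are invented, and the function name is hypothetical.

```python
import re

# Same pattern as the RewriteCond; "\ " is simply an escaped literal space,
# so the whole thing is ONE CondPattern, not three.
PATTERN = re.compile(r"^[A-Z]{3,9}\ /subdirectoryname.*\ HTTP/")

def is_direct_subdir_request(request_line):
    """True if the client's own request line asks for /subdirectoryname..."""
    return PATTERN.search(request_line) is not None

if __name__ == "__main__":
    samples = (
        "GET /subdirectoryname/index.html HTTP/1.1",
        "POST /subdirectoryname/form.php HTTP/1.1",
        "GET /index.html HTTP/1.1",
        "GET /subdirectorynameXYZ/index.html HTTP/1.1",
    )
    for line in samples:
        print(line, "->", is_direct_subdir_request(line))
```

Note that the last sample also matches, because nothing in the pattern requires a slash straight after the folder name. Ending the literal part with "/subdirectoryname/" is stricter.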
The first two rules issue a "Permanent Redirect" in the form of a 301 redirect. It's not a temporary redirect. That would be a fatal mistake.
Recursion is always a lurking problem, and the RewriteCond is vital as a match or negative match to stop that happening.
As jd said, there is very little error checking, and "Server Error 500" is something you'll regularly see while developing things.
One thing, you used the word "redirect" when describing the "rewrite". Be careful to use the right word to describe what you want. They are quite different, even if they only differ in syntax by an [R].
In this case, instead of THE_REQUEST would it be simpler to use REQUEST_URI?
THE_REQUEST needs lots of fun stuff to match the GET etc., but the part we want is the REQUEST_URI, where we don't need to do that.
THE_REQUEST:
GET /index.html HTTP/1.1
REQUEST_URI:
/index.html
so the rule to match THE_REQUEST is complicated by all the stuff to get around the GET and HTTP:
# Rewrite 1: Redirect all direct requests for URLs including the foldername to non-folder URLs at www.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /subdir.*\ HTTP/ <<< THIS LINE
RewriteRule ^subdir/(.*)$ http://www.example.com/$1 [R=301,L]
is this better and equivalent?:
# Rewrite 1: Redirect all direct requests for URLs including the foldername to non-folder URLs at www.
RewriteCond %{REQUEST_URI} ^/subdir.*\ <<< IS IT THE SAME AS THIS?
RewriteRule ^subdir/(.*)$ http://www.example.com/$1 [R=301,L]
Any reason to use THE_REQUEST instead?
The rewrite log shows:
/public_html/] RewriteCond: input='GET / HTTP/1.1' pattern='!/subdir/' => matched
/public_html/] rewrite '' -> '/subdir/'
So it matches the rule with or without the slash. The input is 'GET / HTTP/1.1' so the original !(subdir) and the new !/subdir/ both match this.
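So the problem is that we were using THE_REQUEST, which always holds the original request line sent by the client and never changes. The negative condition therefore keeps matching on every pass, and the rule keeps firing until the server gives up with a 500. We should have used REQUEST_URI, because it reflects the internally rewritten URI, so the condition fails once the path starts with /subdir/.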
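Here is a simplified Python model of what appears to be going on, in case it helps anyone else. It is a sketch under assumptions, not how Apache is implemented: it models only the third rule, it assumes .htaccess rules are re-run after every internal rewrite, and MAX_PASSES stands in for the server's internal recursion limit. All function names are invented.

```python
import re

MAX_PASSES = 10  # assumed stand-in for the server's internal recursion limit

def cond_the_request(request_line, uri):
    # RewriteCond %{THE_REQUEST} !(subdir)
    return re.search(r"(subdir)", request_line) is None

def cond_request_uri(request_line, uri):
    # RewriteCond %{REQUEST_URI} !^/subdir/
    return re.search(r"^/subdir/", uri) is None

def run_rule3(path, rule_pattern, cond):
    """Re-apply 'RewriteRule <rule_pattern> /subdir/$1 [L]' until it stops firing.

    Returns (final_uri, number_of_rewrites); raises RuntimeError to model the 500.
    """
    request_line = "GET " + path + " HTTP/1.1"  # THE_REQUEST: fixed for the whole request
    uri = path                                  # REQUEST_URI: changes with each rewrite
    for rewrites in range(MAX_PASSES):
        # Per-directory context: the pattern sees the path without its leading slash.
        m = re.search(rule_pattern, uri[1:])
        if not (m and cond(request_line, uri)):
            return uri, rewrites
        uri = "/subdir/" + m.group(1)
    raise RuntimeError("too many internal rewrites (the 500 error)")

if __name__ == "__main__":
    print(run_rule3("/myfiles", r"^(.*)$", cond_request_uri))
    print(run_rule3("/myfiles", r"^(m.*)$", cond_the_request))
    try:
        run_rule3("/myfiles", r"^(.*)$", cond_the_request)
    except RuntimeError as err:
        print("loop:", err)
```

In this model the ^(m.*)$ variant only "works" by accident: after the first rewrite the path is subdir/myfiles, which no longer starts with "m", so the rule stops firing. With ^(.*)$ and the THE_REQUEST condition nothing ever stops it.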
I found the answer at drbacchus, which looks like a pretty good resource, by the way: much better than most other stuff I've found, as it actually explains everything in lots of detail!
Thanks to jd and g1 for their input!
Based on what I've found at that wiki site, there are still a few more things to fix up before this is working robustly...
Who knows, once I get it working properly I might even add it to that wiki, and that way this question won't be asked three times a day on this forum ;)
So far, the working code I've added to .htaccess to redirect to a subdirectory is as follows (note this isn't complete; it needs more work and some parts can be simplified, e.g. rule 1 should be based on REQUEST_URI, I think):
<IfModule mod_rewrite.c>
RewriteEngine on
RewriteBase /
# Redirect all direct requests for URLs including the foldername to non-folder URLs at www.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /subdir.*\ HTTP/
RewriteRule ^subdir/(.*)$ http://www.example.com/$1 [R=301,L]
# Redirect all non-www to www and preserve folder and file path.
RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
# Rewrite non-folder URLs (which are all www by now) to folder filepath.
RewriteCond %{REQUEST_URI} !^/subdir/
RewriteRule ^(.*)$ /subdir/$1 [L]
</IfModule>
[edited by: jdMorgan at 2:48 am (utc) on Mar. 4, 2008]
[edit reason] No URLs, please. See TOS. [/edit]
I've found a few bad bugs in the "not working anyway" list of rules to rewrite (and redirect) to a subdirectory. Here is a version that appears to be a bit more robust (still needs more work... test test test)
<IfModule mod_rewrite.c>
RewriteEngine on
RewriteBase /
# Rewrite Rule 1: Redirect all direct requests for URLs including the foldername to non-folder URLs at www.
# Note we can't use REQUEST_URI because you get recursion. You only want to redirect the original string.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /subdir/.*\ HTTP/
RewriteRule ^subdir/(.*)$ http://www.example.com/$1 [R=301,L]
# Rewrite Rule 2: Redirect all non-www to www and preserve folder and file path.
RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
# Rewrite Rule 3: Rewrite non-folder URLs (which are all www by now) to folder filepath.
RewriteCond %{REQUEST_URI} !^/subdir/
RewriteRule ^(.*)$ /subdir/$1 [L]
</IfModule>
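A small Python sketch of why rule 1 has to test THE_REQUEST rather than REQUEST_URI. It is only a model under assumptions: it looks at the second pass through the rules, after rule 3 has already rewritten the path internally, and the function name is hypothetical.

```python
import re

THE_REQUEST_COND = r"^[A-Z]{3,9}\ /subdir/.*\ HTTP/"  # condition used in rule 1
REQUEST_URI_COND = r"^/subdir/"                       # the tempting alternative

def rule1_redirects_after_rewrite(client_path, use_request_uri):
    """Would rule 1 fire on the pass that follows rule 3's internal rewrite?"""
    the_request = "GET " + client_path + " HTTP/1.1"  # what the client sent; unchanged
    request_uri = "/subdir" + client_path             # what rule 3 turned it into
    if use_request_uri:
        return re.search(REQUEST_URI_COND, request_uri) is not None
    return re.search(THE_REQUEST_COND, the_request) is not None

if __name__ == "__main__":
    print("REQUEST_URI version fires:", rule1_redirects_after_rewrite("/page.html", True))
    print("THE_REQUEST version fires:", rule1_redirects_after_rewrite("/page.html", False))
```

With REQUEST_URI, the internally rewritten /subdir/page.html gets redirected straight back to /page.html, which is then rewritten again, and so on. With THE_REQUEST, only a client that actually typed /subdir/... gets the redirect.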
Glad you got it fixed, and next time I'll try to remember that problem.
But, as you can see, the eventual answer is a lot more complicated than the original question might lead you to believe... as you have to cater for much more than just the rewrite.
Additionally, for other sites, there may well need to be extra rules for dealing with parameters if you have those in the URLs you use, and so on.
If you don't want "www" to appear, try changing the rules as follows (please note this is not tested, you will have to test it and change for your circumstances). Also try to understand what the rules are doing, you will learn from it ;) Also, I found that setting up local Apache, and reading the Rewrite Log, really helped me learn what it is doing:
<IfModule mod_rewrite.c>
RewriteEngine on
RewriteBase /
# Rewrite Rule 1: Redirect all direct requests for URLs including the foldername to non-folder URLs at non-www.
# Note we can't use REQUEST_URI because you get recursion. You only want to redirect the original string.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /subdir/.*\ HTTP/
RewriteRule ^subdir/(.*)$ http://example.com/$1 [R=301,L]
# Rewrite Rule 2: Redirect all www to non-www and preserve folder and file path.
RewriteCond %{HTTP_HOST} ^www\.example\.com [NC]
RewriteRule ^(.*)$ http://example.com/$1 [R=301,L]
# Rewrite Rule 3: Rewrite non-folder URLs to folder filepath.
RewriteCond %{REQUEST_URI} !^/subdir/
RewriteRule ^(.*)$ /subdir/$1 [L]
</IfModule>