Can't get this htaccess rewrite to work

htaccess subdirectory

         

markchicobaby

6:53 am on Mar 1, 2008 (gmt 0)

10+ Year Member



Hi, I've got a really simple requirement, and I'm not sure why it doesn't work.

I've decided to move all my stuff from /public_html/ to a subdirectory on the same server, /public_html/subdirectory/

I just want everything for
[mysite.com...]

to be served out of
[mysite.com...]
which is actually in
/public_html/subdirectoryname/

and because it's on the same server, I just want to use a rewrite, not a redirect.

Additionally, I would like the address in the browser address bar to remain [mysite.com...]. Will the above achieve this?

Here is my current best effort .htaccess:

RewriteEngine on
RewriteRule ^/$ /subdirectoryname/ [NC,R,L]

Do I need to add a condition? Shouldn't it work without one?

Not sure why I'm having trouble with something so simple, thanks for any assistance!

Mark

g1smd

10:21 pm on Mar 1, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You asked for a rewrite, but the [R] flag forces a 302 redirect. In .htaccess, the URL path that the RewriteRule pattern is matched against does not "see" the initial leading "/", so ^/$ never matches. You also need to grab the rest of the path into (.*) or similar, and re-use it as $1 in the substitution.

I think you'll need something akin to this:


# Set up

RewriteEngine on

# Redirect all direct requests for URLs including the foldername to non-folder URLs at www.

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /subdirectoryname.*\ HTTP/
RewriteRule ^subdirectoryname/(.*)$ http://www.example.com/$1 [R=301,L]

# Redirect all non-www to www and preserve folder and file path.

RewriteCond %{HTTP_HOST} ^yoursite\.com [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# Rewrite non-folder URLs (which are all www by now) to folder filepath.

RewriteCond %{THE_REQUEST} !(subdirectoryname)
RewriteRule ^(.*)$ /subdirectoryname/$1 [L]

There are a lot more steps in this than you might have first imagined. :-)

There are several more that you might later consider, including the stripping of unnecessary parameters, and catering for index file filename duplicate content issues too.
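
As a rough idea of what the index file step could look like, here is an untested sketch. It assumes the index file is called index.html and that www.example.com is the canonical hostname; neither is confirmed for the original poster's site.

```apache
# Hypothetical sketch: redirect direct client requests for index.html
# (in the root or in any folder) to the bare folder URL.
# THE_REQUEST is tested so that only what the browser actually asked for
# triggers the redirect, not an index file the server selects internally.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.html(\?[^\ ]*)?\ HTTP/
RewriteRule ^(([^/]+/)*)index\.html$ http://www.example.com/$1 [R=301,L]
```

Like the other redirects, this would need to sit above the internal rewrite so that it is checked first.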

g1smd

6:35 pm on Mar 2, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Typo: replace yoursite with example in the above code.

markchicobaby

1:29 am on Mar 3, 2008 (gmt 0)

10+ Year Member



Thanks, I'm working through your example.

My brain is almost, but not quite, about to explode. If you have time I'd really appreciate it if you could confirm my understanding. Sorry if this is so detailed, but I really never anticipated this level of new stuff I had to learn to get this working!

1. The Apache docs state the syntax for RewriteCond as follows:
(quoting from [httpd.apache.org...])
Syntax: RewriteCond TestString CondPattern

But your example first line
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /subdirectoryname.*\ HTTP/

To my untrained eyes this looks like one test string, %{THE_REQUEST}, followed by three cond patterns. How does this work? Is it just one long condition (with a "\" preceding each space), or three sets of space-delimited conditions? So spaces are escaped?
Is the pattern attempting to isolate "/index.html" from a THE_REQUEST thingo that looks like "GET /index.html HTTP/1.1"?
So for example one of these that it matches would be
RewriteCond %{THE_REQUEST} ^GET\ /shortcut.*\ HTTP/

But we need to use [A-Z]{3,9} to match between 3 and 9 capitalised characters, because we also need to match GET, POST, HEAD, PUT, etc, as per the protocols described at
[w3.org...]

That sounds right?

2. My understanding is the Condpattern is supposed to be Perl compatible, so it should have delimiters. However almost everywhere I've seen mod_rewrite rules, I've never seen delimiters. Is that because the delimiters are:
- not actually required
- usually "/" is used as a delimiter, so they're avoided because they would make the pattern look confusing because "/" appears so commonly in URLs it would need to be escaped constantly
- But as any delimiter can actually be used for a Perl Compatible regex, would this also work:
RewriteCond %{THE_REQUEST} |^[A-Z]{3,9}\ /subdirectoryname.*\ HTTP/|

3. RewriteRule ^subdirectory/(.*)$ http://www.example.com/$1 [R=301,L]

^subdirectory/(.*)$

So anything at all after "subdirectory/" is stored in variable $1 and the browser is given a "moved temporarily" instruction to go to www.example.com/$1

Does (.*) match "", ie an empty string too? (As . requires a single character but * matches zero or more, which takes precedence?)

Thanks very much for your example, I literally would never have got close without this. I'll hack away at the other rules for now...

Thank you
Mark

markchicobaby

2:16 am on Mar 3, 2008 (gmt 0)

10+ Year Member



OK, I have tried to implement it, but it only partially works. Even a fairly simple URL breaks it: when I add a nonexistent directory "x" to example.com, it gives a 500 error (I would expect a 404):

http://example.com/x

gives an internal server error, 500.

I'm on a shared host (and am modifying .htaccess). I get the feeling this is going to complicate things, as they're probably doing things in the background without my knowledge. One of the redirections is looking sick:

http://example.com/subdirectory is redirecting to

http://www.example.com//home/myuserid/public_html/subdirectory/

which then gives 404

<IfModule mod_rewrite.c>
RewriteEngine on

# Redirect all direct requests for URLs including the foldername to non-folder URLs at www.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /subdirectoryname.*\ HTTP/
RewriteRule ^subdirectoryname/(.*)$ http://www.example.com/$1 [R=301,L]

# Redirect all non-www to www and preserve folder and file path.
RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# Rewrite non-folder URLs (which are all www by now) to folder filepath.
RewriteCond %{THE_REQUEST} !(subdirectoryname)
RewriteRule ^(.*)$ /subdirectoryname/$1 [L]

</IfModule>

jdMorgan

2:20 am on Mar 3, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



1) Any unescaped (and unquoted) space denotes the end of one argument and/or the beginning of the next. The backslash-escaped spaces in the pattern are therefore part of a single CondPattern, and the code conforms perfectly with the prescribed RewriteCond syntax.

2) The condpattern must be mod_rewrite compatible. Forget everything else -- mod_rewrite is a stripped-down car chassis with a powerful engine and no safety equipment; it will pass no government-required safety inspections, it just goes fast. It is totally unforgiving of driver errors; most will be fatal. Do not feel at liberty to deviate in the smallest way from the examples given in the mod_rewrite documentation or the URL Rewriting Guide.

3) The browser (or more to the point, the search engine robot) is given a 301-Moved Permanently redirect, telling it to discard the URL including "/subdirectory" in the path, and list your page by its "short" URL.
Yes, the "." token means "any single character," and the "*" is a quantifier denoting "zero or more of the preceding character or parenthesized characters and/or patterns."

The purpose of the first rule (301 redirect) is to tell the search engines to drop the /subdirectory path, so that it won't appear "in public" in their search results. Then the third rule does an internal rewrite to add the /subdirectory path back in when needed, but only inside the server -- where this action is not visible to clients.
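
A small annotated sketch of the points above, using the same "subdirectoryname" placeholder. The quoted form is an alternative spelling that is not used elsewhere in this thread, shown only for comparison.

```apache
# Escaped spaces: each backslash keeps the following space inside the pattern,
# so this is still "RewriteCond TestString CondPattern" with ONE CondPattern.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /subdirectoryname/.*\ HTTP/

# Alternative spelling: quoting the whole pattern also keeps it as one argument.
# RewriteCond %{THE_REQUEST} "^[A-Z]{3,9} /subdirectoryname/.* HTTP/"

# (.*) matches zero or more characters, so a request for /subdirectoryname/
# on its own still matches, with $1 empty, and is redirected to
# http://www.example.com/
RewriteRule ^subdirectoryname/(.*)$ http://www.example.com/$1 [R=301,L]
```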

The Charter for this forum contains links to several resources which you may find useful. Have fun! :)

Jim

markchicobaby

3:32 am on Mar 3, 2008 (gmt 0)

10+ Year Member



Thanks for your feedback, I'm really close now. I've got the first two rules working, but I think because I'm on a shared server something funny is going on. Correct me if I'm wrong please...

The first two rules work, but the third one gives me an internal server error 500. I checked the syntax and it looks OK. I've even tried it by itself and pointed the browser to http://www.example.com/ only; it still gives a 500 error. Here is an example of it:

# Rewrite non-folder URLs (which are all www by now) to folder filepath.
RewriteCond %{THE_REQUEST} !(subdirectory)
RewriteRule ^(.*)$ /subdirectory/$1 [L] << this gives 500 error

Phew, almost there... well for now...

(Note jdmorgan, [httpd.apache.org...] states
"Remember: CondPattern is a perl compatible regular expression with some additions:")

markchicobaby

4:22 am on Mar 3, 2008 (gmt 0)

10+ Year Member



OK some more information. The third rule

RewriteCond %{THE_REQUEST} !(subdir)
RewriteRule ^(m.*)$ /subdir/$1 [L] < putting a character here works

so I can do
http://www.example.com/myfiles

and it's redirected to the stuff in /subdir/myfiles/ perfectly

As soon as I remove the "m" and make the rule

RewriteCond %{THE_REQUEST} !(subdir)
RewriteRule ^(.*)$ /subdir/$1 [L] < remove the character

http://www.example.com/myfiles gives a 500 error.

Help! Now I'm stuck. There is so little difference between
^(.*)$ and ^(m.*)$ that I've really got no idea what to do...

I've also tried

RewriteCond %{THE_REQUEST} !(subdir)
RewriteRule ^ /subdir/$1 [L]

gives the same 500 error.

markchicobaby

10:56 am on Mar 3, 2008 (gmt 0)

10+ Year Member



OK, I know why it isn't working. The last rule is causing recursion. I'll post as a new post as it's basically a new problem.

g1smd

7:54 pm on Mar 3, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



OK. My code was untested, and a best guess. I'll need jd to spot the error I made. I can't see it at present.

You're right. This:
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /subdirectoryname.*\ HTTP/

is to match THE_REQUEST like this:
"GET /index.html HTTP/1.1"

The [A-Z]{3,9} matches the GET (or whatever) and the "\" escapes a literal space in the HTTP Request, etc.

The first two rules issue a "Permanent Redirect" in the form of a 301 redirect. It's not a temporary redirect. That would be a fatal mistake.

Recursion is always a lurking problem, and the RewriteCond is vital as a match or negative match to stop that happening.

As jd said, there is very little error checking, and "Server Error 500" is something you'll regularly see while developing things.

One thing, you used the word "redirect" when describing the "rewrite". Be careful to use the right word to describe what you want. They are quite different, even if they only differ in syntax by an [R].
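
To make the difference concrete, here is a hedged sketch. The "old/" and "new/" path names are made up for illustration, and the two rules are alternatives: you would use one or the other, not both.

```apache
# External redirect: the server answers with a 301 and a Location header,
# the browser makes a second request, and the address bar changes.
RewriteRule ^old/(.*)$ http://www.example.com/new/$1 [R=301,L]

# Internal rewrite: the server silently serves content from a different path,
# and the address bar keeps the URL the visitor originally requested.
# RewriteRule ^old/(.*)$ /new/$1 [L]
```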

jdMorgan

8:34 pm on Mar 3, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Missing leading slash on THE_REQUEST match pattern defeats anti-looping:

RewriteCond %{THE_REQUEST} !/subdir/
RewriteRule ^(m.*)$ /subdir/$1 [L]

Jim

markchicobaby

2:01 am on Mar 4, 2008 (gmt 0)

10+ Year Member



Thanks g1smd, quick question:

In this case, instead of THE_REQUEST would it be simpler to use REQUEST_URI?

THE_REQUEST needs lots of fun stuff to match the GET etc, but the part we want is the

REQUEST_URI where we don't need to do that.

THE_REQUEST:
GET /index.html HTTP/1.1

REQUEST_URI
/index.html

so the rule to match THE_REQUEST is complicated by all the stuff to get around the GET and HTTP:
# Rewrite 1: Redirect all direct requests for URLs including the foldername to non-folder URLs at www.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /subdir.*\ HTTP/ <<< THIS LINE
RewriteRule ^subdir/(.*)$ http://www.example.com/$1 [R=301,L]

is this better and equivalent?:
# Rewrite 1: Redirect all direct requests for URLs including the foldername to non-folder URLs at www.
RewriteCond %{REQUEST_URI} ^/subdir.*\ <<< IS IT THE SAME AS THIS?
RewriteRule ^subdir/(.*)$ http://www.example.com/$1 [R=301,L]

Any reason to use THE_REQUEST instead?

markchicobaby

2:37 am on Mar 4, 2008 (gmt 0)

10+ Year Member



Hi, thanks jd for your post. Unfortunately the forward slash doesn't stop the recursion, because we were mistakenly using THE_REQUEST instead of REQUEST_URI.

The rewrite log shows:
/public_html/] RewriteCond: input='GET / HTTP/1.1' pattern='!/subdir/' => matched
/public_html/] rewrite '' -> '/subdir/'

So the condition is satisfied with or without the slash. The input is always the original request line, 'GET / HTTP/1.1', which never contains "subdir", so the original !(subdir) and the new !/subdir/ both succeed on every pass.

So the problem is that we were using THE_REQUEST, but we should have used REQUEST_URI, because it reflects the new, internally rewritten URI rather than what the browser originally sent.
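
The difference can be sketched as an annotated trace. The request for /page.html is a made-up example; the variable values in the comments are what I would expect based on the rewrite log above and on .htaccess rules being run again after an internal rewrite.

```apache
# First pass, browser asks for http://www.example.com/page.html:
#   THE_REQUEST = "GET /page.html HTTP/1.1"
#   REQUEST_URI = "/page.html"          -> condition is true, the rule fires
#
# Second pass, after the internal rewrite to /subdir/page.html:
#   THE_REQUEST = "GET /page.html HTTP/1.1"   (still the original request line)
#   REQUEST_URI = "/subdir/page.html"   -> condition is false, the rule is skipped
RewriteCond %{REQUEST_URI} !^/subdir/
RewriteRule ^(.*)$ /subdir/$1 [L]
```

With THE_REQUEST in the condition instead, the second pass would look exactly like the first, which is where the loop and the 500 error come from.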

I found the answer at drbacchus, which looks like a pretty good resource, much better than most other stuff I've found, as it actually explains everything in lots of detail!

Thanks to jd and g1 for their input!

Based on what I've found at that wiki site, there are still a few more things to fix up before this is working robustly...

Who knows, once I get it working properly I might even add it to that wiki, and that way this question won't be asked three times a day on this forum ;)

So far my working code added to .htaccess to redirect to a subdirectory is (note this isn't complete, needs more work, and some parts can be simplified, e.g. rule 1 should be based on REQUEST_URI, I think):

<IfModule mod_rewrite.c>
RewriteEngine on
RewriteBase /

# Redirect all direct requests for URLs including the foldername to non-folder URLs at www.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /subdir.*\ HTTP/
RewriteRule ^subdir/(.*)$ http://www.example.com/$1 [R=301,L]

# Redirect all non-www to www and preserve folder and file path.
RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# Rewrite non-folder URLs (which are all www by now) to folder filepath.
RewriteCond %{REQUEST_URI} !^/subdir/
RewriteRule ^(.*)$ /subdir/$1 [L]

</IfModule>

[edited by: jdMorgan at 2:48 am (utc) on Mar. 4, 2008]
[edit reason] No URLs, please. See TOS. [/edit]

markchicobaby

3:27 am on Mar 4, 2008 (gmt 0)

10+ Year Member



UPDATE: g1smd, I worked out the reason not to use REQUEST_URI for rule1; it causes recursion. For some reason I can't edit my above post.

I've found a few bad bugs in the "not working anyway" list of rules to rewrite (and redirect) to a subdirectory. Here is a version that appears to be a bit more robust (still needs more work... test test test)

<IfModule mod_rewrite.c>
RewriteEngine on
RewriteBase /

# Rewrite Rule 1: Redirect all direct requests for URLs including the foldername to non-folder URLs at www.
# Note we can't use REQUEST_URI because you get recursion. You only want to redirect the original string.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /subdir/.*\ HTTP/
RewriteRule ^subdir/(.*)$ http://www.example.com/$1 [R=301,L]

# Rewrite Rule 2: Redirect all non-www to www and preserve folder and file path.
RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# Rewrite Rule 3: Rewrite non-folder URLs (which are all www by now) to folder filepath.
RewriteCond %{REQUEST_URI} !^/subdir/
RewriteRule ^(.*)$ /subdir/$1 [L]

</IfModule>

g1smd

12:14 pm on Mar 4, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Something was nagging me to use REQUEST_URI and I strenuously ignored it.

Glad you got it fixed, and next time I'll try to remember that problem.

But, as you can see, the eventual answer is a lot more complicated than the original question might lead you to believe... as you have to cater for much more than just the rewrite.

Additionally, for other sites, there may well need to be extra rules for dealing with parameters if you have those in the URLs you use, and so on.

bebopcool

11:57 pm on Apr 18, 2008 (gmt 0)

10+ Year Member



Could you please help me with this code?

I need it for an installation of aroundme on a small website.

I am not used to .htaccess rewriting.

Could you confirm that I need to replace every "example" with my domain name

and every "subdir" with the subdirectory I want to redirect to?

Sincerely yours

bebopcool

11:14 am on Apr 19, 2008 (gmt 0)

10+ Year Member



I achieved the redirection from

www.mysite/

to www.mysite/subdir

but what should I do if I do not want any www. to appear?

markchicobaby

1:43 pm on Apr 20, 2008 (gmt 0)

10+ Year Member



Yes, on this forum you must not post real domain names, only "www.example.com". And yes, "subdir" becomes your subdirectory name.

If you don't want "www" to appear, try changing the rules as follows (please note this is not tested, you will have to test it and change for your circumstances). Also try to understand what the rules are doing, you will learn from it ;) Also, I found that setting up local Apache, and reading the Rewrite Log, really helped me learn what it is doing:

<IfModule mod_rewrite.c>
RewriteEngine on
RewriteBase /

# Rewrite Rule 1: Redirect all direct requests for URLs including the foldername to non-folder URLs at non-www.
# Note we can't use REQUEST_URI because you get recursion. You only want to redirect the original string.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /subdir/.*\ HTTP/
RewriteRule ^subdir/(.*)$ http://example.com/$1 [R=301,L]

# Rewrite Rule 2: Redirect all www to non-www and preserve folder and file path.
RewriteCond %{HTTP_HOST} ^www.^example\.com [NC]
RewriteRule ^(.*)$ http://example.com/$1 [R=301,L]

# Rewrite Rule 3: Rewrite non-folder URLs to folder filepath.
RewriteCond %{REQUEST_URI} !^/subdir/
RewriteRule ^(.*)$ /subdir/$1 [L]

</IfModule>
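
For anyone wanting to try the rewrite log on a local test server, a minimal sketch follows. The log path is a made-up example, and these directives belong in the main server config (httpd.conf or a VirtualHost), not in .htaccess, which is why this is usually not possible on a shared host.

```apache
# Apache 2.2 and earlier.
# Level 3 is an arbitrary choice; higher levels log more detail.
RewriteLog "logs/rewrite.log"
RewriteLogLevel 3

# Apache 2.4 removed RewriteLog; the equivalent there is per-module logging:
# LogLevel alert rewrite:trace3
```

Turn it off again when you are done, as the log grows quickly.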

markchicobaby

1:44 pm on Apr 20, 2008 (gmt 0)

10+ Year Member



Sorry line
RewriteCond %{HTTP_HOST} ^www.^example\.com [NC]

should be
RewriteCond %{HTTP_HOST} ^www.example\.com [NC]

jdMorgan

3:37 pm on Apr 20, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



should be
RewriteCond %{HTTP_HOST} ^www\.example\.com [NC]

:)

Jim

g1smd

6:22 pm on Apr 22, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



That should work, as far as I can see.