Apache Web Server Forum

    
Issues running CMS in subfolder
Modifying 'root' .htaccess
Patrick Taylor




msg:4533903
 9:54 am on Jan 7, 2013 (gmt 0)

I am running a small flat file CMS I've built myself in PHP. The system is located in a subfolder named /cms/ but the .htaccess file in the root folder makes it run as if the content is in the root.

The .htaccess file contains:


# Rewrite / to /cms/index.php
RewriteCond %{REQUEST_URI} ^/$
RewriteRule ^$ /cms/index.php [L]

# Rewrite string to /cms/string.php
RewriteCond %{REQUEST_URI} !^/index$
RewriteRule ^([a-z0-9_\-]+)$ /cms/$1.php [NC,L]

# Rewrite cms/string.php to string (avoid duplicates)
RewriteCond %{THE_REQUEST} ^GET\ /cms/[^.]+\.php\ HTTP/
RewriteRule ^cms/([a-z0-9_\-]+)\.php$ /$1 [R=301,NC,L]


I also run the CMS in a subfolder named /laplume/, so the system is in /laplume/cms/. In this case the .htaccess file (in the main subfolder) has /laplume/ in the paths, to make it run as if the content is in /laplume/ - not the root.


# Rewrite /laplume/ to /laplume/cms/index.php
RewriteCond %{REQUEST_URI} ^/laplume/$
RewriteRule ^$ /laplume/cms/index.php [L]

# Rewrite /laplume/string to /laplume/cms/string.php
RewriteCond %{REQUEST_URI} !^/laplume/index$
RewriteRule ^([a-z0-9_\-]+)$ /laplume/cms/$1.php [NC,L]


Both options seem to work but I am wondering if I can use just one .htaccess file for both options, eg: by using something like #RewriteBase /laplume/ in the second instance - although that does not actually work.

There is another issue... in the first option (web root) the following prevents duplicate content:


# Rewrite cms/string.php to string (avoid duplicates)
RewriteCond %{THE_REQUEST} ^GET\ /cms/[^.]+\.php\ HTTP/
RewriteRule ^cms/([a-z0-9_\-]+)\.php$ /$1 [R=301,NC,L]


In the second option (subfolder) I have tried the following:


# Rewrite laplume/cms/string.php to laplume/string (avoid duplicates)
RewriteCond %{THE_REQUEST} ^GET\ /laplume/cms/[^.]+\.php\ HTTP/
RewriteRule ^laplume/cms/([a-z0-9_\-]+)\.php$ /laplume/$1 [R=301,NC,L]


But this second option does not work and I don't understand why.

 

lucy24




msg:4534003
 4:03 pm on Jan 7, 2013 (gmt 0)

Let me answer 1/4 of the question first.

When you issue a 301 redirect, give the complete protocol and domain name. There are a couple of different reasons, but the most visible one is that this is how you canonicalize your hostname: with or without www, no trailing port numbers.

Read the fine print and you'll find that a RewriteBase is never necessary, because mod_rewrite uses it only when your rewrite target does not begin with a leading slash-- and your targets always do begin with a slash (or should).

Oh, and never use [NC] in a rule that involves rewriting without redirecting. See most recent 8,000 posts about Duplicate Content.
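
As a rough illustration of the RewriteBase point (a sketch only, assuming the file is /laplume/.htaccess and that page names consist of lower-case letters, digits and hyphens; neither rule comes from the original post): RewriteBase only comes into play when the substitution is relative, as in variant A. Variant B already begins with a slash, so RewriteBase would have no effect on it.

```apache
RewriteEngine on

# Variant A: relative substitution (no leading slash).
# In .htaccess context RewriteBase supplies the URL prefix for the result,
# so /laplume/about would be served from /laplume/cms/about.php.
RewriteBase /laplume/
RewriteRule ^([a-z0-9-]+)$ cms/$1.php [L]

# Variant B: absolute substitution (leading slash). RewriteBase is not used.
# Commented out because only one of the two variants is needed.
# RewriteRule ^([a-z0-9-]+)$ /laplume/cms/$1.php [L]
```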

"this second option does not work"

Uh-oh, the dreaded Does Not Work. You gotta assume the people reading your post are idiots who cannot read your mind, so you have to explain exactly what "does not work" means.

:: sitting back for g1 or someone like him to deal with the hard parts ::

Patrick Taylor




msg:4534075
 7:22 pm on Jan 7, 2013 (gmt 0)

Thanks. Noted.

"Does not work" means "does nothing" in this case.

Patrick Taylor




msg:4534816
 11:12 pm on Jan 9, 2013 (gmt 0)

Update: I've now combined both options as one .htaccess:

# Forbid viewing specific files
RewriteCond %{REQUEST_URI} ^/cms/inc/(menu|list)\.php$ [OR]
RewriteCond %{REQUEST_URI} ^/laplume/cms/inc/(menu|list)\.php$
RewriteRule .* - [F]

# ROOT FOLDER
# For home page = /
# Request = GET / HTTP/1.1
RewriteCond %{REQUEST_URI} ^/$
RewriteRule ^$ /cms/index.php [L]

# For other pages = /string
# Exclude /index
RewriteCond %{REQUEST_URI} !^/index$
# Request = GET /string HTTP/1.1
RewriteCond %{THE_REQUEST} ^GET\ /([A-Za-z0-9_\-]+)\ HTTP/
RewriteRule ^([A-Za-z0-9_\-]+)$ /cms/$1.php [L]

# /laplume/ FOLDER
# For home page = /laplume/
# Request = GET /laplume/ HTTP/1.1
RewriteCond %{REQUEST_URI} ^/laplume/$
RewriteRule ^$ /laplume/cms/index.php [L]

# For other pages = /laplume/string
# Exclude /laplume/index
RewriteCond %{REQUEST_URI} !^/laplume/index$
# Request = GET /laplume/string HTTP/1.1
RewriteCond %{THE_REQUEST} ^GET\ /laplume/([A-Za-z0-9_\-]+)\ HTTP/
RewriteRule ^([A-Za-z0-9_\-]+)$ /laplume/cms/$1.php [L]

Probably a bit clumsy but it works for when the CMS is in root and when it's in the /laplume/ subfolder. The site structure can be either of:

/content files
/cms/scripts etc

and:

/laplume/content files
/laplume/cms/scripts etc

The one outstanding problem is the error404 document that does not show up in the second option:

ErrorDocument 404 /error404.php
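
One possible cause, offered as an assumption rather than a diagnosis: a local ErrorDocument path is a URL-path counted from the document root, not from the folder that holds the .htaccess file. If the second installation keeps its error page at /laplume/error404.php (an assumed location), the directive in /laplume/.htaccess would need the folder written into the path:

```apache
# In /laplume/.htaccess: the path is resolved from the site root,
# so the /laplume/ prefix has to be spelled out.
ErrorDocument 404 /laplume/error404.php
```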

lucy24




msg:4534865
 12:53 am on Jan 10, 2013 (gmt 0)

# Forbid viewing specific files
RewriteCond %{REQUEST_URI} ^/cms/inc/(menu|list)\.php$ [OR]
RewriteCond %{REQUEST_URI} ^/laplume/cms/inc/(menu|list)\.php$
RewriteRule .* - [F]

You're already on the right track with the pipe-separation, so why not take the final step?

RewriteCond %{REQUEST_URI} ^(/laplume)?/cms/inc/(menu|list)\.php$

and then dump the Cond and put the whole thing into the Rule itself (minus the leading slash, which a RewriteRule pattern in htaccess never sees) as

RewriteRule ^(laplume/)?cms/inc/(menu|list)\.php$ - [F]

The same should work for any other rules that come in matched pairs, with and without /laplume/ -- again, in the body of the Rule:

RewriteRule ^(laplume/)?(blahblah) /$1cms/$2.php [L]

Never put something in a Condition if you can put it in the Rule itself. That's a rule ;)
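
A minimal sketch of how the paired rules might collapse under that approach, assuming a single .htaccess in the site root, no separate .htaccess inside /laplume/, and lower-case page names (all assumptions, and untested):

```apache
RewriteEngine on

# Block the include files, with or without the laplume/ prefix.
RewriteRule ^(laplume/)?cms/inc/(menu|list)\.php$ - [F]

# Home pages: "/" and "/laplume/". $1 is either empty or "laplume/".
RewriteRule ^(laplume/)?$ /$1cms/index.php [L]

# Other pages: "/string" and "/laplume/string".
RewriteRule ^(laplume/)?([a-z0-9-]+)$ /$1cms/$2.php [L]
```

A request for a bare folder name such as /laplume (no trailing slash) also matches the last pattern as if it were a page name, so that case is worth testing separately.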

In this one:
# Exclude /index
RewriteCond %{REQUEST_URI} !^/index$
# Request = GET /string HTTP/1.1
RewriteCond %{THE_REQUEST} ^GET\ /([A-Za-z0-9_\-]+)\ HTTP/
RewriteRule ^([A-Za-z0-9_\-]+)$ /cms/$1.php [L]


What's with the /index$ in that form? Surely you don't really have an extensionless page called "index", since that's only half a step less bad than "index.php" by name. If you meant "index.php", it is safer to test %{THE_REQUEST} to filter out internal requests.

Gotta say I don't understand why this works in the root htaccess:
RewriteCond %{REQUEST_URI} ^/laplume/$
RewriteRule ^$ /laplume/cms/index.php [L]


But then, I'm not totally clear whether you are really running two things concurrently-- one starting in /laplume/ and one not --or if you're just setting up htaccess so it will potentially work either way.

Does the /laplume/ version use the same set of ErrorDocuments as the plain site, or does it have its own versions? Sometimes it's easier to cop out and put some lines in a subdirectory's htaccess. (Autoindexing is probably the most common example.) After all, the server will be looking for an htaccess in every directory, whether it's there or not, so you haven't really lost any time.

Patrick Taylor




msg:4534992
 12:53 pm on Jan 10, 2013 (gmt 0)

I've simplified it down so that I can work out what is going on (or not going on):


AddDefaultCharset UTF-8

Options +FollowSymLinks
RewriteEngine on

ErrorDocument 404 /error404.php

RewriteCond %{REQUEST_URI} ^/$
RewriteRule ^$ /cms/index.php [L]

RewriteCond %{REQUEST_URI} !^/index$
RewriteCond %{THE_REQUEST} ^GET\ /([A-Za-z0-9_\-]+)\ HTTP/
RewriteRule ^([A-Za-z0-9_\-]+)$ /cms/$1.php [L]

RewriteCond %{REQUEST_URI} ^/laplume/$
RewriteRule ^$ /laplume/cms/index.php [L]

RewriteCond %{REQUEST_URI} !^/laplume/index$
RewriteCond %{THE_REQUEST} ^GET\ /laplume/([A-Za-z0-9_\-]+)\ HTTP/
RewriteRule ^([A-Za-z0-9_\-]+)$ /laplume/cms/$1.php [L]


Yes, running two things concurrently - one starting in /laplume/ and one not. Everything works for me in either (both) root and subfolder, but I do know that someone else receives a 403 error "Your PHP settings have been disabled by an H-Sphere administrator. etc etc" and it may be my .htaccess file. Except for checking everything I've done there is no way of knowing if this error is my fault or their server setup.

g1smd




msg:4535004
 1:45 pm on Jan 10, 2013 (gmt 0)

You can have one htaccess file to deal with all URL requests. In fact, this is the best way to do it.

You need to make sure there are no Redirect or RedirectMatch directives.
Use RewriteRule for ALL of your rules.

In the single htaccess file you need to put the rules in a very specific order:
- rules that block access for malicious requests
- rules that redirect folder requests
- rules that redirect other requests
- rules that rewrite folder requests
- rules that rewrite other requests.

The order is:
- redirects first, rewrites last
- most specific first, most general last.

Patrick Taylor




msg:4535038
 4:00 pm on Jan 10, 2013 (gmt 0)

Thanks.

Regarding Lucy24's comment on the extensionless page called "index" - this exists because all the content URLs are extensionless, including the home page, and I don't want the home page to be /index as such (so this is rewritten before the others). They are all physically located in the subfolder /cms/ so what the .htaccess file does is rewrite the folderless URLs as if they exist in the main folder (or root as the case may be). There's nothing actually in the main folder except .htaccess and the error404 document.

g1smd




msg:4535046
 4:34 pm on Jan 10, 2013 (gmt 0)

deleted
will post when error fixed.

g1smd




msg:4535084
 6:25 pm on Jan 10, 2013 (gmt 0)

The URL for the index pages should be example.com/ and example.com/laplume/ - that is what you should be linking to in the navigation links of your site as href="/" and href="/laplume/".

Mod_rewrite should detect those requests as ^$ and ^laplume/$ and internally rewrite as appropriate.

Each of your individual rewrite rules is over-complicated and the ruleset itself doesn't enforce canonicalisation.

Rules 31-32 and 34-35 are your rewrites coded more efficiently.

Rules 32b+35b can replace 32a+35a, which then allows both of your sites to have sub-folders.

Rules 33+36 deal with requests for images, stylesheets and scripts.


Rules 21 to 25 are the extra redirects you will need to prevent direct access to scripts and enforce canonicalisation.


Options +FollowSymLinks
RewriteEngine on

ErrorDocument 404 /error404.php


# 1. Rules that block malicious requests go here.

# 11. Block access
RewriteRule ^(laplume/)?cms/inc/(menu|list)\.php$ - [F]


# 2. External Redirects

# 21a. Redirect for index or index.php URL request in /laplume/cms/ folder
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /laplume/cms/([^/]+/)*index(\.php)?\ HTTP/
RewriteRule ^laplume/cms/(([^/]+/)*)index(\.php)?$ http://www.example.com/laplume/$1? [R=301,L]

# 21b. Redirect for folder URL request in /laplume/cms/ folder
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /laplume/cms/([^/]+/)*\ HTTP/
RewriteRule ^laplume/cms/(([^/]+/)*)$ http://www.example.com/laplume/$1? [R=301,L]

# 21c. Redirect for page or .php file URL request in /laplume/cms/ folder
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /laplume/cms/([^/]+/)*[^/.]+(\.php)?\ HTTP/
RewriteRule ^laplume/cms/(([^/]+/)*[^/.]+)(\.php)?$ http://www.example.com/laplume/$1? [R=301,L]

# 22a. Redirect for index or index.php URL request in /laplume/ folder
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /laplume/([^/]+/)*index(\.php)?\ HTTP/
RewriteRule ^laplume/(([^/]+/)*)index(\.php)?$ http://www.example.com/laplume/$1? [R=301,L]

# 22b. Redirect for .php file URL request in /laplume/ folder
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /laplume/([^/]+/)*[^/.]+\.php\ HTTP/
RewriteRule ^laplume/(([^/]+/)*[^/.]+)\.php$ http://www.example.com/laplume/$1? [R=301,L]

# 23a. Redirect for index or index.php URL request in /cms/ folder
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /cms/([^/]+/)*index(\.php)?\ HTTP/
RewriteRule ^cms/(([^/]+/)*)index(\.php)?$ http://www.example.com/$1? [R=301,L]

# 23b. Redirect for folder URL request in /cms/ folder
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /cms/([^/]+/)*\ HTTP/
RewriteRule ^cms/(([^/]+/)*)$ http://www.example.com/$1? [R=301,L]

# 23c. Redirect for page or .php file URL request in /cms/ folder
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /cms/([^/]+/)*[^/.]+(\.php)?\ HTTP/
RewriteRule ^cms/(([^/]+/)*[^/.]+)(\.php)?$ http://www.example.com/$1? [R=301,L]

# 24a. Redirect for index or index.php URL request in root folder
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index(\.php)?\ HTTP/
RewriteRule ^(([^/]+/)*)index(\.php)?$ http://www.example.com/$1? [R=301,L]

# 24b. Redirect for .php file URL request in root folder
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*[^/.]+\.php\ HTTP/
RewriteRule ^(([^/]+/)*[^/.]+)\.php$ http://www.example.com/$1? [R=301,L]

# 25. non-www/www canonicalisation redirect
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]


# 3. Internal Rewrites

# 31. Rewrite example.com/laplume/ URL request
RewriteRule ^laplume/$ /laplume/cms/index.php [L]

# 32a. Rewrite example.com/laplume/pages URL requests
RewriteRule ^laplume/([a-z0-9-]+)$ /laplume/cms/$1.php [L]

# 32b. Rewrite example.com/laplume/FOLDER/pages URL requests
# RewriteRule ^laplume/(([^/]+/)*[a-z0-9-]+)$ /laplume/cms/$1.php [L]

# 33. Rewrite example.com/laplume/FOLDER/ file requests
RewriteRule ^laplume/(([^/]+/)*[a-z0-9-]+\.(css|jpg|js|png))$ /laplume/cms/$1 [L]

# 34. Rewrite example.com/ URL request (root)
RewriteRule ^$ /cms/index.php [L]

# 35a. Rewrite example.com/pages URL requests
RewriteRule ^([a-z0-9-]+)$ /cms/$1.php [L]

# 35b. Rewrite example.com/FOLDER/pages URL requests
# RewriteRule ^(([^/]+/)*[a-z0-9-]+)$ /cms/$1.php [L]

# 36. Rewrite example.com/FOLDER/ file requests
RewriteRule ^(([^/]+/)*[a-z0-9-]+\.(css|jpg|js|png))$ /cms/$1 [L]



The above code is untested and might have a typo somewhere.

You should restrict your URLs to lower case and digits and hyphens; i.e. not allow underscore or upper case (I already changed the patterns).

Patrick Taylor




msg:4535110
 7:45 pm on Jan 10, 2013 (gmt 0)

Many thanks. I'm going to try it (and understand it). I can see the reason for the external redirects but do the internal rewrites work independently of the redirects? Preferably the file would be used by third parties and I'd like to avoid them having to edit the file for the domain. Do the external redirects really need the domain writing in?

g1smd




msg:4535113
 7:49 pm on Jan 10, 2013 (gmt 0)

You can run the site only on rewrites, without any redirects, but you will soon end up with duplicate content issues as the "alternative ways" of accessing the content are discovered and indexed. The redirects keep search engines on the right path.

Yes, the domain should always be in the redirects.

It is possible to have more generalised code but it is a lot less efficient.

As for understanding the code, RewriteRules are a simple idea. The difficult bit is in reading the RegEx patterns.

The RegEx pattern (on the left) matches the URL that the browser has requested and the rule target (on the right) is either a new URL (when coded as a redirect) or an internal path and file (when coded as a rewrite).

The rules in the htaccess file rewrite valid URL requests (as defined by the RegEx pattern) to target the right internal file that will produce the content. When there's a URL request that is trying to directly and improperly access a resource, a redirect tells the browser to request a different URL.

The rules that redirect test THE_REQUEST in a preceding RewriteCond. This ensures that only the unwanted or non-canonical external requests for that resource are redirected.

You do not want to redirect requests that are the result of a previous internal rewrite, otherwise you would either expose the previously rewritten request back out on to the web as a new URL or you would end up with a redirect-rewrite infinite loop.

Browser requests: example.com/cms/pagename.php <- malicious request
External redirect received: "301 www.example.com/pagename"

Browser requests: www.example.com/pagename <- site should link to this one
Internal rewrite to: /cms/pagename.php
Browser receives the page content.

Notice the initial non-canonical external request and the internal rewrite are both for "/cms/pagename.php". It is only the fact that the rule that redirects also checks THE_REQUEST that prevents an infinite loop.
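
The pairing described above might be sketched like this for a root .htaccess, with www.example.com standing in for the real hostname and [a-z0-9-]+ as an assumed page-name pattern (untested):

```apache
RewriteEngine on

# External redirect: fires only when the request line sent by the client
# was for /cms/<page>.php. THE_REQUEST keeps the original request line,
# so it is unaffected by the internal rewrite below.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /cms/[a-z0-9-]+\.php\ HTTP/
RewriteRule ^cms/([a-z0-9-]+)\.php$ http://www.example.com/$1 [R=301,L]

# Internal rewrite: /pagename is served by /cms/pagename.php.
# When the rules run again on the rewritten path, the RewriteCond above
# fails, so nothing is redirected and no loop occurs.
RewriteRule ^([a-z0-9-]+)$ /cms/$1.php [L]
```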

Patrick Taylor




msg:4535141
 10:55 pm on Jan 10, 2013 (gmt 0)

This gives me a 404 for some reason:

RewriteRule ^laplume/$ /laplume/cms/index.php [L]

Anyway, as I mentioned, I really need an .htaccess file that does not require editing by other users, so I may have to put up with the duplicate content issue if the alternative is them having to write their domain into the file, several times especially - and risk breaking it.

What is the reason why redirects need full URLs?

Many thanks BTW. Really helpful.

g1smd




msg:4535142
 11:13 pm on Jan 10, 2013 (gmt 0)

The redirects need the hostname stated as without it you will end up with a multiple step redirection chain for some requests.

This is where one rule fixes a problem, then another rule fixes a different problem, and then the non-www/www rule fixes the hostname. So having requested a URL, there are then several redirects, each generating a new request, before the content is finally delivered at the last requested URL. This is a problem.




RewriteRule ^laplume/$ /laplume/cms/index.php [L]

If you requested example.com/laplume/ AND there's a file inside the server at /laplume/cms/index.php and the script is able to detect that /laplume/ was requested and it has some content to deliver for that request, then it should work.

Patrick Taylor




msg:4535144
 11:42 pm on Jan 10, 2013 (gmt 0)

RewriteRule ^laplume/$ /laplume/cms/index.php [L]

It seemed to me it should work, as there is the file there and the request was for /laplume/

I see your point about multiple URLs but a request for any of the URLs in links on the site gives HTTP/1.1 200 OK. As you say, there is a duplicate of example.com/page at example.com/cms/page.php. If /cms/page.php is redirected to /page I don't see why there is more of a chain than with the full domain in the redirect, i.e. the HTTP headers should be the same, shouldn't they? One 301 and one 200 either way. Maybe I've missed the point.

g1smd




msg:4535152
 12:06 am on Jan 11, 2013 (gmt 0)

Lame rule creation could give three redirects...
example.com/cms/page.php -> example.com/page.php -> example.com/page -> www.example.com/page
before content is delivered at the right URL.

If you examine the set of rules above (21 to 25), it doesn't matter what you request:
example.com/cms/index.php
example.com/cms/index
example.com/cms/
example.com/index.php
example.com/index
example.com/
www.example.com/cms/index.php
www.example.com/cms/index
www.example.com/cms/
www.example.com/index.php
www.example.com/index
there will be a single step redirect to the canonical URL -> www.example.com/

Likewise, for any of these...
example.com/cms/page.php
example.com/cms/page
example.com/page.php
example.com/page
www.example.com/cms/page.php
www.example.com/cms/page
www.example.com/page.php
just one redirect step to get to the canonical URL -> www.example.com/page

lucy24




msg:4535164
 1:12 am on Jan 11, 2013 (gmt 0)

Whoops! Forgot to refresh page before typing reply, so now you've got the whole thing in two different people's words :)

"What is the reason why redirects need full URLs?"

If you don't give the full protocol and URL, mod_rewrite will use whatever it started out with:

RewriteRule ^something$ otherthing [R=301,L]

means:

user requests http://www.example.com/something
user is sent to http://www.example.com/otherthing

user requests http://example.com/something
user is sent to http://example.com/otherthing

user requests http://www.example.com:port123/something
user is sent to http://www.example.com:port123/otherthing

user requests https://www.example.com/something
user is sent to https://www.example.com/otherthing

et cetera. And then if you want to canonicalize you have to do a whole nother redirect when it could have been done in a single step.
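
Where the hostname cannot be written into the file, a single-step redirect is still possible in principle by building the target from %{HTTP_HOST}. The following is a hedged sketch only: it forces www on whatever hostname was requested, so it assumes the site is meant to live on the www form, is served over plain http, and is not reached by IP address or on a non-standard port. The %1 back-reference is taken from the last matched RewriteCond.

```apache
# Path fix and hostname fix in one step, without naming the domain.
# Only fires when the client itself asked for /cms/<page>.php.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /cms/[^/.]+\.php\ HTTP/
# Capture the hostname, minus any leading "www.", into %1.
RewriteCond %{HTTP_HOST} ^(?:www\.)?(.+)$ [NC]
RewriteRule ^cms/([^/.]+)\.php$ http://www.%1/$1 [R=301,L]

# Any remaining request on the bare hostname is sent to the www form.
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteCond %{HTTP_HOST} ^(.+)$
RewriteRule (.*) http://www.%1/$1 [R=301,L]
```

The extra conditions are evaluated on every matching request, which is the efficiency cost of not hard-coding the hostname.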

g1smd




msg:4535173
 2:10 am on Jan 11, 2013 (gmt 0)

There's two ways of explaining it.

Thanks for covering the other one.

Patrick Taylor




msg:4535220
 8:04 am on Jan 11, 2013 (gmt 0)

g1smd and Lucy24, many thanks. I'll work on it.

g1smd




msg:4535223
 8:48 am on Jan 11, 2013 (gmt 0)

How are your PHP skills?

You know that you could write a PHP script that says "enter your hostname..." and then outputs the new htaccess file contents to screen, leaving the user to cut and paste it, save it and upload it.

You could also have the script make the file directly, but there are a lot of security implications you'd also have to take into consideration.

Patrick Taylor




msg:4535229
 8:57 am on Jan 11, 2013 (gmt 0)

PHP. Indeed. I was thinking of that as a way to write the file from an 'install' page (at present there is no install page - the user manually edits a 'settings' file before installing). I think this is probably the solution for the redirects. The hostname can be detected without the user needing to enter it, although I agree it could be entered manually.

g1smd




msg:4535231
 9:03 am on Jan 11, 2013 (gmt 0)

The main issue is whether they are using example.com or www.example.com although you could just force www. on all users. They need to be asked, or at least confirm, before installing.

I don't know why I put rules 32a and 35a in there. I would delete them and uncomment rules 32b and 35b.

Patrick Taylor




msg:4535238
 9:35 am on Jan 11, 2013 (gmt 0)

It is an issue but in reality most users will begin by installing the CMS in a subfolder for trying it out, so the canonical is whatever they have already set up (or not bothered to set up). The best solution would be an installation page in PHP where faulty entries can be detected and corrected. Having said that, even WordPress requires a manual edit of the config file.

Patrick Taylor




msg:4539953
 6:36 pm on Jan 28, 2013 (gmt 0)

It's me again and I am still struggling with this after some time.

RewriteRule ^laplume/$ /laplume/cms/index.php [L]

g1smd: If you requested example.com/laplume/ AND there's a file inside the server at /laplume/cms/index.php and the script is able to detect that /laplume/ was requested and it has some content to deliver for that request, then it should work.


I'm afraid it just doesn't (although I cannot see why). A request for /laplume/ results in the following error: "You don't have permission to access /laplume/ on this server. Additionally, a 404 Not Found error ..." but when I go to /laplume/cms/index.php I see the page.

The other thing that baffles me...

# Redirect GET /cms/string.php HTTP/1.1 to /string (the link)
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /cms/([A-Za-z0-9_-]+)\.php\ HTTP/
RewriteRule ^cms/([A-Za-z0-9_-]+)\.php$ /$1 [R=301,L]

RewriteCond %{REQUEST_URI} ^/$
RewriteRule ^$ /cms/index.php [L]

RewriteCond %{REQUEST_URI} !^/index$
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([A-Za-z0-9_-]+)\ HTTP/
RewriteRule ^([A-Za-z0-9_-]+)$ /cms/$1.php [L]


The first rule redirects /cms/anypage.php to /anypage, which is then handled by the second or third rule (depending on whether the page is "index" or not). It prevents /cms/anypage.php being viewable (duplicate) and the user sees /anypage in the address bar and the content. This works in the root. However, for the subfolder /laplume/ my .htaccess has this:

# Redirect GET /laplume/cms/string.php HTTP/1.1 to /laplume/string (the link)
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /laplume/cms/([A-Za-z0-9_-]+)\.php\ HTTP/
RewriteRule ^laplume/cms/([A-Za-z0-9_-]+)\.php$ /laplume/$1 [R=301,L]

RewriteCond %{REQUEST_URI} ^/laplume/$
RewriteRule ^$ /laplume/cms/index.php [L]

RewriteCond %{REQUEST_URI} !^/laplume/index$
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /laplume/([A-Za-z0-9_-]+)\ HTTP/
RewriteRule ^([A-Za-z0-9_-]+)$ /laplume/cms/$1.php [L]


Nothing happens. It seems the same principle as the root example should apply, the only difference being the addition of /laplume/. It seems not. The whole thing is becoming a confusing mess.
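
A possible explanation, assuming this file sits inside /laplume/ as described earlier: in a .htaccess file the RewriteRule pattern is matched against the path with that directory's own prefix removed, so inside /laplume/.htaccess a pattern beginning ^laplume/ can never match. The THE_REQUEST and REQUEST_URI conditions still see the full path, which is why the two rewrites work while the redirect does nothing. Under that assumption the redirect might look like this (untested):

```apache
# In /laplume/.htaccess: the pattern is relative to /laplume/,
# while the condition and the substitution use the full URL-path.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /laplume/cms/([A-Za-z0-9_-]+)\.php\ HTTP/
RewriteRule ^cms/([A-Za-z0-9_-]+)\.php$ /laplume/$1 [R=301,L]
```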

Patrick Taylor




msg:4540030
 9:10 pm on Jan 28, 2013 (gmt 0)

To recap:

The website address is either of:
example.com/
example.com/laplume/

The actual content is at either of:
example.com/cms/
example.com/laplume/cms/

The .htaccess exists at either of:
example.com/
example.com/laplume/

The actual content files:
(1) example.com/cms/index.php
(2) example.com/cms/pages.php
(3) example.com/laplume/cms/index.php
(4) example.com/laplume/cms/pages.php

The corresponding requests:
(1) example.com/
(2) example.com/pages
(3) example.com/laplume/
(4) example.com/laplume/pages

The redirects:
example.com/index -> example.com/ (not to be treated as normal 'pages')
example.com/cms/ -> example.com/
example.com/cms/pages.php -> example.com/pages
example.com/laplume/index -> example.com/laplume/ (not to be treated as normal 'pages')
example.com/laplume/cms/ -> example.com/laplume/
example.com/laplume/cms/pages.php -> example.com/laplume/pages

The rewrites:
example.com/ -> example.com/cms/index.php
example.com/pages -> example.com/cms/pages.php
example.com/laplume/ -> example.com/laplume/cms/index.php
example.com/laplume/pages -> example.com/laplume/cms/pages.php

The admin folder must stay unaffected and is either of:
example.com/cms/admin/
example.com/laplume/cms/admin/

-----------------------------------------------------------

The current .htaccess file is:

AddDefaultCharset UTF-8

Options +FollowSymLinks
RewriteEngine on

ErrorDocument 404 /error404.php

RewriteCond %{REQUEST_URI} ^/laplume/cms/inc/menu\.php$ [OR]
RewriteCond %{REQUEST_URI} ^/laplume/cms/admin/list\.php$ [OR]
RewriteCond %{REQUEST_URI} ^/laplume/cms/([A-Za-z0-9_-]+)\.txt$ [OR]
RewriteCond %{REQUEST_URI} ^/laplume/cms/comments/([A-Za-z0-9_-]+)\.txt$ [OR]
RewriteCond %{REQUEST_URI} ^/laplume/img/$ [OR]
RewriteCond %{REQUEST_URI} ^/cms/inc/menu\.php$ [OR]
RewriteCond %{REQUEST_URI} ^/cms/admin/list\.php$ [OR]
RewriteCond %{REQUEST_URI} ^/cms/([A-Za-z0-9_-]+)\.txt$ [OR]
RewriteCond %{REQUEST_URI} ^/cms/comments/([A-Za-z0-9_-]+)\.txt$ [OR]
RewriteCond %{REQUEST_URI} ^/img/$
RewriteRule .* - [F]

RewriteCond %{REQUEST_URI} ^/laplume/$
RewriteRule ^$ /laplume/cms/index.php [L]

RewriteCond %{REQUEST_URI} !^/laplume/index$
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /laplume/([A-Za-z0-9_-]+)\ HTTP/
RewriteRule ^([A-Za-z0-9_-]+)$ /laplume/cms/$1.php [L]

RewriteCond %{REQUEST_URI} ^/$
RewriteRule ^$ /cms/index.php [L]

RewriteCond %{REQUEST_URI} !^/index$
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([A-Za-z0-9_-]+)\ HTTP/
RewriteRule ^([A-Za-z0-9_-]+)$ /cms/$1.php [L]


The websites at example.com/ and example.com/laplume/ function correctly but there is the issue of duplicate content.

These display identical content:
example.com/ [canonical]
example.com/cms/
example.com/cms/index.php

These display identical content:
example.com/laplume/ [canonical]
example.com/laplume/cms/
example.com/laplume/cms/index.php

These display identical content:
example.com/pages [canonical]
example.com/cms/pages.php

These display identical content:
example.com/laplume/pages [canonical]
example.com/laplume/cms/pages.php

I'm not actually sure whether some of the rules that are redirects should be rewrites (and vice versa).

lucy24




msg:4540095
 5:27 am on Jan 29, 2013 (gmt 0)

# Redirect GET /cms/string.php HTTP/1.1 to /string (the link)
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /cms/([A-Za-z0-9_-]+)\.php\ HTTP/
RewriteRule ^cms/([A-Za-z0-9_-]+)\.php$ /$1 [R=301,L]

RewriteCond %{REQUEST_URI} ^/$
RewriteRule ^$ /cms/index.php [L]

RewriteCond %{REQUEST_URI} !^/index$
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([A-Za-z0-9_-]+)\ HTTP/
RewriteRule ^([A-Za-z0-9_-]+)$ /cms/$1.php [L]


I'm getting lost again. What happens to (1) requests for index.php by name (2) correctly worded requests for / (any directory) (3) requests for /laplume/ without the /cms/ component?

I think maybe you need to step back again. Forget RewriteRules and simply list all the forms of URL that a human could possibly ask for. For each form, what do you want their address bar to say?

Take the time to think of all possible permutations:
with and without /laplume/
with and without /cms/
with and without extension, both for index pages and others
with and without "index.xtn"
et cetera.

And then-- entirely separate-- within the smaller group of Authorized Requests that are left when all the redirects are done, where does the content really live?

Patrick Taylor




msg:4540116
 8:04 am on Jan 29, 2013 (gmt 0)

"I think maybe you need to step back again. Forget RewriteRules and simply list all the forms of URL that a human could possibly ask for. For each form, what do you want their address bar to say?"


I agree with that. That is why I 'recapped' above:

The actual content files (1)-(4) (where the content lives).
The corresponding requests (1)-(4) (address bar).
The redirects (to prevent duplicates).
The rewrites (address bar to content).

I think I have listed everything.

Incidentally, I know that the redirects should really give the full URL on the right side but I cannot include http://example.com because the domains are unknown.

Patrick Taylor




msg:4540124
 8:28 am on Jan 29, 2013 (gmt 0)

As an example of why I'm having problems.

The case where the website and .htaccess are located at example.com/laplume/

This is supposed to work:

# GET /laplume/ HTTP/1.1
RewriteRule ^laplume/$ /laplume/cms/index.php [L]


But it doesn't. Why not?

This works:

# GET /laplume/ HTTP/1.1
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /laplume/\ HTTP/
RewriteRule ^$ /laplume/cms/index.php [L]
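
The difference most likely comes down to where the file lives. In per-directory (.htaccess) context, mod_rewrite strips the directory's own path before testing the RewriteRule pattern, so the same request is seen differently by each file. The two fragments below belong to two different files and are shown together only for comparison (a sketch, not a tested configuration):

```apache
# File: /.htaccess (document root).
# A request for /laplume/ is matched here as "laplume/".
RewriteRule ^laplume/$ /laplume/cms/index.php [L]

# File: /laplume/.htaccess.
# The "laplume/" prefix is stripped, so the same request is matched as "".
# No THE_REQUEST condition is needed to pick out the folder's home page.
RewriteRule ^$ /laplume/cms/index.php [L]
```

With ^laplume/$ placed in /laplume/.htaccess no rule matches, the folder request falls through to the directory itself, and the reported 403 is what one would expect if the folder has no index file and directory listings are disabled.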

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved