rewrite rule for all?

rewriterule clean pretty url

         

wright67uk

6:40 pm on Aug 20, 2011 (gmt 0)

10+ Year Member


I have about 100 pages, all in the same format as the three URLs in my .htaccess file below, e.g. COURIER\.php?subtype=COURIER. Instead of making 100 RewriteRules, is it possible to make one RewriteRule that all of my URLs can use?

<Files ~ "^\.(htaccess|htpasswd)$">
allow from all
</Files>

AddHandler x-httpd-php5 .php
AddHandler x-httpd-php .php4

Options +Indexes +FollowSymlinks -MultiViews

RewriteEngine on
RewriteBase /

RewriteCond %{HTTP_HOST} ^(mysite\.co\.uk)(:80)? [NC]
RewriteRule ^(.*) http://www.mysite.co.uk/$1 [R=301]

RewriteRule ^courier\.*$ COURIER\.php? subtype=COURIER [NC,L,N]
RewriteRule ^valeting\.*$ VALETING\.php?subtype=VALETING [NC,L,N]
RewriteRule ^sales\.*$ SALES\.php? subtype=SALES [NC,L,N]

order deny,allow

lucy24

10:04 pm on Aug 20, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The concept is right. The execution is, well, uhm...

Including [NC] in your RewriteRules is just begging for Duplicate Content problems. Save it for Redirects. Are you absolutely, definitely, certainly, 100% positive that you want to use UPPER-CASE anything?

Two of the three Rules have an extraneous space that will make mincemeat of the whole rewrite. Literal spaces anywhere in an .htaccess Regular Expression have to be escaped, because the space itself has meaning.

RewriteRule ^valeting\.*$ VALETING\.php?subtype=VALETING [NC,L,N]


[L] and [N] are mutually exclusive. It will probably help if you explain in English what you want the three grouped Rules to do. Flags only apply when the rule has actually executed; you don't need a flag to say "If this rule doesn't apply, continue looking for other rules".

\.*$ is almost certainly not what you mean. It says "there may or may not be some literal periods-- but no other characters-- between 'valeting' and the end of the request". You may have meant .* (an unescaped dot meaning "any character"), but since you are not capturing it, you do not need anything after the word "valeting".

As written, the rule says "take any request beginning with 'valeting' [assuming for the sake of discussion that you did mean .* rather than \.*] and rewrite it as 'VALETING.php', replacing the existing query string with 'subtype=VALETING'".

Since each of the three rules has a beginning anchor, they are mutually exclusive. Without the beginning anchor, you would have a horrendous mess, probably ending only when your server puts its infinite-loop foot down.
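
As a rough sketch of how those points combine, the valeting rule might be cleaned up as below. It assumes the file on disk really is VALETING.php and that only the exact request /valeting should be rewritten; neither is confirmed at this point in the thread.

```apache
# No [NC], no [N], no stray space in the substitution, and nothing
# after "valeting" in the pattern. The dot in the substitution needs
# no backslash because the substitution is not a regular expression.
RewriteRule ^valeting$ VALETING.php?subtype=VALETING [L]
```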

wright67uk

7:00 am on Aug 21, 2011 (gmt 0)




Thank you for the reply, and apologies for the bad post.
I did not mean to include spaces in my code, and I have modified some of my file per your advice.

Some of the files I'm hosting are in caps, following a migration from Windows to Linux. I'm slowly changing them to lowercase.

In English...

I'm trying to change my rewrite rules from:

RewriteRule ^courier.*$ COURIER\.php?subtype=courier [L]

to
something that will work with any of my URLs, something like:
RewriteRule ^(.*).*$ mysite.com/$1.php?subtype=$1 [L]

I'd like to capture the page name typed into the browser and then use it in my rewrite rule, where in the example above I've written $1.

This is so that one rule will work for all of my pages.

I'm not really very good at explaining, so I hope this makes a little more sense.

Thank you for your help so far, I do really appreciate it.

Here is my .htaccess;


<Files ~ "^\.(htaccess|htpasswd)$">
allow from all
</Files>

AddHandler x-httpd-php5 .php
AddHandler x-httpd-php .php4

Options +Indexes +FollowSymlinks -MultiViews

RewriteEngine on
RewriteBase /

RewriteCond %{HTTP_HOST} ^(mysite\.com)(:80)? [NC]
RewriteRule ^(.*) [mysite.com...] [R=301]

RewriteRule ^courier.*$ COURIER\.php?subtype=courier [L]
RewriteRule ^valeting.*$ VALETING\.php?subtype=valeting [L]
RewriteRule ^sales.*$ SALES\.php?subtype=sales [L]

order deny,allow

lucy24

7:35 am on Aug 21, 2011 (gmt 0)




I'm trying to change my rewrite rules from:

RewriteRule ^courier.*$ COURIER\.php?subtype=courier [L]

to
something that will work with any of my URLs, something like:
RewriteRule ^(.*).*$ mysite.com/$1.php?subtype=$1 [L]

Ouch, ouch. You cannot say (.*).*, no way, no how. Since the asterisk means "none or more" and is greedy, the captured part (.*) will grab everything, leaving nothing for the leftover .* to match. And since that part isn't being captured, it doesn't need to be there anyway.

Are COURIER, VALETING etc names of pages? Are they all at the top level of the domain? If they are in directories, the rule gets trickier. In its simplest form, the pattern would be

RewriteRule ^([^.]+)\.html? $1\.php?subtype=$1 [L]

(or \.php or whatever extension you use)

but note that Regular Expressions can't change capitalization. Whatever is captured is what you've got. If they're all going to a php script, you can deal with the capitalization there.

If the pages are potentially inside of directories, you can't use the ^ anchor. But for a rewrite you don't need it. For a redirect you do, so you can capture any subdirectories and include them in the full url.

Note also that this rule will replace any existing query string. If there is other stuff that you need to keep, include the flag [QSA] for Query String Append.
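
A minimal sketch of the difference, using the pattern above with an assumed end anchor and a made-up ref=ad parameter. The two rules are alternatives, so one of them is commented out:

```apache
# Alternative 1, without QSA: the original query string is discarded.
#   /valeting.html?ref=ad  ->  valeting.php?subtype=valeting
RewriteRule ^([^.]+)\.html?$ $1.php?subtype=$1 [L]

# Alternative 2, with QSA: the original query string is appended.
#   /valeting.html?ref=ad  ->  valeting.php?subtype=valeting&ref=ad
#RewriteRule ^([^.]+)\.html?$ $1.php?subtype=$1 [QSA,L]
```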

The harder part is setting conditions to make sure you're not rewriting requests that are already in the form you want. If you start out with no query string it is very easy because you only have to say

RewriteCond %{QUERY_STRING} !.

meaning "nothing of any kind"

or possibly

RewriteCond %{QUERY_STRING} =""

(Apache says so, but I've never personally tried it).
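
Putting the condition and the rule together might look like the sketch below. It assumes all pages sit at the top level and are requested without an extension, which is a variation on the pattern above rather than something confirmed in this thread.

```apache
RewriteEngine On
RewriteBase /

# Only act on requests that arrive without a query string.
RewriteCond %{QUERY_STRING} !.
# Capture a name with no dots or slashes and rewrite it internally.
#   /valeting  ->  valeting.php?subtype=valeting
RewriteRule ^([^./]+)$ $1.php?subtype=$1 [L]
```

Because the rewritten path contains a dot and now carries a query string, the rule should not match its own output on the next pass through .htaccess.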

wright67uk

8:13 am on Aug 21, 2011 (gmt 0)




Thank you, that explains loads.

I went for:

RewriteRule ^([^.]+)\.*$ $1\.php?subtype=$1 [L]

which, after testing, works wonders.
Now I'm going to change the case of the names in the PHP and MySQL.

My only concern would be if for example I have a page

mysite.com/valeting
and someone types in
mysite.com/Valeting
that my site would break?

g1smd

8:26 am on Aug 21, 2011 (gmt 0)




The pattern ending in \.* matches requests for:

- example.com/something
- example.com/something.
- example.com/something..
- example.com/something...
- example.com/something....
- example.com/something..............................


i.e. ending in a literal period (literally, actual dots in the actual URL request) zero or more times, as explained above.

The request to explain what you want to do "in plain English" meant just that: "when user requests 'this' URL, the server should get content from 'this' place inside the server, and when user requests 'that' URL the server should do...".

I still don't know what the code needs to do. Mod_rewrite should be 90% defining what it needs to do "in plain English", and 10% coding.

The approach most people take, as here, is: 1% thinking and defining, 40% coding by guessing, 59% trying to get the code to "work", where "work" has been poorly defined.

In every such case I have seen here in the last 9 years, although the end result code "appears" to work, it actually has serious built-in, actually "designed in by accident", flaws that have serious negative indexing and ranking side-effects.

So, back to the beginning. What is the code supposed to "do", in "Plain English" and "No Code"?

lucy24

8:45 am on Aug 21, 2011 (gmt 0)




my site would break?


Well, "break" would be putting it strongly, but they'd be going to the nonexistent page "Valeting.php". The second part, the query string, is not a problem, because you can send everything to a php subroutine to regularize the capitalization, but a page is a page. It's either there or it isn't.

If you don't want to settle for a custom 404 page pointing human visitors to some likely destinations, you might be able to do something like

RewriteCond %{REQUEST_FILENAME} !-f

and then reroute all requests that fail the file check to another php page where capitalization will be fixed before sending them along to the right place. Note that adding [NC] to a -f test does nothing, since that flag only affects string comparisons, not filesystem checks. But don't take my unsupported word for it.

wright67uk

9:44 am on Aug 21, 2011 (gmt 0)




g1smd;

I was trying to say that if anybody should type into their browser;

www.mysite.com/valeting or
www.mysite.com/VaLeTing or
www.mysite.com/VaLeting.php or
mysite.com/VALeting.html

that they would be taken to the following location;
[mysite.com...]

The case the user types into his or her browser should be ignored, and the file extension should be ignored whether or not the user chooses to use one.

However, as I have over 100 pages in the same format as the one above
(www.mysite.com/PAGENAME.php?subtype=PAGENAME)
I would like to have a universal rule that would affect every page.

lucy24;

Sounds like a good option.
"...but a page is a page. It's either there or it isn't"
In the case of Valeting.php it wouldn't be there.
I'm aiming for a memorable URL.
I can't expect my users to remember the case of a memorable URL as well.
So where you say I could send everything to a php subroutine, this would definitely be an option to consider.

g1smd

4:34 pm on Aug 21, 2011 (gmt 0)




If you allow multiple URLs to deliver the exact same content (through a RewriteRule configured to deliver an internal rewrite) then you are creating a Duplicate Content problem.

What you should actually do is redirect incorrectly cased URL requests (with a RewriteRule configured to deliver a redirect) to the correct URL, and then, and only then, deliver the content. You should also redirect lower-case .php requests to the correct extensionless version.

This is a multiple step problem. The first step is to ensure that links within your site link only to the lower-case version of the URL and that .php does not appear in those links.

Next you install a redirect script, a few lines of PHP, that sends a 301 redirect to the correct URL when it is invoked. This script performs the "to lower case" function and the removal of the .php of the requested URL and sends both a 301 HEADER and a location HEADER with the new URL in it.

You invoke that PHP script using a RewriteRule that detects if THE_REQUEST contains any upper-case characters in the path part of the URL. Those upper-case requests (both with and without .php in the request) are then internally rewritten to that PHP script.

You also install a redirect (using another RewriteRule) such that if fully lower-case external URL requests are received with a .php extension they are 301 redirected to the correct URL without the .php extension included.

Finally you have the normal non-www to www 301 canonicalisation rule in another RewriteRule set.

That's the redirects dealt with.

The last part of the puzzle is to internally rewrite lower-case extensionless requests to the correct server-internal location to actually deliver the content.

Broken down into simply defined problems to solve, each part becomes simple to code and test. The whole thing is probably 12 lines of mod_rewrite code and 20 lines of PHP code.
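
The mod_rewrite half of that plan might be sketched roughly as follows. This is an untested outline under several assumptions: the page scripts on disk are lower-case (valeting.php and so on), page names contain only lower-case letters, digits and hyphens, every top-level .php file is one of these pages, and fix-case.php is a hypothetical name for the PHP redirect script described above.

```apache
RewriteEngine On
RewriteBase /

# 1. Any upper-case letter in the requested path: hand the request to
#    the redirect script (hypothetical name), which lower-cases the
#    path, strips .php, and sends the 301 itself.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/[^?\s]*[A-Z]
RewriteRule ^ fix-case.php [L]

# 2. Lower-case request that still carries .php: 301 to the
#    extensionless URL. Testing THE_REQUEST keeps the internal rewrite
#    in step 4 from triggering this rule. The trailing "?" drops the
#    original query string.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/([a-z0-9-]+)\.php[?\s]
RewriteRule ^ http://www.mysite.com/%1? [R=301,L]

# 3. Non-www to www canonicalisation.
RewriteCond %{HTTP_HOST} ^mysite\.com(:80)?$ [NC]
RewriteRule ^(.*)$ http://www.mysite.com/$1 [R=301,L]

# 4. Lower-case extensionless request: internal rewrite to the script
#    that actually delivers the content.
RewriteRule ^([a-z0-9-]+)$ $1.php?subtype=$1 [L]
```

The redirects come first and the internal rewrite last, so a request is only rewritten once it already uses the canonical URL. Step 1 as written catches every path containing an upper-case letter, including images or stylesheets, so it only suits a site whose URLs are all lower-case.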


Be sure you fully understand the difference between a redirect and a rewrite (even though both use the RewriteRule syntax). They are very different things, even though the coding for each is only different by a few characters.