Forum Library, Charter, Moderators: Ocean10000 & incrediBILL & phranque

Apache Web Server Forum

Rewrite Working, Redirect Not

 1:19 pm on Apr 30, 2009 (gmt 0)

Hey Guys, I have a friend's site with some crazy CMS on it that generates horrible URLs. I have managed to at least make them static, but I can't get the redirect working (so the SE-friendly URLs work, but they are not being accessed except by manually typing them in).

Here is what I thought was right, although the last change I made caused a 500 error. Go me!

Options +FollowSymLinks
RewriteEngine on
RewriteRule external-page-(.*)\.htm$ external.php?page=$1

RewriteCond %{THE_REQUEST} ^([a-z+]+[a-z+])\.htm /external\.php\?page=[a-z+]\ HTTP/
RewriteRule ^external\.php$ http://www.example.com/external-page-%1.htm? [R=301,L]

[edited by: jdMorgan at 3:53 pm (utc) on April 30, 2009]
[edit reason] example.com [/edit]



 3:18 pm on Apr 30, 2009 (gmt 0)

Please provide an example of a static URL and its corresponding dynamic filepath and dynamic URL.

The likely cause of your 500 error is an invalid/extra space in your RewriteCond pattern, but the code has other problems, so we need to see both URLs and the script filepath here.



 3:29 pm on Apr 30, 2009 (gmt 0)


These are the dynamic and static versions. The "htm" was his request, but it can easily be left as php or turned into /contact_us/, whichever will work best.

Thanks (for helping again) jdMorgan

[edited by: jdMorgan at 3:54 pm (utc) on April 30, 2009]
[edit reason] example.com [/edit]


 3:51 pm on Apr 30, 2009 (gmt 0)

Options +FollowSymLinks
RewriteEngine on
# Externally redirect direct client requests for dynamic URL to corresponding static URL
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /external\.php\?page=([^&\ ]+)\ HTTP/
RewriteRule ^external\.php$ http://www.example.com/external-page-%1.htm? [R=301,L]
# Internally rewrite static URL requests to dynamic script filepath
RewriteRule ^external-page-(.+)\.htm$ external.php?page=$1 [L]

Note that the value of %{THE_REQUEST} in this case will be something like
GET /external.php?page=contact_us HTTP/1.1

This is exactly the same as the client request string that you will see in your raw server access log file.
Hopefully, that will explain the RewriteCond pattern.
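To make the pattern concrete, here is a quick sketch in Python (the request lines are hypothetical examples, but the regex is the same one used in the RewriteCond above):

```python
import re

# The RewriteCond pattern from above, applied to sample request lines
pattern = re.compile(r'^[A-Z]+\ /external\.php\?page=([^&\ ]+)\ HTTP/')

# A simple request captures the "page" value:
print(pattern.match('GET /external.php?page=contact_us HTTP/1.1').group(1))
# -> contact_us

# An extra name/value pair, or a blank value, stops the match entirely:
print(pattern.match('GET /external.php?page=contact_us&id=7 HTTP/1.1'))
# -> None
print(pattern.match('GET /external.php?page= HTTP/1.1'))
# -> None
```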

Note that if any other name/value pairs precede or follow the "page=contact_us" name/value pair in the query string attached to the requested /external.php URL-path, the rule will not be invoked. If additional parameters are a possibility, then the code must be modified to allow for them and to handle them correctly.
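As one illustration only (a sketch, assuming the parameters always arrive in the fixed order "page" then "id", joined by "&", and that both are always present):

```apache
# Sketch: redirect /external.php?page=contact_us&id=7
# to /external-page-contact_us-7.htm -- assumes a fixed
# parameter order and that "id" is always present
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /external\.php\?page=([^&\ ]+)&id=([^&\ ]+)\ HTTP/
RewriteRule ^external\.php$ http://www.example.com/external-page-%1-%2.htm? [R=301,L]
# Internally rewrite back, joining the pairs with "&" (not a second "?")
RewriteRule ^external-page-([^-]+)-([^.]+)\.htm$ external.php?page=$1&id=$2 [L]
```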

Also, the rule will not be invoked if the "page=" value is blank.


[edited by: jdMorgan at 3:55 pm (utc) on April 30, 2009]


 12:37 am on May 1, 2009 (gmt 0)

*** Also, the rule will not be invoked if the "page=" value is blank. ***

... and that means your script needs to return "404 Not Found" otherwise this URL will return a blank CMS page with a "200 OK" status.
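Alternatively, one way to catch the blank value before it ever reaches the script (a sketch only, assuming Apache 2.2 or later, where a 4xx status is supported in the R flag):

```apache
# Sketch: if the query string is exactly "page=" (blank value),
# answer 404 directly from mod_rewrite instead of running the CMS
RewriteCond %{QUERY_STRING} ^page=$
RewriteRule ^external\.php$ - [R=404,L]
```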


 7:11 am on May 1, 2009 (gmt 0)

Legend, all working, thanks guys. The only bit I don't get is the & in the expression. I have read this thread a few times [webmasterworld.com...] but haven't seen it before; what does it do?

Regarding the URL not being there, is there a way to get around this? (Not that I can see it happening, but I like to cover my bases.)

Just for my own knowledge more than anything: if the pages did have secondary parameters, how would you modify the code? Write an additional section, or change the current one to something like this:

# Externally redirect direct client requests for dynamic URL to corresponding static URL
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /external\.php\?page=([^&\ ]+)/?id=([^&\ ]+)\ HTTP/
RewriteRule ^external\.php$ http://www.example.com/external-page-%1-%2.htm? [R=301,L]

RewriteRule ^external-page-(.+)-(.+)\.htm$ external.php?page=$1?id=$2 [L]


 2:51 pm on May 1, 2009 (gmt 0)

That depends on whether the secondary parameters are always present or only sometimes present, whether they are always in the same order or not, and various other factors. So it's a lot easier to address the question given specific and real requirements.

There is no "&" in the pattern, actually. What is there is two instances of "NOT an ampersand". It's mostly an efficiency thing, but it also prevents mis-operation. The pattern "page=([^&\ ]+)\ HTTP/" says: Match the word "page" followed by an equals sign, then match and capture one or more characters that are neither an ampersand nor a space, stopping when you find either one; then match a space followed by "HTTP", a slash, and then anything else or nothing (mod_rewrite won't care what follows that slash, because there is no end-anchor on the pattern).

The purpose is to make the compare and capture of the "page=" value as efficient as possible using POSIX regular expressions.

You will see a lot of the ambiguous and greedy ".*" and ".+" patterns in regular-expression examples on the Web. You will also see that I almost never use them, because when possible, a much more specific pattern should be used for the sake of efficiency. There are many cases where large numbers of rules with multiple ambiguous subpatterns contribute significantly to the need for an early server upgrade; bad pattern coding can seriously affect server performance. So use the most specific patterns you can.

In case that's obscure: I'm saying do not use "(.+)-(.+)\.htm$"; use "([^-]+)-([^.]+)\.htm", or even something like "(([^-]+)(-[^-]+)*)-(([^.]+)(\.[^.]+)*)\.htm" if needed to allow for multiple hyphens and periods in the requested URL.
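The difference is easy to see with a quick sketch in Python (the URL is a made-up example with more than one hyphen):

```python
import re

# A hypothetical static URL containing more than one hyphen
url = 'external-page-contact-us-3.htm'

# Greedy subpatterns: the first (.+) grabs as much as it can,
# so the split falls at the LAST hyphen...
greedy = re.match(r'^external-page-(.+)-(.+)\.htm$', url)
print(greedy.groups())    # ('contact-us', '3')

# ...while [^-]+ stops at the FIRST hyphen, and does so without
# the backtracking the greedy version needs
specific = re.match(r'^external-page-([^-]+)-([^.]+)\.htm$', url)
print(specific.groups())  # ('contact', 'us-3')
```

So beyond efficiency, the two patterns can capture different things once multiple hyphens are in play, which is why the longer alternation pattern is needed when that case matters.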


WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved