Having read many different sites found through Google to get an overall understanding, I am trying to grasp how leading slashes work in the following rules.
I have seen the following example:
RewriteRule ^articles/(.*)\.html$ /articles?$1
Wouldn't this cause a double slash in the URL once it's redirected?
Let's say someone requests
www.domain.com/articles/123.html
it would then go to
www.domain.com//articles?123
I have run across a few sites that do this.
Some do it this way:
RewriteRule ^/articles/(.*)\.html$ /articles?$1
and some do it this way:
RewriteRule ^articles/(.*)\.html$ articles?$1
Which one is correct? What are the advantages and disadvantages of each of the above?
[edited by: Skhan00 at 6:24 pm (utc) on July 1, 2009]
Wouldn't this cause a double slash in the URL once it's redirected?
No, because that's an internal rewrite, not an external redirect. It takes a URL that matches the pattern on the left and gets the content from the internal filepath given on the right. The target is an internal filepath, not a URL.
The rule needs an [L] flag to be added at the end.
Using the leading slash means the file will be looked for in the web root.
The (.*) pattern is very inefficient, needing multiple backoff-and-retry operations. Use ([^.]+) or similar instead.
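The points above (the substitution replaces the whole matched path, and a negated character class is tighter than (.*)) can be checked outside Apache. This is only a minimal sketch, assuming Python's re module as a stand-in for Apache's regex engine; the rewrite() helper is hypothetical and models the rule as it would be written in a root .htaccess file:

```python
import re

# The pattern from the question, and the tighter variant suggested above.
GREEDY = re.compile(r"^articles/(.*)\.html$")
NEGATED = re.compile(r"^articles/([^.]+)\.html$")

def rewrite(pattern, url_path):
    """Return the internal target for url_path, or None if the rule does not match."""
    m = pattern.match(url_path)
    if m is None:
        return None
    # The substitution replaces the matched path entirely; nothing is prepended,
    # so no double slash can appear.
    return "/articles?" + m.group(1)

print(rewrite(GREEDY, "articles/123.html"))     # /articles?123
print(rewrite(NEGATED, "articles/123.html"))    # /articles?123
print(rewrite(GREEDY, "articles/1.2.3.html"))   # /articles?1.2.3
print(rewrite(NEGATED, "articles/1.2.3.html"))  # None
```

Both patterns agree on the normal request. The negated class simply refuses input with extra dots, and it never has to grab the whole string and back off to find the final ".html".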
some do it this way:
RewriteRule ^/articles/(.*)\.html$ /articles?$1
and some do it this way:
RewriteRule ^articles/(.*)\.html$ articles?$1
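If the code is located in /directory1/.htaccess, then "/directory1/" will be stripped off by the time the RewriteRule attempts to match it with a pattern, so the pattern must not start with a slash. In httpd.conf or a virtual host, nothing is stripped, and the pattern sees the full URL-path including the leading slash.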
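To make the prefix stripping concrete, here is a rough model of what the pattern gets to see in each context. It is a simplification (Apache really works from the mapped filesystem directory), and path_seen_by_rule() is a hypothetical helper written in Python purely for illustration:

```python
def path_seen_by_rule(url_path, htaccess_dir=None):
    """Return the string a RewriteRule pattern is matched against.

    htaccess_dir=None models server or virtual-host context (nothing stripped).
    Otherwise it is the URL directory that holds the .htaccess file.
    """
    if htaccess_dir is None:
        return url_path
    prefix = htaccess_dir.rstrip("/") + "/"
    if not url_path.startswith(prefix):
        raise ValueError("this .htaccess file does not apply to the request")
    # The per-directory prefix, including its trailing slash, is removed.
    return url_path[len(prefix):]

print(path_seen_by_rule("/articles/123.html"))                             # /articles/123.html
print(path_seen_by_rule("/articles/123.html", "/"))                        # articles/123.html
print(path_seen_by_rule("/directory1/articles/123.html", "/directory1/"))  # articles/123.html
```

So ^/articles/ can only match in server context, and ^articles/ can only match in .htaccess context.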
Jim
Jim, all my config files for redirects and rewrites go into my httpd subfolders that get loaded by Apache (not in .htaccess files).
So for my purpose, here is what I am trying to accomplish. Let me know if my method is good or bad.
Friendly URL:
http://www.example.com/cars/11010052000-P/8636/bmw-e46-m3.html
Unfriendly URL (actual dynamic url):
http://www.example.com/cars/control/prod/~pid=11010052000-P/~model=8636
RewriteRule ^/cars/([^.]+)/([^.]+)/([^.]+)\.html$ /cars/control/prod/~pid=$1/~model=$2 [NC,L]
[edited by: Skhan00 at 7:01 pm (utc) on July 1, 2009]
[edited by: jdMorgan at 7:22 pm (utc) on July 1, 2009]
[edit reason] example.com [/edit]
Friendly URL:
http://www.example.com/cars/11010052000-P/8636/bmw-e46-m3.html
Unfriendly URL (actual dynamic url):
http://www.example.com/cars/control/prod/~pid=11010052000-P/~model=8636
Your negative-matches need some tweaking:
RewriteRule ^/cars/([^/]+)/([^/]+)/[^./]+\.html$ /cars/control/prod/~pid=$1/~model=$2 [NC,L]
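The difference between the two sets of negative matches is easier to see with a request that has an extra path segment. A small sketch, again assuming Python's re as a stand-in for Apache's regex engine (re.IGNORECASE plays the part of [NC]); captures() is a hypothetical helper:

```python
import re

# The original pattern (groups exclude only dots) and the tweaked one
# (groups exclude slashes, so each group is exactly one path segment).
LOOSE = re.compile(r"^/cars/([^.]+)/([^.]+)/([^.]+)\.html$", re.IGNORECASE)
TIGHT = re.compile(r"^/cars/([^/]+)/([^/]+)/[^./]+\.html$", re.IGNORECASE)

def captures(pattern, path):
    """Return the captured groups, or None if the pattern does not match."""
    m = pattern.match(path)
    return m.groups() if m else None

good = "/cars/11010052000-P/8636/bmw-e46-m3.html"
print(captures(LOOSE, good))   # ('11010052000-P', '8636', 'bmw-e46-m3')
print(captures(TIGHT, good))   # ('11010052000-P', '8636')

extra = "/cars/a/b/c/d.html"
print(captures(LOOSE, extra))  # ('a/b', 'c', 'd')
print(captures(TIGHT, extra))  # None
```

With [^.]+ a group can swallow slashes, so a malformed URL still matches and $1 ends up holding "a/b". With [^/]+ the same request simply does not match the rule.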
Best practice is to rewrite *all* variable parts of the URL-path to your script, and to validate all of them against your database. If nothing can be found using the 'required' variables, then the script must return a 404. If an entry can be found, then validate the non-required URL-path elements against what's in that database entry, and if they do not match, generate a 301-Moved Permanently redirect to the corrected URL. So in this case, if the car in the <car>.html URL-path-part is not *exactly* "bmw-e46-m3", then you'd want to redirect the request. This will prevent both careless and malicious creation of bogus URLs, and the resulting duplicate-content problems.
You will also probably want your script to check that the dynamic path was requested as a result of your rewriterule, and not as a direct client request for the old dynamic URL. If a client directly requests the old unfriendly dynamic URL, then your script should generate a 301 redirect to the corresponding new friendly static URL.
The server variable %{THE_REQUEST} can be used to check the client's HTTP request line for this purpose (you may just want to pass it as a variable to your script for this purpose).
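As a rough illustration of the script-side checks described above, here is a sketch in Python. Everything in it is an assumption for the sake of the example: CATALOG stands in for the real database, handle() is a hypothetical entry point, and the request line is assumed to be passed in from %{THE_REQUEST}:

```python
import re

# Hypothetical stand-in for the product database: (pid, model) -> canonical name.
CATALOG = {("11010052000-P", "8636"): "bmw-e46-m3"}

# Pulls the URL-path out of a request line such as "GET /path HTTP/1.1".
REQUEST_LINE = re.compile(r"^[A-Z]{3,9} (\S+) HTTP/")

def handle(pid, model, the_request):
    """Return (status, location) for a request that reached the script."""
    m = REQUEST_LINE.match(the_request)
    if m is None:
        return (400, None)
    name = CATALOG.get((pid, model))
    if name is None:
        # The required variables do not identify a product.
        return (404, None)
    canonical = "/cars/{}/{}/{}.html".format(pid, model, name)
    if m.group(1) != canonical:
        # Wrong name in the friendly URL, or a direct request for the
        # unfriendly URL: either way, send the client to the one true URL.
        return (301, canonical)
    return (200, None)

ok = "GET /cars/11010052000-P/8636/bmw-e46-m3.html HTTP/1.1"
junk = "GET /cars/11010052000-P/8636/junk.html HTTP/1.1"
print(handle("11010052000-P", "8636", ok))    # (200, None)
print(handle("11010052000-P", "8636", junk))  # (301, '/cars/11010052000-P/8636/bmw-e46-m3.html')
```

Comparing what the client actually asked for against the single canonical URL covers both cases in one test: a bogus name in the friendly URL and a direct hit on the old dynamic URL.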
Jim
[edited by: jdMorgan at 7:23 pm (utc) on July 1, 2009]
In simple terms, if there is an incoming link to /cars/11010052000-P/8636/this-product-is-over-priced-junk.html and your site returns the same content for it, that is duplicate content. Instead, it should 301-redirect to the correct URL.
Is this how it should look?
Friendly URL: http://www.example.com/cars/11010052000-P/8636/bmw-e46-m3.html
RewriteRule ^/cars/([^/]+)/([^/]+)/[^./]+\.html$ /cars/control/prod/~pid=$1/~model=$2/~name=$3 [NC,L]
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /cars/control/prod/? [NC]
RewriteRule ^/cars/control/prod/~pid=([^/]+)/~model=([^/]+)/~name=([^/]+)$ /cars/$1/$2/$3\.html [R=301,NC,L]
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /cars/control/prod/~pid=([^/]+)/~model=([^/]+)/~name=([^/\ ]+)\ HTTP/
RewriteRule ^/cars/control/prod/~pid=[^/]+/~model=[^/]+/~name=[^/]+$ http://www.example.com/cars/%1/%2/%3\.html [NC,R=301,L]
I am not sure that the "name=" field will be present in your "unfriendly" URL, because you did not show it in your rule or in your unfriendly-URL example in a previous post. If it is not present, then this rule will not work. In fact, it won't be possible to do the redirect using .htaccess alone, and you will have to do it in your script -- by looking up the correct "name=" value in your database using the pid, and then doing a 301 redirect from within your script itself. However, you will have to check THE_REQUEST in your script just as shown in the rewriterule here, in order to prevent an 'infinite' loop resulting from interaction with your internal rewrite rule.
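The reason checking THE_REQUEST breaks the loop is that the request line never changes, even after the URL-path has been rewritten internally. The following is only a toy model in Python, not how Apache is implemented; run() and the three pattern names are hypothetical, and it assumes the "~name=" field is present in the unfriendly URL:

```python
import re

FRIENDLY = re.compile(r"^/cars/([^/]+)/([^/]+)/([^./]+)\.html$")
UNFRIENDLY = re.compile(r"^/cars/control/prod/~pid=[^/]+/~model=[^/]+/~name=[^/]+$")
# Matches only when the client itself asked for the unfriendly URL.
DIRECT = re.compile(
    r"^[A-Z]{3,9} /cars/control/prod/~pid=([^/]+)/~model=([^/]+)/~name=([^/ ]+) HTTP/"
)

def run(the_request, max_passes=5):
    """Apply the redirect rule and the internal rewrite until the path settles."""
    path = the_request.split(" ")[1]   # current URL-path; the_request stays fixed
    for _ in range(max_passes):
        if UNFRIENDLY.match(path):
            direct = DIRECT.match(the_request)
            if direct:
                return ("301", "/cars/{}/{}/{}.html".format(*direct.groups()))
            return ("serve", path)     # got here via the internal rewrite
        friendly = FRIENDLY.match(path)
        if friendly is None:
            return ("serve", path)
        path = "/cars/control/prod/~pid={}/~model={}/~name={}".format(*friendly.groups())
    raise RuntimeError("rewrite loop")

print(run("GET /cars/11010052000-P/8636/bmw-e46-m3.html HTTP/1.1"))
# ('serve', '/cars/control/prod/~pid=11010052000-P/~model=8636/~name=bmw-e46-m3')
print(run("GET /cars/control/prod/~pid=11010052000-P/~model=8636/~name=bmw-e46-m3 HTTP/1.1"))
# ('301', '/cars/11010052000-P/8636/bmw-e46-m3.html')
```

If the redirect rule tested only the current path instead of the request line, the first call would be redirected as soon as it had been rewritten, and the client would bounce between the two URLs.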
Jim
[edited by: jdMorgan at 4:07 pm (utc) on July 2, 2009]
This is what I have in my vhosts:
RewriteRule ^/cars/([^/]+)/([^/]+)/[^./]+\.html$ /cars/control/prod/~pid=$2/~model=$3/~name=$1 [NC,L]
This gives me a 404 error. I figured maybe there is something wrong with the rule, but if I change the [NC,L] to [R=301,NC,L] it does do a 301 redirect.
C:/www/beta.example.com/htdocs/cars/control/prod
The URL that it's internally rewriting to is not a folder structure but rather dynamic pages created by my running web app.
When I specify [R] it works fine, but without it, it does not.
What should the filepath be to point directly to your web app? Correct the substitution path in the RewriteRule accordingly, and it will work.
You've likely got another RewriteRule, Alias, or ScriptAlias directive that would map "/cars/control/prod/~pid=11010052000-P/~model=8636" to the actual script or application path, but that rule/alias is being executed *before* your rewriterule, and not after. Therefore, it won't apply to these requests. So the trick is to reduce the process from a two-step process to a single-step process by rewriting straight to the actual script or application path.
Jim
Here is the virtual host conf:
<VirtualHost *:80>
    ServerAdmin admin@example.com
    DocumentRoot "C:/www/beta.example.com/htdocs"
    ServerName beta.example.com
    ServerAlias www.beta.example.com
    ErrorLog "C:/www/beta.example.com/logs/error.log"
    CustomLog "C:/www/beta.example.com/logs/access.log" common
    DirectoryIndex index.cfm index.htm index.html
    <Directory />
        Options Indexes FollowSymLinks
        AllowOverride All
        Order allow,deny
        Allow from all
    </Directory>
    ProxyPreserveHost On
    proxyPass / ajp://localhost:8009/
    proxyPassReverse / ajp://localhost:8009/
    RewriteEngine On
    RewriteRule ^/cars/([^/]+)/([^/]+)/([^/]+)\.html$ /cars/control/prod/~pid=$3/~color=$2/~name=$1 [NC,L]
</VirtualHost>
Unfriendly URL: [beta.example.com...]
In my browser, I can get this to work; the web app picks it up just fine (this is the unfriendly URL).
Now when I put in the friendly URL:
[beta.example.com...]
I get the above Apache log error saying that the folder /cars/control/prod does not exist.
If I modify the RewriteRule to look like this
RewriteRule ^/cars/([^/]+)/([^/]+)/([^/]+)\.html$ [beta.example.com...] [P,NC,L]
With the proxy parameter and the full URL path, it works fine, but I'm not sure whether that's a good or bad way to do it. I am confused by what you said in your last post (I read it over and over and couldn't grasp what it meant).
Also, on this code here
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /cars/control/prod/? [NC]
RewriteRule ^/cars/control/prod/~pid=([^/]+)/~color=([^/]+)/~name=([^/]+)$ [beta.example.com...] [R=301,NC,L]
If I put that in, it causes an infinite loop. How can I properly test the condition to make sure it doesn't loop, so that the unfriendly URL gets 301'd to the friendly one? I've been trying to find a good tutorial on how to test this, with no luck.
[edited by: Skhan00 at 6:49 pm (utc) on July 14, 2009]
with the proxy parameter with the full url path,
RewriteRule ^/cars/([^/]+)/([^/]+)/([^/]+)\.html$ ajp://localhost:8009/cars/control/prod/~pid=$3/~color=$2/~name=$1 [NC,P]
In order to get ofbiz running on an already configured Apache server, I followed a tutorial to add
ProxyPreserveHost On
proxyPass / ajp://localhost:8009/
proxyPassReverse / ajp://localhost:8009/
Which allows me to use it on port 80, without any conflicts.
----
I tried the following:
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /cars/? [NC]
RewriteRule ^/cars/([^/]+)/([^/]+)/([^/]+)\.html$ ajp://localhost:8009/cars/control/prod/~pid=$3/~color=$2/~name=$1 [NC,L,P]
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /cars/control/prod/? [NC]
RewriteRule ^/cars/control/prod/~pid=([^/]+)/~color=([^/]+)/~name=([^/]+)$ /cars/$3/$2/$1\.html [R=301,NC,L]
This seems to work correctly. I can go from the friendly URL to the unfriendly one with an internal request,
and I can also 301 from the unfriendly to the friendly without it looping.
Is this method good or bad?
That worked without using the P parameter. Thanks.
That is _not_ a good idea, because you won't have a reverse proxy in that case.
Now to solve the RewriteCond to prevent the loop... I am still lost there.
Also, using [NC] in the first rule means that you can have multiple URLs resolving to the same content. This is "duplicate content" and not recommended, SEO-wise. If you really think you might get an incorrectly-cased request, then I'd suggest that you detect it and 301-redirect it to the properly-cased URL using a separate rule.
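To show what "detect it and redirect" could look like, here is a sketch in Python (re.IGNORECASE stands in for [NC]; case_redirect() is a hypothetical helper). Note that only the literal parts of the pattern, "/cars/" and ".html", are affected by case-insensitive matching, so only those are corrected and the captured segments are left exactly as requested:

```python
import re

# Exact-case pattern for the internal rewrite, and a case-insensitive twin
# that is used only to detect mis-cased requests.
EXACT = re.compile(r"^/cars/([^/]+)/([^/]+)/([^/]+)\.html$")
ANY_CASE = re.compile(EXACT.pattern, re.IGNORECASE)

def case_redirect(path):
    """Return the properly-cased URL to 301 to, or None if no redirect is needed."""
    if EXACT.match(path):
        return None                      # already canonical
    m = ANY_CASE.match(path)
    if m is None:
        return None                      # not one of our product URLs at all
    return "/cars/{}/{}/{}.html".format(*m.groups())

print(case_redirect("/Cars/11010052000-P/8636/bmw-e46-m3.HTML"))
# /cars/11010052000-P/8636/bmw-e46-m3.html
print(case_redirect("/cars/11010052000-P/8636/bmw-e46-m3.html"))
# None
```

With this split, the rule that does the internal rewrite can drop [NC], and every variant ends up at one URL.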
Jim
As far as not looping forever goes, how will it not loop forever without a condition?
If there is no condition, won't it go from unfriendly to friendly, then friendly to unfriendly, and so on?
jdMorgan: thanks for the update. So it should look like this?
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /cars/? [NC]
RewriteRule ^/cars/([^/]+)/([^/]+)/([^/]+)\.html$ ajp://localhost:8009/cars/control/prod/~pid=$3/~color=$2/~name=$1 [P]
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /cars/control/prod/? [NC]
RewriteRule ^/cars/control/prod/~pid=([^/]+)/~color=([^/]+)/~name=([^/]+)$ /cars/$3/$2/$1\.html [R=301,NC,L]
I deployed this in a live environment inside a subdomain for testing.
Session IDs are set as a hidden attribute between the server and client. We removed them from the URL for cleaner-looking URLs.
The problem here is that every time a friendly URL is triggered, the old session is not found, so a new session is created. This does not occur when browsing the site through unfriendly URLs.
Here is the rule:
#Rewrite for cars product
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /cars/? [NC]
RewriteRule ^/cars/([^/]+)/([^/]+)/([^/]+)\.html$ /carsusa/control/prod/~pid=$3/~color=$2/~name=$1 [L]
When this is done, I get a 404 error. I asked my host to take a look, and this was their reply:
----
The rewrite rule you wrote is correctly formed, and works. However,
the problem appears to be an issue regarding redirecting into the ofbiz
container. Here is what we see in the log:
[Fri Sep 18 21:45:04 2009] [error] [client IP.ADDRESS.XX] File does not
exist: /var/www/domains/example.com/beta/htdocs/carsusa
I suspect that apache is trying to find a file by the name generated by
the rewrite rule, disregarding the JkMount. I suspect that rewrite
rules are evaluated after JkMounts are checked, and the presence of a
JkMount is not checked afterward.
----
I then changed the rule to
#Rewrite for cars product
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /cars/? [NC]
RewriteRule ^/cars/([^/]+)/([^/]+)/([^/]+)\.html$ /carsusa/control/prod/~pid=$3/~color=$2/~name=$1 [L,PT]
and this worked, but then the issue with the sessions being lost started.
Note that your host identified the same problem that I did above: the execution order of mod_proxy and mod_rewrite.
Jim
Also, I tried putting the rewrite rules before and after the JkMount info; it didn't make a difference.
Carefully examining your client-server HTTP transactions using the "Live HTTP Headers" add-on for Firefox/Mozilla browsers (or a similar tool) may prove quite revealing in this matter.
Jim
I suspect that rewrite
rules are evaluated after JkMounts are checked, and the presence of a
JkMount is not checked afterward.
Nope, translate_name is not a RUN_ALL. The first module returning OK wins, others will never see the request. With PT, mod_rewrite returns DECLINED and so mod_jk has a chance to see the request.
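The "first module returning OK wins" behaviour can be pictured with a small simulation. This is a toy model in Python, not Apache source: the hook functions, DOCROOT, and JK_PREFIX are all hypothetical, the substitution is hard-coded, and it assumes mod_rewrite's hook runs before mod_jk's:

```python
OK, DECLINED = "OK", "DECLINED"
DOCROOT = "/var/www/htdocs"   # hypothetical document root
JK_PREFIX = "/carsusa/"       # hypothetical JkMount prefix

def mod_rewrite(req, pt):
    if not req["uri"].startswith("/cars/"):
        return DECLINED
    target = "/carsusa/control/prod"   # simplified substitution result
    if pt:
        req["uri"] = target            # hand the new URI on to later hooks
        return DECLINED
    req["filename"] = DOCROOT + target # treat the result as a file path
    return OK

def mod_jk(req):
    if req["uri"].startswith(JK_PREFIX):
        req["handler"] = "servlet container"
        return OK
    return DECLINED

def core(req):
    req["filename"] = DOCROOT + req["uri"]
    return OK

def translate_name(uri, pt):
    """Run the hooks in order and stop at the first one that returns OK."""
    req = {"uri": uri, "filename": None, "handler": "static file"}
    for hook in (lambda r: mod_rewrite(r, pt), mod_jk, core):
        if hook(req) == OK:
            break
    return req

print(translate_name("/cars/a/b/c.html", pt=False)["handler"])  # static file
print(translate_name("/cars/a/b/c.html", pt=True)["handler"])   # servlet container
```

Without PT the request is pinned to a file under the document root that does not exist, which matches the "File does not exist" log line. With PT the rewritten URI is still visible to the next hook, so the mount can claim it.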