I recently read an excellent article by jdMorgan entitled "Changing Dynamic URLs to Static URLs". I still struggle with writing code for .htaccess, so please bear with me if my questions seem overly simple.
Something to keep in mind: I'm testing these methods on a production server that uses robots.txt to keep search engine spiders from crawling its pages.
I will post what I have created, again based on jdMorgan's article, and then attempt to describe my difficulties.
[fixed]
The code:
# Internal
RewriteRule ^([^/]+)/([^/]+)/?$ /profile.php?username=$1 [L]
# External
RewriteCond %{THE_REQUEST} ^[A-Za-z0-9_]{3,9}\ /profile\.php\?username=([^\ ]+)\ HTTP/
RewriteRule ^profile\.php$ http://www.example.com/%1/%2? [R=301,L][/fixed]
The first problem I encountered, and corrected, was the selective loading of my CSS documents. If I entered http://www.example.com/username into the address bar, the CSS documents loaded fine, but if I entered http://www.example.com/username/ they refused to load. I don't know what caused this behavior, but as jdMorgan indicated in his article, server-relative and canonical links corrected the problem.
However, neither server-relative nor canonical links were able to display my images. This is the first issue that I cannot resolve. Images are set up as follows:
[fixed]
<img src="image/logo.gif">[/fixed]
I've tried:
[fixed]
<img src="[b]/[/b]image/logo.gif">
<img src="[b]http://www.example.com/[/b]image/logo.gif">[/fixed]
Neither displayed my images.
Furthermore, my intention was to modify only the profile section of my site. However, the code I supplied above seems to affect my entire site and not just profile.php: images across the entire site no longer load. What I find strange is that my CSS documents load just fine in their page-relative form (in pages other than profile.php), which led me to attempt the following:
[fixed]
<link rel="canonical" href="http://www.example.com/" />[/fixed]
But still, no effect. Interestingly, my PHP includes were not affected by the change. Any ideas or suggestions would be much appreciated.
Thank you,
Max
In this rule:
RewriteRule ^([^/]+)/([^/]+)/?$ /profile.php?username=$1 [L]
you match two path segments but pass only the first ($1) to the script; the second ($2) is discarded. Then you come along later and try to redirect in the opposite direction with:
RewriteCond %{THE_REQUEST} ^[A-Za-z0-9_]{3,9}\ /profile\.php\?username=([^\ ]+)\ HTTP/
RewriteRule ^profile\.php$ http://www.example.com/%1/%2? [R=301,L]
The condition captures only the username (%1), so %2 is always empty, and this won't work. The "transformation" between the query-string form and the directory-variable form must be a complete mirror image. Otherwise, the process is not reversible.
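As a rough sketch of what a mirrored pair could look like, assume (hypothetically) that a profile lives at a single path segment such as example.com/username, that usernames contain no slashes or dots, and that the code sits in the .htaccess file in the document root. None of these assumptions is confirmed in this thread.

```apache
RewriteEngine On

# External: redirect direct client requests for the dynamic URL
# to the static form. THE_REQUEST holds the original request line,
# so an internally rewritten request does not trigger this rule.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /profile\.php\?username=([^&\ ]+)\ HTTP/
RewriteRule ^profile\.php$ http://www.example.com/%1? [R=301,L]

# Internal: map the static form back to the script.
# Exactly one value is captured here and exactly one is emitted above.
RewriteRule ^([^/.]+)$ /profile.php?username=$1 [L]
```

Here the redirect emits exactly the one value that the internal rewrite consumes, so each rule reverses the other. Note that a pattern this broad also matches any other single-segment, dot-free URL on the site, so a real rule set would likely need a narrower pattern or exclusions.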
Also, discarding/ignoring any part of the URL-path creates an exploit vulnerability on your site, in that nothing stops me (Mean Mister Competitor) from linking to "example.com/username/this-site-is-a-fraud-and-is-run-by-a-criminal" and hundreds of other variations, all intended to cause your page to lose ranking because of duplicate content (the same page returned for many different URL-requests).
So before proceeding to correct any coding problems, this underlying problem needs to be addressed.
Also, there are other errors in the code, but it's impossible to make any suggestions since we know neither the URL-paths you are testing with, nor the filepaths to which those URLs are intended to resolve. Both are necessary information.
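Even without those details, one general observation can be sketched. The original pattern ^([^/]+)/([^/]+)/?$ matches any two-segment URL-path, including an asset path such as image/logo.gif, which would then be rewritten to profile.php with "image" as the username. That would be consistent with images failing across the whole site. A common guard, assuming the images exist as real files under the document root, is to skip the rewrite whenever the request resolves to an existing file or directory:

```apache
RewriteEngine On

# Leave requests for real files and directories alone
# (assumes assets such as image/logo.gif exist on disk).
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# The original rule, unchanged; it still discards $2, which is a separate problem.
RewriteRule ^([^/]+)/([^/]+)/?$ /profile.php?username=$1 [L]
```

The file-exists checks cost a filesystem lookup per request, so a narrower pattern that cannot match asset paths in the first place may be the better long-term fix.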
Jim
The regular expression in the rewrite rule checks the requested URL for the specified characters, and the request is then replaced with the substitution supplied in the second portion of the rule. Knowing this, I was able to create a regular expression that matches exactly what I was trying to replace (the characters that I allow in a username). What I failed to understand earlier was that the second part of the rewrite rule (the substitution) is what replaces the first part, and not the other way around.
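To illustrate that pattern-then-substitution order, here is a minimal annotated sketch. The character class (letters, digits, underscore) is only an assumed username alphabet, not one stated in this thread.

```apache
# General form:  RewriteRule <pattern> <substitution> [flags]
#   <pattern>      regex tested against the requested URL-path
#                  (in .htaccess, without the leading slash)
#   <substitution> where the request is sent when the pattern matches
RewriteEngine On

# Assumed username alphabet: A-Z, a-z, 0-9 and underscore (hypothetical).
# A request for /max_01 is served internally by /profile.php?username=max_01.
RewriteRule ^([A-Za-z0-9_]+)$ /profile.php?username=$1 [L]
```

Because the class excludes dots and slashes, a request for profile.php itself or for a path such as image/logo.gif does not match this pattern.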
In fact, just disregard the attempt in my first post. I was writing without knowledge but full of frustration. As far as exploit vulnerabilities are concerned, I take care of that in PHP and throw a 404 error when something unusual is requested through the URL bar (whether in a GET variable or otherwise), and Apache does a nice job of tossing the error for whatever I miss. For example, "example.com/username/this-site-is-a-fraud-and-is-run-by-a-criminal" throws a 404 error.
If you have documentation on exploit vulnerabilities on hand, I would be very interested in reading it, if it's not too much trouble, of course.
Thanks again Jim,
Max