Static URL Question

Based on an article authored by jdMorgan

         

max4

3:58 pm on Jul 21, 2009 (gmt 0)

10+ Year Member



Hello,

I recently read an excellent article by jdMorgan entitled "Changing Dynamic URLs to Static URLs". I still struggle with writing code for .htaccess, so please bear with me if my questions seem overly simple.

Something to keep in mind: I'm testing these methods on a production server that uses robots.txt to prevent search engine spiders from crawling its pages.

I will post what I have created, again based on jdMorgan's article, and then attempt to describe my difficulties.

[fixed]
The code:
# Internal
RewriteRule ^([^/]+)/([^/]+)/?$ /profile.php?username=$1 [L]
# External
RewriteCond %{THE_REQUEST} ^[A-Za-z0-9_]{3,9}\ /profile\.php\?username=([^\ ]+)\ HTTP/
RewriteRule ^profile\.php$ http://www.example.com/%1/%2? [R=301,L][/fixed]

The first problem I encountered, and corrected, was the selective loading of my CSS documents. If I entered http://www.example.com/username into the address bar, the CSS documents loaded fine, but if I entered http://www.example.com/username/ into the address bar, they refused to load. I don't know what caused this behavior, but as jdMorgan indicated in his article, server-relative links and canonical links corrected the problem.

However, neither server-relative nor canonical links made my images display. This is the first issue that I cannot resolve. Images are set up like this:

[fixed]
<img src="image/logo.gif">[/fixed]

I've tried

[fixed]
<img src="[b]/[/b]image/logo.gif">
<img src="[b]http://www.example.com/[/b]image/logo.gif">[/fixed]

Neither displayed my images.

Furthermore, my intention was to modify only the profile section of my site. However, the code I supplied above seems to affect my entire site and not just profile.php. Images across my entire site no longer load. What I find strange is that my CSS documents load just fine in their page-relative form (in pages other than profile.php), which led me to attempt the following:

[fixed]
<link rel="canonical" href="http://www.example.com/" />[/fixed]

But still, no effect. Interestingly, my PHP includes were not affected by the change. Any ideas or suggestions would be much appreciated.
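
One possible cause of the missing images, offered only as a guess because the thread does not give the file layout: in a document-root .htaccess file, the pattern ^([^/]+)/([^/]+)/?$ matches any URL-path with exactly two segments, and image/logo.gif is such a path. Requests for the images may therefore be rewritten to profile.php along with the profile URLs. The sketch below keeps the original rule but skips it for requests that map to an existing file or directory. The location of the .htaccess file and the existence of /image/logo.gif on disk are assumptions.

```apache
# Sketch only: assumes this .htaccess sits in the document root and that
# /image/logo.gif exists on disk (neither is confirmed in the thread).
RewriteEngine On

# Skip the rewrite when the request maps to a real file or directory,
# so image/logo.gif is served as-is instead of being sent to profile.php.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+)/([^/]+)/?$ /profile.php?username=$1 [L]
```

A request with only one path segment, such as a hypothetical /style.css, does not match the two-segment pattern at all, which would be consistent with stylesheets still loading on other pages.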

Thank you,
Max

jdMorgan

9:25 pm on Jul 21, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



There is a fundamental flaw here, and one that must be resolved before proceeding.

In this rule:

RewriteRule ^([^/]+)/([^/]+)/?$ /profile.php?username=$1 [L]

you are taking the URL-path "/a/b/" or "/a/b" and using "a" as the username. But "b" is ignored and discarded.

Then you come along later and try to redirect in the opposite direction with:

RewriteCond %{THE_REQUEST} ^[A-Za-z0-9_]{3,9}\ /profile\.php\?username=([^\ ]+)\ HTTP/
RewriteRule ^profile\.php$ http://www.example.com/%1/%2? [R=301,L]

but the problem here is that you have only the username in the query string to use to build "/a", but you have no information whatsoever to use to build "/b".

So this won't work. The "transformation" between the query-string form and the directory-variable form must be a complete mirror image. Otherwise, the process is not reversible.
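
As a rough illustration of what a mirror-image pair can look like, here is a minimal sketch in which the static URL carries exactly one variable, the username, so each direction can be rebuilt from the other. The allowed username characters (letters, digits, underscore, hyphen) and the placement in a document-root .htaccess file are assumptions, not details from this thread.

```apache
# Sketch only: the username character set below is hypothetical.
RewriteEngine On

# External: a direct client request for /profile.php?username=NAME
# is redirected to /NAME. THE_REQUEST holds the original request line,
# so this does not fire after the internal rewrite below.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /profile\.php\?username=([A-Za-z0-9_-]+)\ HTTP/
RewriteRule ^profile\.php$ http://www.example.com/%1? [R=301,L]

# Internal: /NAME is mapped back to the script. Every captured value is
# used, so nothing is discarded. Real directories are left alone.
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([A-Za-z0-9_-]+)$ /profile.php?username=$1 [L]
```

The external redirect is placed first here, a common ordering that keeps redirects from being applied to a path that has already been rewritten internally. The trailing "?" on the redirect target clears the original query string.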

Also, discarding or ignoring any part of the URL-path creates an exploit vulnerability on your site: nothing stops me (Mean Mister Competitor) from linking to "example.com/username/this-site-is-a-fraud-and-is-run-by-a-criminal" and hundreds of other variations, intended to cause your page to lose ranking because of duplicate content (the same page returned for many different URL requests).

So before proceeding to correct any coding problems, this underlying problem needs to be addressed.
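
One way to limit the number of URLs that return the same page is to accept a single canonical form and redirect the obvious variants to it. The sketch below handles only the trailing-slash variant mentioned in the first post. The username character set is hypothetical.

```apache
# Sketch only: assumes usernames consist of letters, digits, underscores
# and hyphens (an assumption, not stated in the thread).
RewriteEngine On

# Redirect /NAME/ to /NAME so each profile has exactly one URL.
# Real directories are skipped, because mod_dir adds the slash back to them.
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([A-Za-z0-9_-]+)/$ http://www.example.com/$1 [R=301,L]
```

If the internal rewrite is likewise anchored to a single path segment, a request such as /username/extra-text matches no rule and falls through to Apache's normal 404 handling, provided no such file exists.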

Also, there are other errors in the code, but it's impossible to make any suggestions since we know neither the URL-paths you are testing with, nor the filepaths to which those URLs are intended to resolve. Both are necessary information.

Jim

max4

12:37 am on Jul 22, 2009 (gmt 0)

10+ Year Member



Thank you, Jim, for the article and for your reply. I learned quite a bit from your article, and I was able to resolve the problem by reading the Apache mod_rewrite manual, something I should have done a long time ago.

The regular expression in the rewrite rule is matched against the requested URL-path, and when it matches, the path is replaced with the substitution supplied in the second part of the rule. Knowing this, I was able to create a regular expression that matched exactly what I was trying to replace (the characters that I allow in a username). What I failed to understand earlier was that the second part (the substitution) of the rewrite rule is what replaces the first part, and not the other way around.
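
For readers following along, the parts of a rule described above can be annotated roughly as follows. This is a hypothetical example and not the rule Max ended up with: the allowed characters and the 3 to 20 length limit are assumptions.

```apache
# Anatomy of a rule (hypothetical example):
#
#   RewriteRule  <pattern>  <substitution>  [flags]
#
# pattern:      regex tested against the requested URL-path
#               (in .htaccess, without the leading slash)
# substitution: what the request is rewritten to when the pattern matches;
#               $1 is the text captured by the first parenthesized group
# flags:        [L] stops processing further rules in this pass
RewriteEngine On
RewriteRule ^([A-Za-z0-9_]{3,20})$ /profile.php?username=$1 [L]
```

Because the pattern allows no dots or slashes, the rewritten path profile.php does not match it again, so the rule does not loop. A real directory whose name fits the pattern would still be caught, so a name check or a directory test may be needed in practice.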

In fact, just disregard the attempt in my first post. I was writing without knowledge but full of frustration. As far as exploit vulnerabilities are concerned, I take care of that in PHP and throw a 404 error when something unusual is called through the URL bar (whether in a GET variable or otherwise), and Apache does a nice job of tossing the error for whatever I miss. "example.com/username/this-site-is-a-fraud-and-is-run-by-a-criminal" throws a 404 error.

If you have documentation regarding exploit vulnerabilities on hand, I would be very interested in reading it. If it's not too much trouble, of course.

Thanks again Jim,
Max