Forum Library, Charter, Moderators: Ocean10000 & incrediBILL & phranque

Apache Web Server Forum

    
Faceted Browse: Friendly URL Nightmare, Too Many Possible Combinations
chopin2256




msg:4406854
 6:56 pm on Jan 14, 2012 (gmt 0)

I've created a faceted browse on my site in PHP to help composers browse and drill down through results. My goal is to create friendly URLs for this directory, but this is a challenge due to the nature of the functionality: there are too many possible combinations. For demonstration purposes I will provide a link. I am in no sense trying to advertise my site:

www.youngcomposers.com/page/classical-music-directory/

You will see that there are 4 main categories:

Category (?field_cat=)
Genre (?field_1=)
Sub Genre (?field_2=)
Form (?field_3=)

Furthermore, there are other attributes to narrow down navigation:

Pagination Results (?page=)
Unit of Time (?field_date=)
By Comments (?field_comments=)
By Composer (?field_composer=)

I've successfully rewritten the URLs for Category, Genre, Sub Genre and Form, capturing all the possible combinations (for example, someone may choose Genre first, then Form second, then Category third, and mod_rewrite has to capture this combination). But now I need to paginate through the results and narrow them by unit of time, by comments, and by composer. Capturing every combination seems too tedious, and there must be a better way. Here is what I have so far to rewrite Category, Genre, Sub Genre and Form. Any insight into how to write these rules properly would be appreciated. Thanks!

#### Combo 4
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-c1.*-f1.*-f2.*-f3).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_cat=$1&field_1=$2&field_2=$3&field_3=$4 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-c1.*-f1.*-f3.*-f2).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_cat=$1&field_1=$2&field_3=$3&field_2=$4 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-c1.*-f2.*-f3.*-f1).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_cat=$1&field_2=$2&field_3=$3&field_1=$4 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-c1.*-f2.*-f1.*-f3).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_cat=$1&field_2=$2&field_1=$3&field_3=$4 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-c1.*-f3.*-f1.*-f2).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_cat=$1&field_3=$2&field_1=$3&field_2=$4 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-c1.*-f3.*-f2.*-f1).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_cat=$1&field_3=$2&field_2=$3&field_1=$4 [QSA]

RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f1.*-c1.*-f2.*-f3).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_1=$1&field_cat=$2&field_2=$3&field_3=$4 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f1.*-c1.*-f3.*-f2).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_1=$1&field_cat=$2&field_3=$3&field_2=$4 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f1.*-f2.*-f3.*-c1).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_1=$1&field_2=$2&field_3=$3&field_cat=$4 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f1.*-f2.*-c1.*-f3).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_1=$1&field_2=$2&field_cat=$3&field_3=$4 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f1.*-f3.*-c1.*-f2).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_1=$1&field_3=$2&field_cat=$3&field_2=$4 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f1.*-f3.*-f2.*-c1).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_1=$1&field_3=$2&field_2=$3&field_cat=$4 [QSA]

RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f2.*-c1.*-f1.*-f3).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_2=$1&field_cat=$2&field_1=$3&field_3=$4 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f2.*-c1.*-f3.*-f1).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_2=$1&field_cat=$2&field_3=$3&field_1=$4 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f2.*-f1.*-f3.*-c1).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_2=$1&field_1=$2&field_3=$3&field_cat=$4 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f2.*-f1.*-c1.*-f3).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_2=$1&field_1=$2&field_cat=$3&field_3=$4 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f2.*-f3.*-c1.*-f1).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_2=$1&field_3=$2&field_cat=$3&field_1=$4 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f2.*-f3.*-f1.*-c1).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_2=$1&field_3=$2&field_1=$3&field_cat=$4 [QSA]

RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f3.*-c1.*-f1.*-f2).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_3=$1&field_cat=$2&field_1=$3&field_2=$4 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f3.*-c1.*-f2.*-f1).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_3=$1&field_cat=$2&field_2=$3&field_1=$4 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f3.*-f1.*-f2.*-c1).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_3=$1&field_1=$2&field_2=$3&field_cat=$4 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f3.*-f1.*-c1.*-f2).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_3=$1&field_1=$2&field_cat=$3&field_2=$4 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f3.*-f2.*-c1.*-f1).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_3=$1&field_2=$2&field_cat=$3&field_1=$4 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f3.*-f2.*-f1.*-c1).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_3=$1&field_2=$2&field_1=$3&field_cat=$4 [QSA]


#### Combo 3
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-c1.*-f1.*-f2).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_cat=$1&field_1=$2&field_2=$3 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-c1.*-f1.*-f3).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_cat=$1&field_1=$2&field_3=$3 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-c1.*-f2.*-f1).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_cat=$1&field_2=$2&field_1=$3 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-c1.*-f2.*-f3).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_cat=$1&field_2=$2&field_3=$3 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-c1.*-f3.*-f1).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_cat=$1&field_3=$2&field_1=$3 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-c1.*-f3.*-f2).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_cat=$1&field_3=$2&field_2=$3 [QSA]

RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f1.*-c1.*-f2).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_1=$1&field_cat=$2&field_2=$3 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f1.*-c1.*-f3).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_1=$1&field_cat=$2&field_3=$3 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f1.*-f2.*-c1).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_1=$1&field_2=$2&field_cat=$3 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f1.*-f2.*-f3).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_1=$1&field_2=$2&field_3=$3 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f1.*-f3.*-c1).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_1=$1&field_3=$2&field_cat=$3 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f1.*-f3.*-f2).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_1=$1&field_3=$2&field_2=$3 [QSA]

RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f2.*-c1.*-f1).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_2=$1&field_cat=$2&field_1=$3 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f2.*-c1.*-f3).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_2=$1&field_cat=$2&field_3=$3 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f2.*-f1.*-c1).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_2=$1&field_1=$2&field_cat=$3 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f2.*-f1.*-f3).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_2=$1&field_1=$2&field_3=$3 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f2.*-f3.*-c1).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_2=$1&field_3=$2&field_cat=$3 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f2.*-f3.*-f1).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_2=$1&field_3=$2&field_1=$3 [QSA]

RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f3.*-c1.*-f1).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_3=$1&field_cat=$2&field_1=$3 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f3.*-c1.*-f2).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_3=$1&field_cat=$2&field_2=$3 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f3.*-f1.*-c1).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_3=$1&field_1=$2&field_cat=$3 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f3.*-f1.*-f2).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_3=$1&field_1=$2&field_2=$3 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f3.*-f2.*-c1).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_3=$1&field_2=$2&field_cat=$3 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f3.*-f2.*-f1).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_3=$1&field_2=$2&field_1=$3 [QSA]


#### Combo 2
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-c1.*-f1).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)/$ /page/classical-music-directory/?field_cat=$1&field_1=$2 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-c1.*-f2).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)/$ /page/classical-music-directory/?field_cat=$1&field_2=$2 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-c1.*-f3).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)/$ /page/classical-music-directory/?field_cat=$1&field_3=$2 [QSA]

RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f1.*-c1).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)/$ /page/classical-music-directory/?field_1=$1&field_cat=$2 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f1.*-f2).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)/$ /page/classical-music-directory/?field_1=$1&field_2=$2 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f1.*-f3).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)/$ /page/classical-music-directory/?field_1=$1&field_3=$2 [QSA]

RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f2.*-c1).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)/$ /page/classical-music-directory/?field_2=$1&field_cat=$2 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f2.*-f1).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)/$ /page/classical-music-directory/?field_2=$1&field_1=$2 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f2.*-f3).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)/$ /page/classical-music-directory/?field_2=$1&field_3=$2 [QSA]

RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f3.*-c1).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)/$ /page/classical-music-directory/?field_3=$1&field_cat=$2 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f3.*-f1).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)/$ /page/classical-music-directory/?field_3=$1&field_1=$2 [QSA]
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-f3.*-f2).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)/$ /page/classical-music-directory/?field_3=$1&field_2=$2 [QSA]


#### Combo 1
RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*-c1.*?$ [NC]
RewriteRule ^classical-music-directory/(.*)/$ /page/classical-music-directory/?field_cat=$1 [QSA]

RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*-f1.*?$ [NC]
RewriteRule ^classical-music-directory/(.*)/$ /page/classical-music-directory/?field_1=$1 [QSA]

RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*-f2.*?$ [NC]
RewriteRule ^classical-music-directory/(.*)/$ /page/classical-music-directory/?field_2=$1 [QSA]

RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*-f3.*?$ [NC]
RewriteRule ^classical-music-directory/(.*)/$ /page/classical-music-directory/?field_3=$1 [QSA]

 

lucy24




msg:4406916
 11:09 pm on Jan 14, 2012 (gmt 0)

!

Well, you've got time to sort things out, because all those .* must be slowing your server to a crawl. Never use .* or .+ anywhere but the very end of a search string. The alternatives are

([^/]+/)

to capture exactly one directory plus its closing slash, or similarly

([^-]+) or ([^_]+) et cetera

to capture everything up to but-not-including some specified character. In your examples, the delimiters seem to be - and _.
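
A minimal sketch of how these negated character classes behave, using Python's re module purely as a stand-in for the regex engine behind mod_rewrite. The path and its segment names are invented for illustration.

```python
import re

# Hypothetical path; the segment names are made up for this example.
path = "music/orchestral-c1/page.html"

# ([^/]+/) captures exactly one directory plus its closing slash.
one_dir = re.match(r"([^/]+/)", path).group(1)

# ([^-]+) captures everything up to, but not including, the first hyphen.
up_to_hyphen = re.match(r"music/([^-]+)", path).group(1)

# (.*) followed by a slash runs to the LAST slash, not the first one.
dot_star = re.match(r"(.*)/", path).group(1)

print(one_dir)       # music/
print(up_to_hyphen)  # orchestral
print(dot_star)      # music/orchestral-c1
```

The negated classes stop at the first delimiter they meet, so there is nothing to back out of. The (.*) version has to overshoot to the end of the string and work backwards.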

Use example.com for the domain name; the rest can be anything you like. Apart from Forums rules, this ensures that nothing will turn into a clickable but no-longer-readable link.

I strongly suspect that your pages and pages of Rewrites can be collapsed into just a handful of rules if you put some careful thought into the design of the URL.

g1smd




msg:4406931
 12:17 am on Jan 15, 2012 (gmt 0)

.*(-f3.*-f2.*-c1).*
and
(.*)__(.*)__(.*)__(.*)

That's gotta be slowing your server by a factor of thousands.

g1smd




msg:4406943
 2:00 am on Jan 15, 2012 (gmt 0)

RewriteCond %{THE_REQUEST} ^.*classical-music-directory/.*(-c1.*-f1.*-f2.*-f3).*?$ [NC]
RewriteRule ^classical-music-directory/(.*)__(.*)__(.*)__(.*)/$ /page/classical-music-directory/?field_cat=$1&field_1=$2&field_2=$3&field_3=$4 [QSA]


simplifies to

RewriteRule ^classical-music-directory/([^/_-]+-c1)__([^/_-]+-f1)__([^/_-]+-f2)__([^/_-]+-f3)/$ /page/classical-music-directory/?field_cat=$1&field_1=$2&field_2=$3&field_3=$4 [QSA,L]

or similar.

chopin2256




msg:4407137
 10:23 pm on Jan 15, 2012 (gmt 0)

Thanks for the assistance; I've simplified my rules using your advice. I couldn't get [^/_-] to work, but (.*) does work fine. I don't notice any lag on my site due to this, possibly because the strings I am substituting are short anyway. For example, the distance from (.*) to .-c1 will be no more than 10-15 characters per substitution.

I've come to realize that there is most likely no way for me to get around the dozens of combination rewrite rules due to the nature of the faceted browse. I suppose this is the tradeoff of trying to write a user friendly dynamic multi-faceted directory, vs leaving the url structure alone and messy, but without any problems.

g1smd




msg:4407142
 10:43 pm on Jan 15, 2012 (gmt 0)

The (.*) pattern is greedy, promiscuous and ambiguous.

The first (.*) captures the entire URL string from there to the very end and then has to look for a hyphen "after the end". It then has to back up and retry to "find" a hyphen. After finding a hyphen it moves forward and finds that the next characters are "f3" not "c1". It now has to back up and retry to find the previous hyphen. After finding it, it moves forward and finds "f2", not "c1". It now has to back up and retry to find the previous hyphen. After finding it, it moves forward and finds "f1", not "c1". It now has to back up and retry to find the previous hyphen. Now and only now it finds "c1".

The second (.*) captures the entire URL string from there to the very end and then has to look for a hyphen "after the end". It then has to back up and retry to "find" a hyphen. After finding a hyphen it moves forward and finds that the next characters are "f3" not "f1". It now has to back up and retry to find the previous hyphen. After finding it, it moves forward and finds "f2", not "f1". It now has to back up and retry to find the previous hyphen. Now and only now it finds "f1".

This multi-step "back up and retry" exercise is repeated again for f2 and then again for f3.

This is a serious coding error, especially given there are several dozen other rules with the same major flaw.

It's not beyond reason to guess that some requests might cause your server to perform tens of thousands of "back off and retry" attempts for each incoming URL request.
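
Besides the cost, the ambiguity is easy to see in a small test. The sketch below uses Python's re module as a stand-in for mod_rewrite's regex engine, with an invented three-facet URL. It feeds that URL to the two-facet pattern from the "Combo 2" rules and to a bounded equivalent.

```python
import re

# Hypothetical three-facet request; the facet values are invented.
three_facets = "classical-music-directory/orchestral-c1__symphony-f1__sonata-f3/"

# Two-facet pattern as written in the "Combo 2" rules.
loose = re.compile(r"^classical-music-directory/(.*)__(.*)/$")

# Same shape, but each capture may not cross an underscore or a slash.
strict = re.compile(r"^classical-music-directory/([^_/]+)__([^_/]+)/$")

loose_groups = loose.match(three_facets).groups()
strict_match = strict.match(three_facets)

print(loose_groups)   # ('orchestral-c1__symphony-f1', 'sonata-f3')
print(strict_match)   # None
```

The (.*) version happily accepts a URL with the wrong number of facets and stuffs two of them into one capture, so the pattern alone cannot tell the cases apart. The bounded version simply does not match and lets the next rule have a go.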

The replacement rule I suggested above came with a "maybe". It will fail if there are multiple hyphens within each element beyond the ones immediately before c1, f1, f2 and f3. It will likely need adjusting.
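
A short sketch of that failure mode, again in Python's re module as a stand-in and cut down to two facets to keep it readable. The facet values are invented, and the "relaxed" pattern is only one possible adjustment, not necessarily the one the site needs.

```python
import re

# Two-facet version of the suggested rule: no hyphens allowed inside a value.
suggested = re.compile(r"^classical-music-directory/([^/_-]+-c1)__([^/_-]+-f1)/$")

# Possible adjustment: allow hyphens inside a value, still stop at "_" and "/".
relaxed = re.compile(r"^classical-music-directory/([^/_]+-c1)__([^/_]+-f1)/$")

# Hypothetical requests; "string-quartet" has a hyphen inside the value itself.
simple = "classical-music-directory/orchestral-c1__symphony-f1/"
hyphenated = "classical-music-directory/string-quartet-c1__symphony-f1/"

simple_ok = suggested.match(simple) is not None
hyphenated_ok = suggested.match(hyphenated) is not None
relaxed_groups = relaxed.match(hyphenated).groups()

print(simple_ok, hyphenated_ok)  # True False
print(relaxed_groups)            # ('string-quartet-c1', 'symphony-f1')
```

The relaxed class still backtracks a little (it overshoots past "-c1" and steps back), but only within one segment, since it can never cross the double underscore.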

The coding would be a bucket load easier if the c1, f1, f2 and f3 markers were at the start of each URL fragment not at the end.

Find "c1-" then capture until double underscore. Find "f1-" then capture until double underscore. Find "f2-" then capture until double underscore. Find "f3-" then capture until end.
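
As a rough sketch of that marker-first layout, here is the corresponding pattern tried out in Python's re module. The URL is invented, and the sketch assumes values never contain an underscore or a slash. Whether the marker stays part of the value passed to the script is a design choice; here it is dropped.

```python
import re

# Marker-first pattern: find the marker, then capture up to the next "__".
MARKER_FIRST = re.compile(
    r"^classical-music-directory/"
    r"c1-([^_/]+)__f1-([^_/]+)__f2-([^_/]+)__f3-([^_/]+)/$"
)

# Hypothetical request in the proposed layout.
path = ("classical-music-directory/"
        "c1-orchestral__f1-symphony__f2-string-quartet__f3-sonata/")

match = MARKER_FIRST.match(path)
fields = dict(zip(("field_cat", "field_1", "field_2", "field_3"), match.groups()))
print(fields)
```

Hyphens inside a value are no longer a problem, because the only thing that ends a capture is the underscore that starts the next delimiter.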

URL design should be one of the first steps of getting a new site designed, coded and online. It seems that it's often the last step, and merely exposes database calls as URLs without any normalisation of format, or sorting of parameters or elements.

One thing your site should do is this: if there's a request for f3 f2 f1 c1 the user should be redirected to c1 f1 f2 f3. Likewise all other non-canonical versions should be redirected.

Only requests for the canonical version of the URL should be rewritten to deliver the content.
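
One way to get there without dozens of rules is to hand the whole facet path to the script and let it do the sorting. The sketch below is an assumption about how that could look, written in Python rather than the site's PHP. The function names are hypothetical. The markers and parameter names come from the rules in the first post, and each value keeps its marker, as those rules do.

```python
from urllib.parse import urlencode

# Canonical facet order, with the query parameter each marker maps to
# (markers and parameter names are taken from the rules in the first post).
FACETS = [("c1", "field_cat"), ("f1", "field_1"), ("f2", "field_2"), ("f3", "field_3")]
BASE = "/classical-music-directory/"


def parse_facets(path):
    """Return {marker: segment} for a facet path, or None if it is malformed."""
    if not path.startswith(BASE) or not path.endswith("/"):
        return None
    tail = path[len(BASE):-1]
    known = {marker for marker, _ in FACETS}
    found = {}
    for segment in tail.split("__"):
        value, _, marker = segment.rpartition("-")
        if not value or marker not in known or marker in found:
            return None  # empty value, unknown marker, or duplicate marker
        found[marker] = segment
    return found


def resolve(path):
    """Decide what to do with a request: redirect, rewrite internally, or 404."""
    found = parse_facets(path)
    if found is None:
        return ("not_found", None)
    ordered = [(marker, param) for marker, param in FACETS if marker in found]
    canonical = BASE + "__".join(found[marker] for marker, _ in ordered) + "/"
    if path != canonical:
        return ("redirect", canonical)  # would be sent as a 301
    query = urlencode([(param, found[marker]) for marker, param in ordered])
    return ("rewrite", "/page/classical-music-directory/?" + query)


print(resolve("/classical-music-directory/sonata-f3__orchestral-c1/"))
print(resolve("/classical-music-directory/orchestral-c1__sonata-f3/"))
```

With this split, mod_rewrite would only need a single rule that passes everything under the directory to the script, and the order of the facets stops mattering to the server configuration.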

mark_roach




msg:4407293
 2:53 pm on Jan 16, 2012 (gmt 0)

I have similarly coded inefficient rules in my httpd.conf.

For example I have the following rule to 404 any request to a certain directory which has more than 4 further subdirectories

RewriteRule ^/dir1/.+/.+/.+/.+/.+$ /404 [R=404,L]

I think I can code it more efficiently as follows :

RewriteRule ^/dir1/[^/]+/[^/]+/[^/]+/[^/]+/.+ /404 [R=404,L]

(I think I should also change [^/]+ to [^/]* in order to catch consecutive slashes).
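
A quick check of the + versus * question, using Python's re module as a stand-in for the server's regex engine. The sample paths are invented.

```python
import re

# The rewritten rule, and the same rule with * in place of +.
plus = re.compile(r"^/dir1/[^/]+/[^/]+/[^/]+/[^/]+/.+")
star = re.compile(r"^/dir1/[^/]*/[^/]*/[^/]*/[^/]*/.+")

deep = "/dir1/a/b/c/d/file.html"     # more than four levels: should be blocked
shallow = "/dir1/a/b/c/file.html"    # only three levels: should be left alone
doubled = "/dir1/a//c/d/file.html"   # consecutive slashes (empty segment)

results = {
    name: (plus.match(p) is not None, star.match(p) is not None)
    for name, p in (("deep", deep), ("shallow", shallow), ("doubled", doubled))
}
print(results)
```

Both versions block the deep path and leave the shallow one alone. Only the * version matches the path with the doubled slash, because [^/]+ refuses an empty segment.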

But is there a neater and more efficient alternative ?

g1smd




msg:4407304
 3:17 pm on Jan 16, 2012 (gmt 0)

In .htaccess
RewriteRule ^dir1/([^/]+/){4}. - [F]
would do it.


I don't like the [R=404] notation at all.

I prefer
RewriteRule ^dir1/([^/]+/){4}. /not-exist [L]
where
/not-exist is a path that does not exist.


Yes, changing the + to * might also be useful here too.

lucy24




msg:4407355
 5:26 pm on Jan 16, 2012 (gmt 0)

{4,} ;)

I think I should also change [^/]+ to [^/]* in order to catch consecutive slashes

It would be nice to say that you're not obligated to code for malformed URLs-- but something elsewhere in your config file may be silently removing the duplicates. (No idea what mod, but I see it occasionally in logs where a request containing // but otherwise correct will lead to a 200.) The catch is that // can be seen as either a null directory-- two slashes with no [^/] between-- or as a superfluous slash that might come in the middle of a perfectly legitimate path.

Do those deeply-nested files really not exist, or are you just asking everyone-- including yourself-- to stay the ### out?

g1smd




msg:4407360
 5:39 pm on Jan 16, 2012 (gmt 0)

RewriteRule ^dir1/([^/]+/){4}.
will match anything after the last valid slash and cause the rule to trigger.

RewriteRule ^dir1/([^/]+/){4,}
is not anchored at the end, so it triggers as soon as four subdirectories, each with its closing slash, are present. That includes a request for exactly four subdirectories with nothing after the final slash, which is one level too early.

Using {4}. in this case is correct. :)
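
The two patterns can be compared side by side in a few lines. This is a sketch in Python's re module, which should treat these simple patterns the same way as the server's regex engine. The sample paths are invented and written without a leading slash, as they would appear in .htaccess.

```python
import re

four_then_something = re.compile(r"^dir1/([^/]+/){4}.")
four_or_more = re.compile(r"^dir1/([^/]+/){4,}")

samples = [
    "dir1/a/b/c/file.html",    # three subdirectories
    "dir1/a/b/c/d/",           # exactly four, nothing after the last slash
    "dir1/a/b/c/d/file.html",  # four, then a file
    "dir1/a/b/c/d/e/",         # five
]

# Map each sample to (matches {4}., matches {4,}).
outcome = {
    s: (four_then_something.match(s) is not None, four_or_more.match(s) is not None)
    for s in samples
}
for path, flags in outcome.items():
    print(path, flags)
```

In this test the only sample where the two disagree is the one with exactly four subdirectories and a trailing slash: {4}. leaves it alone because it needs at least one more character, while the unanchored {4,} already matches.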


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved