
Apache Web Server Forum

    
.htaccess mod rewrite issue
Experts Needed!
Roel




msg:3411190
 5:31 am on Aug 2, 2007 (gmt 0)

I hope some of the .htaccess experts can help with this one.

I had many URLs that looked like this:

example.com/subdir1/abc/
example.com/subdir1/xyz/
example.com/subdir1/
example.com/subdir2/abc/
example.com/subdir2/xyz/
example.com/subdir2/
example.com/subdir2/subdir3/abc/
example.com/subdir2/subdir3/xyz/
example.com/subdir2/subdir3/

That now look like this:

example.com/abc/subdir1/
example.com/xyz/subdir1/
example.com/subdir1/
example.com/abc/subdir2/
example.com/xyz/subdir2/
example.com/subdir2/
example.com/abc/subdir2/subdir3/
example.com/xyz/subdir2/subdir3/
example.com/subdir2/subdir3/

So ONLY the URLs containing abc or xyz have changed: abc or xyz has moved to the front. The other URLs have not changed.

How can I write a mod_rewrite rule so that all of these are changed automatically?

[edited by: Roel at 5:31 am (utc) on Aug. 2, 2007]

 

jdMorgan




msg:3411689
 5:09 pm on Aug 2, 2007 (gmt 0)

How you do this, and whether it's even possible to do it, depends on how much the "xyz" and "subdir" parts of the URL can change -- In other words, it depends on how many different cases must be handled.

The main problem is distinguishing a URL that needs to be rewritten from one that does not. Therefore, you have to pattern-match at least part of the URL based on prior knowledge of what its values can be, so that you can recognize that it needs to be rewritten.

Jim

Roel




msg:3413302
 4:01 am on Aug 4, 2007 (gmt 0)

Hi Jim

Thanks for your help.

There are around 10 possibilities for abc/xyz.

Let's say they are 111, 222, 333, 444, 555, 666, 777, 888, 999, 000.

Then how would I write this rewrite?

Thanks again!

jdMorgan




msg:3413912
 2:15 am on Aug 5, 2007 (gmt 0)

Well, it also depends on how the subdirs can change. These concepts of "degrees of freedom of variability", URL-classes, or URL taxonomy are a bit tough to get across. But the basis of the question lies in the capabilities of the regular-expressions-based pattern-matching used by mod_rewrite (and many modern scripting languages). Regular expressions can detect:
  • uppercase alphabetic characters [A-Z]
  • lowercase alphabetic characters [a-z]
  • mixed-case alphabetic characters [A-Za-z]
  • numerals [0-9]
  • punctuation [`~!@#$%^&*()_+\-={}\[\]\\:";'<>?,./]
  • any single character .
  • combinations of the above types [a-z0-9_]

    They can also detect quantities of those characters:

  • zero or one ?
  • zero or more *
  • one (literal character)
  • one or more +
  • any fixed number {3}
  • a minimum number {3,}
  • a maximum number {0,9}
  • a range of numbers between minimum and maximum inclusive {3,9}

    These capabilities, when combined, allow you to "classify" URLs. Just as an example, the pattern:
    ^[a-z]{2,8}[0-9]-p-[0-9]{4,}[abcABC]?/$
    describes a URL-path that starts with two to eight lowercase characters, followed by a single digit 0 through 9, followed by "-p-", followed by a minimum of 4 digits 0 through 9, followed by an optional character a, b, c, A, B, or C, and ending with a trailing slash.

    Any URL-path that matches this description will be accepted -- i.e. the rule based on this pattern will be invoked. Any URL-path that does not match this pattern will be rejected, and the rule won't run.
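
As a rough illustration of that accept/reject behavior, here is the same pattern dropped into a rule. This is only a sketch: it assumes the code lives in the .htaccess file of the web root, and /product.php is a made-up target, not something from this thread. The sample paths in the comments are likewise invented to show which requests would and would not trigger the rule.

```apache
RewriteEngine On
# Hypothetical target script; only the pattern comes from the post above.
# Would match:     widget7-p-1234a/    ab1-p-0001/
# Would not match: a1-p-1234/          (only one leading lowercase letter)
#                  widget7-p-123/      (fewer than four digits after -p-)
#                  Widget7-p-1234/     (uppercase letter, and no [NC] flag)
#                  widget7-p-1234a     (no trailing slash)
RewriteRule ^[a-z]{2,8}[0-9]-p-[0-9]{4,}[abcABC]?/$ /product.php [L]
```

Requests that fail the pattern simply fall through to whatever rules follow.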

    You must classify *all* of your URL-paths that need to be rewritten using this system. If there are too many URL-paths to consider, then you may wish to classify all of the URL-paths that *should not* be rewritten -- again using this system. If you cannot classify them this way, then the problem cannot be solved by using a generalized regular expressions pattern in mod_rewrite, and will have to be solved on a case-by-case basis, that is, one rule for each specific URL-path.
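
A sketch of the "classify what should *not* be rewritten" approach might look like the following. The directory names images and css are placeholders for whatever URL-classes you need to protect, and the code is assumed to sit in the web root's .htaccess. The first condition is a commonly used guard that is intended to skip requests that have already been through an internal rewrite, since a rule that swaps two path segments would otherwise match its own output and swap them back.

```apache
RewriteEngine On
# Only act on the original request, not on the result of a prior rewrite.
RewriteCond %{ENV:REDIRECT_STATUS} ^$
# Hypothetical exclusions: two-level paths under these directories stay put.
RewriteCond %{REQUEST_URI} !^/(images|css)/
# Everything else of the form <part1>/<part2>/ has its two parts swapped.
RewriteRule ^([^/]+)/([^/]+)/$ /$2/$1/ [L]
```

Whether this is practical depends on how short and stable the exclusion list is; if it keeps growing, classifying the URLs that *should* be rewritten is usually the safer direction.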

    So here's one level of classification. This is a literal substitution only. The pattern is literal and describes exactly one URL-path, so only that specific URL-path will be rewritten:

    RewriteRule ^subdir1/abc/$ /abc/subdir1/ [L]

    Now here's an almost-completely-generic rule -- one that is likely to rewrite far too many URLs, giving you undesired side effects:

    RewriteRule ^([^/]+)/([^/]+)/$ $2/$1/ [L]

    Here the numeric variables $1 and $2 refer back to (they "back-reference") the first and second parenthesized subpatterns. Each of those patterns "[^/]+" will match one or more characters that are not slashes.

    So if the URL-path "subdir1/abc/" is requested, then $1 becomes "subdir1" and $2 becomes "abc". The variables are then used on the right side to re-assemble these two URL-path-parts in reverse to form the new URL-path, "abc/subdir1/"

    However, this second rule will match *any* URL-path with the form
    <non-slash-characters> / <non-slash-characters> /
    which is probably not what you want. So you'll likely need to develop a more-specific pattern or patterns, based on your knowledge of the URL-classes you need to rewrite -- both now and (if you're actively adding URLs) in the future.
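
For the ten example values given earlier in the thread (111 through 000), one more-specific pattern could be sketched like this. It assumes the moved segment is always the *last* segment of the old URL-path, that none of the subdirectory names is itself one of those ten values, and that the code is in the web root's .htaccess.

```apache
RewriteEngine On
# $1 = everything before the final segment (one or more subdirs)
# $2 = the final segment, which must be one of the ten known values
# Old: subdir2/subdir3/111/   ->  New: /111/subdir2/subdir3/
# Paths such as subdir1/ or 111/subdir1/ do not end in one of the
# listed values, so they are left alone.
RewriteRule ^(.+)/(111|222|333|444|555|666|777|888|999|000)/$ /$2/$1/ [L]
```

Because the rewritten path no longer ends in one of the listed values, the rule should not match its own output. Changing [L] to [R=301,L] would turn the internal rewrite into an external redirect, which is usually what you want when the old URLs have permanently moved.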

    Jim

    [edited by: jdMorgan at 9:20 pm (utc) on Aug. 5, 2007]

    Roel




    msg:3414520
     2:36 am on Aug 6, 2007 (gmt 0)

    Hi Jim

    Wow, that was quite a bit. I started playing with it, but the issue for me is that I am already using WordPress. So this is my current .htaccess:

    <IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteBase /
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule . /index.php [L]
    </IfModule>

    I tried doing the following:

    <IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteBase /
    RewriteCond %{THE_REQUEST} ^.*/abc/.*$ [NC]
    RewriteRule ^([^/]+)/abc/$ abc/$1/ [L]
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule . /index.php [L]
    </IfModule>

    But that gives me a 404 on example.com/abc/

    Any ideas what I am doing wrong here?

    jdMorgan




    msg:3415128
     7:24 pm on Aug 6, 2007 (gmt 0)


    RewriteCond %{THE_REQUEST} ^.*/abc/.*$ [NC]
    RewriteRule ^([^/]+)/abc/$ abc/$1/ [L]

    In this case, the RewriteCond is redundant.

    Also, the rule will rewrite the client-requested URL-path "<something>/abc/" to the file /abc/<something>/

    Your test URL is missing the "<something>/" part, so this rule won't be applied.

    However, since the following rule should rewrite all requested URLs that don't resolve to an existing file or directory, passing them to WordPress, I don't see how you're getting a 404 unless the WordPress rewrite isn't working either, or this code is not in the right directory.
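
Putting the pieces together, a combined file might be sketched as below. This is an assumption-laden example, not a tested fix: (abc|xyz) stands in for the real list of moved segments, and the redirect rule uses [R=301,L] rather than a plain internal rewrite on the assumption that WordPress decides what to serve from the URL the client actually requested, so the browser has to be sent to the new address.

```apache
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
# Externally redirect old-style URLs (<something>/abc/) to the new
# layout (/abc/<something>/). No RewriteCond is needed: the pattern
# alone identifies the URLs to move.
RewriteRule ^(.+)/(abc|xyz)/$ /$2/$1/ [R=301,L]
# WordPress rules: hand anything that is not a real file or directory
# to index.php. Note the space between the test string and !-f / !-d.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
```

With this in place, a request for example.com/abc/ alone still does not match the first rule and goes straight to WordPress, so a 404 there would have to come from WordPress itself or from the file being in the wrong directory.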

    Jim
