htaccess problem

with many back-references

         

dqiria

3:15 pm on Sep 21, 2008 (gmt 0)

10+ Year Member



Hello, I have this code:
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.php\?category=([^&]+)&make=([^&]+)&min=([^&]+)&max=([^&]+)&min_year=([^&]+)&max_year=([^&]+)&garbeni=([^&]+)&lot=&s_cond=([^&]+)&out_rate=([^&]+)&color=([^&]+)&search_adv=%E1%83%AB%E1%83%94%E1%83%91%E1%83%9C%E1%83%90\ HTTP/
RewriteRule ^index\.php$ http://example.com/advanced_search/%1-%2-price-%3-%4-year-%5-%6-garbeni-%7-salon-%8-rate-%9-color-%10.html? [R=301,L]

The problem is at "%10". It is read as %1 followed by a literal 0, so I can't get it to work.
Any suggestions?

[edited by: jdMorgan at 4:30 pm (utc) on Sep. 21, 2008]
[edit reason] example.com [/edit]

jdMorgan

4:30 pm on Sep 21, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



There is a documented limit of 9 RewriteRule and RewriteCond back-references in mod_rewrite ($1-$9 and %1-%9), so you cannot use "%10". You will have to break this rewrite into two steps to make it work.
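
To illustrate the limit, here is a minimal sketch with a hypothetical two-parameter query string. The script name demo.php, the parameters a and b, and the /out/ target are made up for this example, and an .htaccess file in the document root is assumed. mod_rewrite reads only a single digit after "%" or "$", so "%10" expands as %1 followed by a literal "0":

```apache
# Sketch only: demo.php, a, b and /out/ are hypothetical names.
RewriteEngine On
RewriteCond %{QUERY_STRING} ^a=([^&]+)&b=([^&]+)$
# For /demo.php?a=x&b=y, "%10" is %1 ("x") plus a literal "0",
# so the redirect goes to /out/x0; there is no tenth capture to read.
RewriteRule ^demo\.php$ /out/%10? [R=302,L]
```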

And this brings in a further problem, in that multiple rewrites on the same URL trigger a known bug in Apache. So you'll likely have to use a "user variable" to avoid this. Here's one way to do it:


# Grab the first variable, save it in user variable "SavedVar1" and chain to next rule
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.php\?category=([^&]+)&[^\ ]+\ HTTP/
RewriteRule ^index\.php$ - [E=SavedVar1:%1,C]
#
# If chained, grab the other nine variables and then redirect using the var saved above
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.php\?category=[^&]+&make=([^&]+)&min=([^&]+)&max=([^&]+)&min_year=([^&]+)&max_year=([^&]+)&garbeni=([^&]+)&lot=&s_cond=([^&]+)&out_rate=([^&]+)&color=([^&]+)&search_adv=%E1%83%AB%E1%83%94%E1%83%91%E1%83%9C%E1%83%90\ HTTP/
RewriteRule ^index\.php$ http://example.com/advanced_search/%{ENV:SavedVar1}-%1-price-%2-%3-year-%4-%5-garbeni-%6-salon-%7-rate-%8-color-%9.html? [R=301,L]

The name of the user variable is arbitrary, but take care to avoid system-defined variable names.

By using chaining, we guarantee that the second rule is only executed if the first rule matches and is invoked. So there is no danger that the user var will be undefined if only a partial match on the requested query string occurs.

Jim

dqiria

3:06 pm on Sep 22, 2008 (gmt 0)

10+ Year Member



Thank you very much, it worked, but in the end I couldn't get the whole thing done.
You see, I want to rewrite this URI
[localhost...]

Into this:
[localhost...]

and I can't get it done...

jdMorgan

4:15 pm on Sep 22, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



So what is the problem, that you have 12 parameters and not 10, or that you have a problem with the whole concept?

One thing that may help you is to realize that you are not rewriting the dynamic URLs to the static URLs, you are redirecting the dynamic URL to the static URL -- and then only if the dynamic URL is directly requested by the client, and not as a result of an internal rewrite (This is critically-important in order to prevent an 'infinite' rewrite/redirect loop).

Rewrites and redirects are two very-different functions, and the code above represents only one-third of the required process if you wish to use "SEO-friendly" URLs. The three steps are:

  • Change the links on your pages (by changing your script, usually) to present static "SEO-friendly" URLs. This *defines* the new URLs on the Web, where users and search engines can find, index, and follow them.

  • Add code to mod_rewrite to internally rewrite those new friendly URLs -when requested from your server- to your script's filename, moving the name/value pairs from the 'virtual subdirectory' paths in the friendly URLs to the script's query string, so that the script can generate the 'next page' and serve it to your visitors.

  • Optionally, externally redirect direct client requests for the old, dynamic, "unfriendly" URLs to the new, friendly, static ones, so as to speed up the search engines' update of your URLs in their databases, pass PageRank from the old URLs to the new ones, and recover the traffic that would otherwise be lost from old bookmarks and links on other Web sites which are pointed to your old URLs.

    The code here represents only the third, final, and optional step.

    Specifically, you want to internally rewrite
    localhost/advanced_search/honda-all-price-0-500000-year-0-2009-garbeni-121000-lot-123456-salon-all-rate-all-color-all.html
    to
    /index.php?category=lamborghini&make=all&min=0&max=500000&min_year=0&max_year=2009&garbeni=121000&lot=123456&s_cond=all&out_rate=all&color=all&search_adv=search

    And you want to externally redirect
    /index.php?category=lamborghini&make=all&min=0&max=500000&min_year=0&max_year=2009&garbeni=121000&lot=123456&s_cond=all&out_rate=all&color=all&search_adv=search
    to
    localhost/advanced_search/honda-all-price-0-500000-year-0-2009-garbeni-121000-lot-123456-salon-all-rate-all-color-all.html

    In simple terms, you have decided to re-name your "cake" URLs to "pie" URLs. So wherever you linked to "cake" in the past, you must now link to "pie". Then you tell your server that if it gets a request for "pie" you really want to serve the same cake (script) that you served before. And you also tell your server that if it gets a request for "cake", it should tell the requesting client to ask for "pie" instead, so that everyone will start calling your "cake" by its new name, "pie."

    Having done this and changed the links on your pages, the rewrite 'reconnects' the new friendly, static URLs to your dynamic-page-generating script, and the redirect "corrects" old/obsolete unfriendly URLs on the Web.
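
    As a rough illustration of how the rewrite and the redirect fit together, here is a minimal sketch for a much smaller, hypothetical URL set. The script item.php, its parameters cat and id, and the /items/ path are assumptions made up for this example, and an .htaccess file in the document root is assumed:

    ```apache
    # Sketch only: item.php, cat, id and /items/ are hypothetical names.
    RewriteEngine On

    # Optional step: externally redirect direct client requests for the old
    # dynamic URL. THE_REQUEST holds the original request line sent by the
    # client, so a request that reached item.php through the internal rewrite
    # below does not match here and cannot loop.
    RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /item\.php\?cat=([^&\ ]+)&id=([^&\ ]+)\ HTTP/
    RewriteRule ^item\.php$ http://example.com/items/%1-%2.html? [R=301,L]

    # Internal rewrite: map the friendly URL back onto the script.
    RewriteRule ^items/([^-/]+)-([^-/]+)\.html$ /item.php?cat=$1&id=$2 [L]
    ```

    The first step (changing the links that the script emits) happens in the script itself and is not shown.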

    Another related concept is that rewriting breaks the seemingly-direct relationship between a URL on the Web and the filepath on the server used to serve content for that URL. They used to be closely-related to each other, but with a rewrite, they are now completely different.

    This illustrates that a URL and a filepath are not the same thing; They are only "associated" things. mod_rewrite tells the server to associate a filepath with a URL in a new, non-default way. The basic job of a Web server is to translate a URL on the Web into a filepath in the filesystem of that server, no matter what operating system or filesystem is in use.

    This is why we call URLs and filenames by different names. And HTTP URLs were designed so that this would be possible; Otherwise, for example, you'd have to ask for something like "WebmasterWorld.com/C:\\Documents and Settings\public\web-sites\WebmasterWorld\apache\3749070.htm" to get to this thread if this site was hosted on a Windows XP-based server, and ask for a differently-formatted filepath if it were hosted on an HP-UX-based server. Finding pages on the Web would be terribly "messy," and it would be impossible to move from one kind of server to another.

    If your problem is with the 12 name/value pairs instead of the ten originally discussed, then we need to know more about your URL-set. Do you have both ten- and twelve-parameter URLs? How about more... or fewer? Are the name/value pairs in the old dynamic URLs always in the same order, or can they vary?

    Jim

    dqiria

    5:11 pm on Sep 22, 2008 (gmt 0)

    10+ Year Member



    Yes, I know about rewriting and redirecting; I have been using them in the simple search without problems. But since there are more than 9 variables here, I can't get it to work. Please, if you can, write the full code for the current case. I'd be grateful :)

    g1smd

    6:37 pm on Sep 22, 2008 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



    See the example code where you grab the first variable and store it.

    Change it so it grabs the first three variables instead.

    dqiria

    11:34 am on Sep 23, 2008 (gmt 0)

    10+ Year Member



    I couldn't figure it out...
    I'm doing the internal rewrite with this:

    RewriteRule ^advanced_search/([^-]*)-([^-]*)-price-([^-]*)-([^-]*)-year-([^-]*)-([^-]*)-garbeni-([^-]*)-lot-([^-]*)-salon-([^-]*)-rate-([^-]*)-color-([^-]*)\.html$ /index.php?category=$1&make=$2&min=$3&max=$4&min_year=$5&max_year=$6&garbeni=$7&lot=$8&s_cond=$9&out_rate=$10&color=$11&search_adv=search [L]

    and I also couldn't figure out how to save multiple variables in a RewriteCond...
    Please help...

    g1smd

    12:03 pm on Sep 23, 2008 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



    I am guessing at something like this for the two lines that need to change:

    # Grab the first three variables, save them in user variable "SavedVar1" and chain to next rule

    RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.php\?category=([^&]+)&([^&]+)&([^&]+)&[^\ ]+\ HTTP/
    RewriteRule ^index\.php$ - [E=SavedVar1:%1-%2-%3,C]

    Compare that to the code in the second post of this thread.

    [edited by: g1smd at 12:13 pm (utc) on Sep. 23, 2008]

    dqiria

    12:09 pm on Sep 23, 2008 (gmt 0)

    10+ Year Member



    I don't think so. I guess SavedVar2 should be declared as well, but I don't get the syntax...

    jdMorgan

    1:38 pm on Sep 23, 2008 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member



    You'll need to add the parameter names to that modified RewriteCond pattern, but otherwise it looks good.

    You can either define SavedVar2 in addition to SavedVar1, or you can put all of the parameters into SavedVar1 as shown -- Either way will work, but the method shown is more efficient.

    dqiria, If you "don't get the syntax" then review the mod_rewrite documentation, experiment, and get familiar with it. It is our purpose here to help you learn, not to write your code for you. In general, our effort will match yours.

    Jim

    dqiria

    1:47 pm on Sep 23, 2008 (gmt 0)

    10+ Year Member



    Yeah, I got the first one, but I'm still stuck on the internal rewrite. I tried to do it myself, but I always get a 404 error.
    I don't know how to search for this particular problem (I mean, what to call it). If you can, please give me a documentation link about this...
    Thanks

    jdMorgan

    1:54 pm on Sep 23, 2008 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member



    Links to the Apache mod_rewrite documentation and a regular-expressions tutorial are on our Forum Charter [webmasterworld.com] page.

    Jim

    dqiria

    2:26 pm on Sep 23, 2008 (gmt 0)

    10+ Year Member



    Still don't get it :-(

    jdMorgan

    1:38 am on Sep 24, 2008 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member



    It appears that you are dropping "out_rate=all" and "search_adv=search". That is, these two name-value pairs are not used in the redirected URL.

    So, something like this:


    # Grab the first two query variables, save them in user variables, and chain to next rule
    RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.php\?category=([^&]+)&make=([^&]+)&[^\ ]+\ HTTP/
    RewriteRule ^index\.php$ - [E=SavCat:%1,E=SavMake:%2,C]
    #
    # If chained, grab the other query variables and then redirect using the var saved above
    RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.php\?category=[^&]+&make=[^&]+&min=([^&]+)&max=([^&]+)&min_year=([^&]+)&max_year=([^&]+)&garbeni=([^&]+)&lot=([^&]+)&s_cond=([^&]+)&out_rate=all&color=([^&]+)&search_adv=search\ HTTP/
    RewriteRule ^index\.php$ http://example.com/advanced_search/%{ENV:SavCat}-%{ENV:SavMake}-price-%1-%2-year-%3-%4-garbeni-%5-lot-%6-salon-%7-rate-all-color-%8.html? [R=301,L]

    I assumed that "s_cond=all" becomes "salon=all" because your example was inconsistent.

    Jim

    dqiria

    12:13 pm on Sep 25, 2008 (gmt 0)

    10+ Year Member



    No, no, I'm fine with the external redirect. I just couldn't do it internally...
    Thanks for your help

    g1smd

    12:58 pm on Sep 25, 2008 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



    I wasn't thinking straight when I posted:

    RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.php\?category=([^&]+)&([^&]+)&([^&]+)&[^\ ]+\ HTTP/ 
    RewriteRule ^index\.php$ - [E=SavedVar1:%1-%2-%3,C]

    What I should have said was this:

    RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.php\?category=([^&]+)&[b]make=[/b]([^&]+)&[b]min=[/b]([^&]+)&[^\ ]+\ HTTP/ 
    RewriteRule ^index\.php$ - [E=SavedVar1:%1-%2-%3,C]

    In any case, I see you cracked that part already.

    dqiria

    12:20 pm on Sep 26, 2008 (gmt 0)

    10+ Year Member



    g1smd
    Yes, I did that part before. Thank you.
    The only problem left is internal rewrite...

    jdMorgan

    12:58 pm on Sep 26, 2008 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member



    > The only problem left is internal rewrite...

    Please ask a specific, complete, focused question.

    Jim

    dqiria

    1:07 pm on Sep 26, 2008 (gmt 0)

    10+ Year Member



    As I said before, this is the code that does the internal rewrite:
    RewriteRule ^advanced_search/([^-]*)-([^-]*)-price-([^-]*)-([^-]*)-year-([^-]*)-([^-]*)-garbeni-([^-]*)-lot-([^-]*)-salon-([^-]*)-rate-([^-]*)-color-([^-]*)\.html$ /index.php?category=$1&make=$2&min=$3&max=$4&min_year=$5&max_year=$6&garbeni=$7&lot=$8&s_cond=$9&out_rate=$10&color=$11&search_adv=search [L]

    and I was not able to save variables and use them in this code...

    g1smd

    2:08 pm on Sep 26, 2008 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



    Using cut and paste on your rewrite example, and the corrected redirect example, I get something like this for the rewrite:


    # Pick off first 4 variables and store them (matching up to -year- in pattern):
    RewriteRule ^advanced_search/([^-]*)-([^-]*)-price-([^-]*)-([^-]*)-year - [E=SavedVar1:category=$1&make=$2&min=$3&max=$4,C]

    # Add the first 4 to the rest and rewrite (using -year- and after):
    RewriteRule ^advanced_search/[b]([^-]*)*-year-[/b]([^-]*)-([^-]*)-garbeni-([^-]*)-lot-([^-]*)-salon-([^-]*)-rate-([^-]*)-color-([^-]*)\.html$ /index.php?%{ENV:SavedVar1}&min_year=$2&max_year=$3&garbeni=$4&lot=$5&s_cond=[b]$6[/b]&out_rate=$7&color=$8&search_adv=search [L]

    It's guesswork, and probably not quite right. I don't know where the $6 value in bold is supposed to come from, for example.

    I am also not sure if this is the right pattern to match multiple hyphens up to -year- too: ([^-]*)*-year-

    I might be more tempted by:


    # Pick off first 4 variables and store them (matching up to -year- in pattern):
    RewriteRule ^advanced_search/([^-]*)-([^-]*)-price-([^-]*)-([^-]*)-year - [E=SavedVar1:category=$1&make=$2&min=$3&max=$4,C]

    # Add the first 4 to the rest and rewrite (using -year- and after):
    RewriteRule ^advanced_search/[b](([^-]+-)*)-year-[/b]([^-]*)-([^-]*)-garbeni-([^-]*)-lot-([^-]*)-salon-([^-]*)-rate-([^-]*)-color-([^-]*)\.html$ /index.php?%{ENV:SavedVar1}&min_year=$3&max_year=$4&garbeni=$5&lot=$6&s_cond=[b]$7[/b]&out_rate=$8&color=$9&search_adv=search [L]

    Note that the backreference numbering has changed in this second example.

    I am not sure but I think that one hyphen might also need to be deleted from the rule, specifically the one just before "year", such that:

    (([^-]+-)*)-year-
    becomes
    (([^-]+-)*)year-
    instead.
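
    A worked trace may help with that hyphen question. This is only a sketch on a shortened, hypothetical path with two trailing fields, not the full rule from this thread. Each repetition of ([^-]+-) consumes a field together with its trailing hyphen, so the text that follows the group starts at "year-", not at "-year-":

    ```apache
    # Sketch only, assuming a shortened hypothetical path such as
    #   advanced_search/honda-all-price-0-500000-year-0-2009.html
    # (([^-]+-)*) consumes "honda-", "all-", "price-", "0-", "500000-",
    # each one including its trailing hyphen. The next literal therefore has
    # to be "year-". Writing "-year-" would demand a doubled hyphen
    # ("500000--year-"), which this path does not contain.
    # $1 is the whole consumed prefix, $2 the last repetition, so the
    # fields after "year-" start at $3.
    RewriteRule ^advanced_search/(([^-]+-)*)year-([^-]*)-([^-]*)\.html$ /index.php?min_year=$3&max_year=$4 [L]
    ```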

    dqiria

    4:30 pm on Sep 26, 2008 (gmt 0)

    10+ Year Member



    At last I got it working (well, a few things had to be modified, but still).

    Thank you jdMorgan and g1smd

    g1smd

    4:32 pm on Sep 26, 2008 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



    Post the code that finally worked, as I was a little unsure about some of the finer detail, mainly the stuff that I highlighted in bold.

    dqiria

    6:12 pm on Sep 26, 2008 (gmt 0)

    10+ Year Member



    Here's the code:
    # External redirect: old dynamic URL (direct client requests only) -> friendly URL
    RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.php\?category=([^&]+)&make=([^&]+)&min=([^&]+)&[^\ ]+\ HTTP/ 
    RewriteRule ^index\.php$ - [E=themake:%1,E=themodel:%2,E=themin:%3,C]
    RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.php\?category=[^&]+&make=[^&]+&min=[^&]+&max=([^&]+)&min_year=([^&]+)&max_year=([^&]+)&garbeni=([^&]+)&lot=([^&]+)&s_cond=([^&]+)&out_rate=([^&]+)&color=([^&]+)&search_adv=%E1%83%AB%E1%83%94%E1%83%91%E1%83%9C%E1%83%90\ HTTP/
    RewriteRule ^index\.php$ http://example.com/advanced_search/%{ENV:themake}-%{ENV:themodel}-price-%{ENV:themin}-%1-year-%2-%3-garbeni-%4-lot-%5-salon-%6-rate-%7-color-%8.html? [R=301,L]
    # Internal rewrite: friendly URL -> index.php with the query string rebuilt
    RewriteRule ^advanced_search/([^-]*)-([^-]*)-price-([^-]*) - [E=themake:$1,E=themodel:$2,E=themin:$3,C]
    RewriteRule ^advanced_search/[^-]*-[^-]*-price-[^-]*-([^-]*)-year-([^-]*)-([^-]*)-garbeni-([^-]*)-lot-([^-]*)-salon-([^-]*)-rate-([^-]*)-color-([^-]*)\.html$ /index.php?category=%{ENV:themake}&make=%{ENV:themodel}&min=%{ENV:themin}&max=$1&min_year=$2&max_year=$3&garbeni=$4&lot=$5&s_cond=$6&out_rate=$7&color=$8&search_adv=search [L]

    g1smd

    6:44 pm on Sep 26, 2008 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



    Are you sure that
    [^-]*
    shouldn't be
    [^-]+
    for at least some of those?

    You do need at least one non-hyphen character to match.

    Just thought about that now.
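
    To show the difference, here is a minimal sketch on a hypothetical two-field URL. The /pairs/ path, pairs.php, and the parameters a and b are made up for this example. With "*", a request such as /pairs/-.html would match and reach the script with two empty values; with "+", each field must contain at least one non-hyphen character:

    ```apache
    # Sketch only: /pairs/, pairs.php, a and b are hypothetical names.
    # [^-]+ requires at least one character per field, so /pairs/-.html
    # does not match this rule and is left unrewritten.
    RewriteRule ^pairs/([^-]+)-([^-]+)\.html$ /pairs.php?a=$1&b=$2 [L]
    ```

    Whether empty fields should be accepted depends on whether the search form can submit blank values, which this thread does not settle.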