Forum Moderators: phranque


Mod rewrite methods (partial urls and other)

         

darkyl

2:31 am on Nov 28, 2007 (gmt 0)

10+ Year Member



Hello,

I manage a site with URLs that look like the following (ugly, I know):
www.example.com/cnt_sezioni.php~section~~~citta~~id_sezione~~~89~~ristoranti.html
www.example.com/cnt_sezioni.php~section~~~citta~~id_sezione~~~89~~id_sottosezione~~~395.html

In the first example the word "ristoranti" could be kept for the rewrite; in the second there's no usable word (the parameters mean city, section, and subsection).

I've thought of two possible ways to rewrite them to (semi-)friendly URLs.

1. A different rewrite rule for each section (section 89 in this case would become /milano/ristoranti_milano), to make them look like:

1. www.example.com/milano/ristoranti_milano/ristoranti.html
2. www.example.com/milano/ristoranti_milano/id_sottosezione~~~395.html (not perfect but better than nothing)
Is it possible?

The second option I have is to rewrite all the links one by one by hand.

In both cases, is there anything else I should do? (For example, to avoid duplicate content from having both a static and a dynamic URL for the same page.)

Please post the .htaccess code (if you have spare time), as I am no server expert.

Thanks

jdMorgan

2:43 am on Nov 28, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You cannot use mod_rewrite to create search-friendly URLs. The links published on your pages define the URLs.

So, you must modify your script(s) to 'print' friendly-URL links on your pages.

After that is done, you then use mod_rewrite to internally rewrite those friendly URLs, when requested from your server, back to the path required to invoke your script(s).

A final optional step is to externally redirect unfriendly URLs back to the new friendly URLs.

Note the two terms "external redirect" and "internal rewrite."
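As an illustration only (with placeholder filenames, not this site's real URLs), the two-step pattern looks something like this:

```apache
# 1. Internal rewrite: friendly URL -> real unfriendly path
#    (the address bar does not change)
RewriteRule ^friendly-page\.html$ /unfriendly-page.html [L]
#
# 2. External redirect: direct client requests for the unfriendly URL
#    are sent (301) to the friendly URL; the RewriteCond on THE_REQUEST
#    keeps step 1's internal rewrite from looping back into this rule
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /unfriendly-page\.html\ HTTP/
RewriteRule ^unfriendly-page\.html$ http://www.example.com/friendly-page.html [R=301,L]
```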

This process is described in detail in this thread [webmasterworld.com].

Feel free to post specific questions back to this thread.

Jim

darkyl

11:17 am on Nov 28, 2007 (gmt 0)

10+ Year Member



Thanks for the answer, jdMorgan.

You're right, I wasn't clear. I didn't mean that I wanted to create links; I can modify the links published on my pages manually.

My need is to do what you described in your post, but I don't know how to do it.

Through .htaccess I can rewrite the friendly URLs back to the original path, but only by rewriting the links one by one.

For example to rewrite:

www.example.com/firenze/hotel_firenze/albergo_firenze.html
to
www.example.com/cnt_sezioni.php~section~~~citta~~id_sezione~~~77.html

I've added in .htaccess:

RewriteRule ^firenze/hotel_firenze/albergo_firenze\.html$ /cnt_sezioni.php~section~~~citta~~id_sezione~~~77.html [L]

and it works, but I'm not sure it's optimal. Should I add a 301?

Then I would need to externally redirect the unfriendly URLs back to the new friendly URLs, but even after reading the thread you linked, I fail to understand how.

Can you help me with the example above?

I understand my method requires rewriting every URL one by one, but I don't have access to the script that prints the URLs.

jdMorgan

1:34 pm on Nov 28, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The two rules you'll need are:

# Internally rewrite friendly URL requests to script filepath
RewriteRule ^firenze/hotel_firenze/albergo_firenze\.html$ /cnt_sezioni.php~section~~~citta~~id_sezione~~~77.html [L]
#
# Externally redirect client requests for unfriendly URL to friendly URL
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /cnt_sezioni\.php~section~~~citta~~id_sezione~~~77\.html\ HTTP/
RewriteRule ^cnt_sezioni\.php~section~~~citta~~id_sezione~~~77\.html$ http://www.example.com/firenze/hotel_firenze/albergo_firenze.html [R=301,L]

There is some apparent redundancy in the second rule. This is required in order to prevent a rewrite/redirect loop: the second rule will redirect *only* if a client (browser or robot) requests an unfriendly URL. It will not redirect if the URL is unfriendly because the first rule has just rewritten it.

The reasons you end up with one rule-set per URL are twofold:
First, the translation from one URL form to the other is not a simple "word rearrangement" or fixed character substitution; rather, it requires an "association" of one kind of data with another -- I'm guessing here, but "77" seems to be associated with firenze, hotel, and albergo_firenze.

The second reason is that in order to do all of the required associative lookups using mod_rewrite, you'd need to write a small Perl script to access your database and do the lookup. Alternatively, you could use a text-based fixed translation table. Both solutions are possible, but both require access to the server configuration in httpd.conf or conf.d in order to define a RewriteMap.
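A rough sketch of the text-map variant, assuming you can edit httpd.conf (the map file path and its entries here are invented for illustration; RewriteMap is not allowed in .htaccess):

```apache
# In httpd.conf: declare a plain-text lookup table keyed by section id
RewriteMap sections txt:/etc/apache2/section-map.txt

# /etc/apache2/section-map.txt pairs each id with its friendly path, e.g.:
#   77 firenze/hotel_firenze/albergo_firenze
#   89 milano/ristoranti_milano

# One generic rule then redirects every unfriendly section URL,
# substituting the looked-up friendly path for the captured id
RewriteRule ^cnt_sezioni\.php~section~~~citta~~id_sezione~~~([0-9]+)\.html$ http://www.example.com/${sections:$1}.html [R=301,L]
```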

Another method is to rewrite *all* requests for these URLs to a single PHP script that looks up the desired content and then 'includes' it and sends it to the client.
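That catch-all approach might look like the following (the dispatcher name router.php and the URL pattern are hypothetical):

```apache
# Don't touch requests for files that actually exist on disk
RewriteCond %{REQUEST_FILENAME} !-f
# Send every friendly-looking section URL to one dispatcher script,
# passing the requested path so the script can look up and include the content
RewriteRule ^([a-z_]+)/([a-z_]+)/([a-z_0-9]+)\.html$ /router.php?path=$1/$2/$3 [L]
```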

Jim

darkyl

11:53 pm on Nov 28, 2007 (gmt 0)

10+ Year Member



thanks jdMorgan,

You hit the heart of the problem: the URL syntax. A single number is associated with several parameters, so I have to do it one by one.

You said

"There is some apparent redundancy in the second rule. This is required in order to prevent a rewrite/redirect loop: the second rule will redirect *only* if a client (browser or robot) requests an unfriendly URL. It will not redirect if the URL is unfriendly because the first rule has just rewritten it."

If I got it right, the second rule stops the redirect if the URL has just been rewritten, so that it doesn't go into a loop.
So, if I type www.example.com/cnt_sezioni.php~section~~~citta~~id_sezione~~~77.html in my browser, shouldn't I see the URL actually change to www.example.com/firenze/hotel_firenze/albergo_firenze.html in the address bar?

That is not happening right now with the rules you gave me, so I'm confused...

Anyway, thanks a lot; you're helping me greatly.

jdMorgan

12:05 am on Nov 29, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The RewriteCond in the second rule (the external redirect) prevents the second rule from being applied if the URL has been rewritten (to the unfriendly form) by the first rule. The RewriteCond requires that the request for the unfriendly URL come directly from a client (browser or robot) and not be the result of an internal rewrite (such as the first rule). This is done by examining the client HTTP request header, which remains unchanged during internal rewrites.
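To make that concrete, here is a sketch of what THE_REQUEST contains (the example request line is illustrative):

```apache
# THE_REQUEST holds the raw first line of the client's HTTP request, e.g.:
#   GET /cnt_sezioni.php~section~~~citta~~id_sezione~~~77.html HTTP/1.1
# Internal rewrites change the URL being processed, but never this string,
# so a RewriteCond on %{THE_REQUEST} matches only what the client
# actually asked for, not the result of an earlier rewrite.
```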

Jim

g1smd

1:56 am on Nov 30, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



There is both a rewrite and a redirect here.

You need to understand what both those terms mean.

They are related, but quite different.

A rewrite translates a URL request into a server filepath to get the content from (without exposing what that path actually is).

A redirect takes a URL request and tells the browser that it needs to request some different URL, which it then does in a new HTTP request.
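In .htaccess terms, the difference comes down to the flags and the target (placeholder paths, for illustration only):

```apache
# Rewrite: serve /real-file.html when /page.html is requested;
# the browser's address bar still shows /page.html
RewriteRule ^page\.html$ /real-file.html [L]

# Redirect: answer with a 301 status telling the browser to make
# a brand-new request for /page.html; the address bar changes
RewriteRule ^old-page\.html$ http://www.example.com/page.html [R=301,L]
```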