Help with RewriteRule

Override a URL containing language

         

luismartin74

3:54 pm on Nov 18, 2011 (gmt 0)

10+ Year Member



I need to manage languages in a website, and this is the structure of the URL:

www.mysite.com/en/displaylist
www.mysite.com/fr/displaylist

I want to rewrite the URL so that the language is no longer in the URL path and is sent as a GET variable instead, like this:

www.mysite.com/displaylist?lang=en

This is the RewriteRule I made, but it doesn't work:

RewriteRule ^(es|en|fr|de|it|ru)/(.*) $2?lang=$1 [L]

Any help appreciated.

wilderness

6:06 pm on Nov 18, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



There are some relatively recent threads that are variations on what you want.

Try a search on "country 2011".

lucy24

2:52 am on Nov 19, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



What do you mean by "doesn't work"? Nothing happens at all, or something unintended happens, or the server goes haywire, or something else entirely?

The Rule certainly looks as if it should work. At least, nothing is jumping up and hitting me in the face. That's assuming the original URLs either have no query string, or you're throwing it away.

luismartin74

2:53 pm on Nov 20, 2011 (gmt 0)

10+ Year Member



Hello,

I get a 404 error message, not from Apache but from my own application.

g1smd

4:33 pm on Nov 20, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



With URLs like
www.mysite.com/en/displaylist
you're already using URL rewriting.

You need to look at the existing rewrite rules to deduce the real server path, filename and parameters.

You then need to adjust those existing rules to use the new URL format you require. You also need to add new rules that redirect the old URLs to the new URLs; otherwise you will have duplicate content problems.

Finally, and most importantly, you need to alter the links on the pages to point to the new URLs. URLs are defined in links. They are not made by redirects or rewrites.

luismartin74

6:08 pm on Nov 20, 2011 (gmt 0)

10+ Year Member



Will this be interpreted as duplicate content? AFAIK Google associates the displayed content with the originally requested URL, before any internal Apache rewrite.

I just need to do this:
From
www.mysite.com/en/displaylist
to
www.mysite.com/displaylist?lang=en

I'm using CodeIgniter, which will internally parse this final URL into something like the following (the variable names might be different; this is just to illustrate):
www.mysite.com?controller=displaylist&method=index&lang=en

lucy24

7:13 pm on Nov 20, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Neither search engines nor human visitors know that a URL has been rewritten. That's fundamentally what a rewrite means. You could take every single request and rewrite it with

RewriteRule ({blahblah}) /index.php?$1 [L]

and nobody would ever know.
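
To make that catch-all idea concrete, here is a minimal sketch. It assumes the rules live in an .htaccess file in the document root and that the front controller is a file named index.php in that same directory; both are assumptions for illustration, not details taken from this thread.

```apache
RewriteEngine On

# Leave requests for real files and directories alone
# (stylesheets, images, scripts, and index.php itself).
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d

# Hand everything else to the (assumed) front controller,
# passing the requested path along as the query string.
RewriteRule ^(.*)$ /index.php?$1 [L]
```

The two conditions matter: without them the rule would also capture requests for static assets, and it would match its own output when the rewritten URL is processed again.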

I get a 404 error message, not from Apache but from my own application.

Uhm, from what "own application"? 404s are sent by the server. They mean "I couldn't find the page." If there has been a redirect, the 404 refers to the redirected address; the server doesn't care whether the original form even exists. If there has been a rewrite, the 404 refers to the rewritten address. This means-- unnerving but true-- that the end user is now involved with three locations:

--the URL visible in their address bar, which may or may not correspond to a real location
--the physical location you're trying to serve content from
--the physical location of the 404 page

Duplicate content is when more than one visible URL serves the same content. Technically that includes things like parallel with-and-without www domain names, and the official form {blahblah}/ vs. the full name {blahblah}/index.html. But g### is just barely bright enough to disregard those.

Redirecting many different requests to the same URL is not Duplicate Content. Not necessarily a good idea, but not Duplicate Content. Rewriting many different requests to the same physical location (with the same parameters) is Duplicate Content.
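
The difference between the two comes down to one flag. The sketch below assumes an .htaccess file in the document root; old-page and new-page are made-up names used only for illustration.

```apache
RewriteEngine On

# External redirect: the server answers with a 301, the browser makes
# a second request, and the address bar changes to /new-page.
RewriteRule ^old-page$ /new-page [R=301,L]

# Internal rewrite: no response is sent back at this point. The address
# bar keeps /en/displaylist while the server works on displaylist?lang=en.
RewriteRule ^(es|en|fr|de|it|ru)/(.*)$ $2?lang=$1 [L]
```

With the first rule only one visible URL is left serving the content; with the second, every visible URL that maps to the same target still exists as far as a visitor or crawler can tell.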

Every time I see this thread title I think "Language" in the old-fashioned sense, as if you needed to rewrite users' cussing.

luismartin74

9:55 pm on Nov 20, 2011 (gmt 0)

10+ Year Member



Uhm, from what "own application"? 404s are sent by the server.


Doh! Yes, that's true. What a fool I was to say that. I was actually getting the application's 404 error page, because Apache didn't find the requested document.

Anyway I'm still stuck with this.

Will this rule really rewrite something like
www.mysite.com/en/displaylist
to
www.mysite.com/displaylist?lang=en ?

lucy24

11:07 pm on Nov 20, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



:: going back to your original rule ::

RewriteRule ^(es|en|fr|de|it|ru)/(.*) $2?lang=$1 [L] 


This translates as: anything in the form

www.example.com/(en|es|etc.)/(blahblah)

will be rewritten as

www.example.com/(blahblah)?lang=(en|es|whatever)

deleting the previous query, if any. But your made-up example

?controller=displaylist&method=index&lang=en

implies that there may be other stuff in the query string. Do you need to keep this? If so, add QSA to the flags at the end of the rule, so that [L] becomes [QSA,L]. It stands for "Query String Append", meaning that your new query (the part about language) is combined with the existing query, if any, instead of replacing it.
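
As a sketch, this is the original rule with QSA added, assuming it sits in an .htaccess file in the document root. The page=2 parameter in the comment is a made-up example, not something from this thread.

```apache
RewriteEngine On

# Without QSA:  /en/displaylist?page=2  ->  displaylist?lang=en
# With QSA:     /en/displaylist?page=2  ->  displaylist?lang=en&page=2
RewriteRule ^(es|en|fr|de|it|ru)/(.*)$ $2?lang=$1 [QSA,L]
```

The new lang parameter comes first and the original query string is appended after it.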

luismartin74

8:14 am on Nov 23, 2011 (gmt 0)

10+ Year Member



Thanks Lucy, I didn't know that flag.

It doesn't work either, but I think I know what might be happening:

I'm using PHP on CodeIgniter. This has nothing to do with Apache, but I will tell you just in case you want to know. CodeIgniter URLs work this way:

www.domain.com[/subdirectory]/controller/method/parameter1/parameter2 ...

This is the default config (it can be remapped dynamically). If I add a query string, I think the CI router gets confused by the relative links inside the HTML page (links to scripts, CSS, images, etc.) and treats the controller as a directory in those cases (though not for the HTML document itself).

I found this out by manually typing the final URL. The content was displayed without styles, images or scripts. The relative paths resolve to the same address plus a nonexistent base directory (which corresponds to the controller) added to the document root.

I will have to do some research on CodeIgniter.

Cheers!
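
One possible server-side workaround for the broken relative asset paths, offered only as a sketch: it assumes the assets live in top-level directories named css, js and images (hypothetical names, not confirmed anywhere in this thread) and that the rules sit in an .htaccess file in the document root, ahead of any catch-all rule.

```apache
RewriteEngine On

# Only act when nothing exists at the requested path.
RewriteCond %{REQUEST_FILENAME} !-f

# Strip the bogus leading segment(s) a browser adds when it resolves a
# relative link: /displaylist/css/site.css  ->  /css/site.css
RewriteRule ^(?:[^/]+/)+((?:css|js|images)/.+)$ /$1 [L]
```

Changing the links in the HTML to root-relative paths (starting with a slash) avoids the problem at its source and is probably the cleaner fix; the rule above only papers over it.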