

Apache Web Server Forum

    
Help with URL rewrite
Sub_Seven




msg:4356187
 6:07 am on Aug 29, 2011 (gmt 0)

I would like to start by saying I'm sorry for asking this question. I've read several threads about the same topic right here in this forum and I should probably already have a small grasp of the topic, but I just don't.

I need help since all I've ever used .htaccess for is regular 301s, making sure the URL can be accessed without the ".php" extension, and redirecting non-www to www or vice versa, nothing else.

I have a website under construction on a test subdomain that will eventually go live on its own domain. The site has a lot of pages generated dynamically with PHP and I'm looking for a way to get rid of the "?n=" in the URL (I'm afraid that, SEO-wise, search engines won't like that).

Ok (I tend to speak (write) a lot)

This is my goal:

Make this type of URLs (planning ahead for when I lose the subdomain)

https://subdomain.example.com/product?n=test
https://example.com/product?n=test

Look like this

https://subdomain.example.com/product/test
https://example.com/product/test

I did read the Apache Module mod_rewrite documentation at [httpd.apache.org...] but maybe I just don't speak apache :(

Could someone give me an example of how I could achieve this please?

Again, sorry if I'm the 10th person this week to ask this question, but my OCD won't let me be till I can get this done.

Thanks in advance for any help provided.

 

lucy24




msg:4356196
 7:21 am on Aug 29, 2011 (gmt 0)

Hm, it's late and I gotta get up tomorrow and the people who speak Apache* are just heading off for their day jobs in a different hemisphere, so let's see if we can fob you off with the boilerplate for a few hours ;)

Oops, I guess I need to add a line about what a "query string" is. It's the part after the question mark. Sometimes known as the parameter.
By default, rewrites simply ignore the query string. That is, mod_rewrite stashes the query in a safe place, does its stuff to the part before the question mark, and then reappends the original query.

Changing a Query

#1 To delete a query, add a ? to the end of your rewrite target.
#2 To replace a query--or create a new one--add ?blahblah to the rewrite target. The blahblah can be either literal text, or stuff you captured earlier. (#1 and #2 are really the same thing: you're just replacing the query with either something or nothing.)
#3 To add to an existing query, again put ?blahblah at the end of the target, but also include [QSA] in the bracketed material at the end of the Rule. It stands for "Query String Append", meaning that the blahblah is to be added to the existing query--if any--instead of replacing it.
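
A minimal sketch of those three cases as they might appear in a .htaccess file. The paths and parameter names (old-page, item.php, list.php, id, cat) are made up for illustration and are not from the thread.

```apache
RewriteEngine On

# 1. Delete the query: a bare "?" at the end of the target drops it.
RewriteRule ^old-page$ /new-page? [L]

# 2. Replace the query (or create one): the text after "?" becomes the new query.
RewriteRule ^item/(\w+)$ /item.php?id=$1 [L]

# 3. Add to the existing query: [QSA] appends the original query to the new one.
RewriteRule ^list/(\w+)$ /list.php?cat=$1 [QSA,L]
```

With the third rule, a request for /list/books?page=2 would be served from /list.php?cat=books&page=2.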

Getting the Query

You only need to retrieve the original query if
#1 you want the rewrite to behave differently depending on what the query was
or
#2 you need to change or delete the query

Add a Condition that says

RewriteCond %{QUERY_STRING} blahblah

using your ordinary Regular Expressions, anchors and ! as needed.

To test whether there was a query at all

RewriteCond %{QUERY_STRING} .

which simply means "If the query contains at least one character of any kind".

If you need to capture part of the query, use parentheses as usual. In the rewrite target, the capture will be %1, %2 etc instead of $1, $2 etc, because they are coming from a Condition instead of the Rule. Each set is separately numbered, so the first capture in the Rule will still be $1.
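
As a rough illustration of testing and capturing the query, here is a sketch using hypothetical paths and parameter names (search.php, docs.php, lang, page, language). It only shows where %1 and $1 come from.

```apache
RewriteEngine On

# Only act when the query string contains at least one character.
RewriteCond %{QUERY_STRING} .
RewriteRule ^search$ /search.php [L]

# %1 is the capture from the Condition; $1 is the capture from the Rule.
RewriteCond %{QUERY_STRING} ^lang=(\w+)$
RewriteRule ^docs/(\w+)$ /docs.php?page=$1&language=%1 [L]
```

Under these assumptions, /docs/intro?lang=es would be served from /docs.php?page=intro&language=es.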


* I don't speak Apache either-- nor yet any other Athabaskan language. My only justification for being here is that I do speak pretty good RegEx.

g1smd




msg:4356365
 5:44 pm on Aug 29, 2011 (gmt 0)

You'll need a rewrite to accept incoming URL requests for
https://subdomain.example.com/product/test and rewrite that request to fetch the content from the internal /product.php?n=test location inside the server.

To stop people asking for URLs like
https://subdomain.example.com/product.php?n=test and being served Duplicate Content, you install a redirect to bounce people over to the correct URL.

Both the redirect and the rewrite each use a RewriteRule.
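
One possible shape for that redirect-plus-rewrite pair in .htaccess, offered as a sketch rather than a tested answer. It assumes the script is /product.php, the parameter is n, the file sits in the site root, and nothing else (such as MultiViews) is already handling /product.

```apache
RewriteEngine On

# Redirect: a client asking directly for /product?n=test or /product.php?n=test
# is bounced to /product/test. THE_REQUEST holds the original request line, so
# this does not fire again after the internal rewrite below.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/product(?:\.php)?\?n=([^&\s]+)
RewriteRule ^product(?:\.php)?$ /product/%1? [R=301,L]

# Rewrite: a request for /product/test is served internally from /product.php?n=test.
RewriteRule ^product/([^/]+)/?$ /product.php?n=$1 [L]
```

Because the redirect target has no hostname, Apache builds the new URL from the host that was requested, so the same lines should work on the test subdomain and on the final domain.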


A similar question was asked yesterday and here's the answer to that: [webmasterworld.com...]

There are many very detailed threads in this forum, as this type of question is asked several times per week, sometimes several times per day.

Sub_Seven




msg:4356989
 5:25 am on Aug 31, 2011 (gmt 0)

Thanks lucy24 and g1smd for taking the time to respond to my question.

* I don't speak Apache either-- nor yet any other Athabaskan language.


There has to be room for humor, although sometimes I think we are slowly changing the way our brains process things and we could be becoming robots.

My only justification for being here is that I do speak pretty good RegEx.


Since I haven't had the time to really learn the rules and all things related, nor have I been forced to learn RegEx by some project or necessity, I have to be honest: RegEx and I do not get along yet. OH wait, that's why I need help here, lack of skills in expressions that are regular and lack of knowledge of native languages lol...

@g1smd

Everything you said is in English. I know you won't do it (I did read that thread that was a pixel away from becoming a physical fight), but if you could write that down in a way that would make sense inside a .htaccess file it would be greatly appreciated.

Now, don't think I haven't tried. I kept reading and I am either stupid or ignorant in the subject matter; let me go for the latter. There was a time when I didn't know PHP at all, and thanks to the great people in this forum and another forum and a lot of persistence, now I know my way around the topic.

This is the closest I got:

#RewriteCond %{QUERY_STRING} ^(\w+)=(\w+)$
#RewriteRule ^/product /%1/%2? [QSA]

The problem: it removed the parameter and I would like to keep the parameter.

Do I understand what those two lines mean? NO!

If you could please help me just a little more to get that last part running, dinner, lunch or breakfast (your choice) is on me next time you visit Costa Rica :)

lucy24




msg:4356995
 5:54 am on Aug 31, 2011 (gmt 0)

Go back and read the pasted-in boilerplate. It will tell you where your parameter (alias Query String) went :)

The two lines mean:
If the query string is in the exact form "one or more alphanumerics* = one or more alphanumerics"
then take any request that begins with "product" and change it to /{left side of query}/{right side of query} ... and I refer you again to the boilerplate for what [QSA] means. (Hint: It doesn't belong.)
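
The same reading, written out as comments on the two lines. This is an annotated sketch, not a recommended rule. It is adjusted on the assumption that the lines live in a .htaccess file, where the Rule pattern is matched against the path without its leading slash, so "^/product" would never match there.

```apache
# If the whole query string has the form word=word, capture the
# left side as %1 and the right side as %2.
RewriteCond %{QUERY_STRING} ^(\w+)=(\w+)$

# In .htaccess the path is tested without its leading slash, hence "^product".
# The trailing "?" on the target discards the original query string.
RewriteRule ^product /%1/%2? [L]
```

Under those assumptions, /product?n=test would be rewritten internally to /n/test, which is probably not the intended result.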

Edit: Oh, oops, I entirely missed the part about dumping the subdomain in the original post. That calls for a second RewriteCond, because the Rule doesn't see the (sub)domain. And then the whole thing has to become a Redirect because the new version includes a domain name. But that's OK, because there are several thousand posts addressing that part of the question. Just make sure you're clear on what you yourself mean by "look like". That is, do you mean what it looks like to you and the computer, or what it looks like to the user who happens to glance at their browser's address bar?


* Or possibly "alphanumerics or lowlines". I don't know offhand whether this is dialect-specific.

Sub_Seven




msg:4357486
 6:26 am on Sep 1, 2011 (gmt 0)

Hey Lucy,

I am dedicating the last hour of my day to try to figure this out, gotta admit that I am understanding regular expressions much better now, because I have tried several things based on the Regex vocabulary found on [httpd.apache.org...] and a few other sites.

Based on your answer I tweaked my code to this:

RewriteCond %{QUERY_STRING} ^(\w+)=(\w+)$
RewriteRule ^/product /%1/%2

I removed the ? [QSA] at the end of the rewrite target because, from what I can understand, the ? was deleting the query and the [QSA] was trying to append, in this case, nothing, since the ? was the last character. However, this is not working at all.

This is what happens

The query string is still gone and for some reason the redirect, although not working as it should, only happens in Opera...? Chrome, Firefox and stupid IE behave like nothing has changed at all.

By "Look like" I mean what it looks like to the user on the address bar.

Is your edit telling me this is not working because I'm on a subdomain while testing?

Please help me out a little more and thanks so much for everything.

g1smd




msg:4357509
 7:21 am on Sep 1, 2011 (gmt 0)

Add the [L] flag to every RewriteRule.

Flush your browser cache before each test. Redirects are cacheable.

Use the Live HTTP Headers extension for Firefox to see what is happening.

Sub_Seven




msg:4357519
 7:46 am on Sep 1, 2011 (gmt 0)

Hey g1smd

Add the [L] flag to every RewriteRule.


I have found a huge regex reference at [perldoc.perl.org...]. I looked for the [L] flag to learn what it does and I didn't find it. Would you mind telling me what it does?

Looks like this now:

RewriteCond %{QUERY_STRING} ^(\w+)=(\w+)$
RewriteRule ^/productos /%1/%2 [L]

Didn't really work :(

Flush your browser cache before each test. Redirects are cacheable.


I forgot to mention that I do that every time I make a change and want to test again. I actually delete all data from the browser and restart it; I even flush my DNS hoping that could make a difference as well...

Use the Live HTTP Headers extension for Firefox to see what is happening.


Will do as soon as I post this reply.

Thanks for the help.

lucy24




msg:4357520
 7:46 am on Sep 1, 2011 (gmt 0)

I cannot begin to imagine why your browser should have any effect on what's happening in htaccess, so if it's all the same with you, I'm not touching that part with a barge pole. Except to note that Opera is sometimes especially good at handling things that are, ahem, malformed or sloppily coded. So there may be a mistake that isn't serious enough to bring the site crashing to a 500, but serious enough that only Opera can figure out what it's supposed to do.

If the rewrite is happening anywhere, it means the query string has to be present, or the new url couldn't be generated.

By "Look like" I mean what it looks like to the user on the address bar.

OK, so you want a redirect, not a rewrite. That means [R=301] at the end-- and possibly one more capture from one more Condition, because you need a full

blahblahblah http://www.example.com/%1/%2 [R=301]

in your target. Only it won't always be www.example.com; it will be whatever you pick up in

RewriteCond %{HTTP_HOST} ((?:www\.)?(?:\w+\.)?example\.com)
or possibly even
%{THE_REQUEST}

:: looking vaguely around for g1 or someone like him to come to my rescue as it becomes obvious I am making this up as I go along ::

The ?: means "don't capture" (and therefore don't include it in the count of %1, %2 and so on). It is not functionally necessary; it just makes it easier to keep track when you have a bunch of stuff in parentheses, especially nested ones.

The Rule will involve
http://%1
if you want to keep the original (sub)domain, changing only the rest of the address. The optional-www part may not be necessary if you are dealing only with requests from within your site, because then you have control over the form. The next (\w+\.)? is the optional subdomain. If none of your subdomains start with the same letter as your central domain, you can probably save lots of nanoseconds by expressing the package as
([^e]\w+\.)?example\.com
so the RegEx stops dead in its tracks if the first thing it meets after www. is "example". Otherwise it has to backtrack-- but only once, if it discovers that the "\w+\." it captured was the "example\." that it has to look for next.
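
A rough sketch of keeping the requested (sub)domain in a redirect, under some assumptions not stated in the thread: the parameter is n, the site is served over https, and the file is in the site root. One thing worth knowing is that %1, %2 and so on refer only to the last Condition that matched, so a host captured in an earlier Condition is no longer available once a second Condition captures the query value. Using %{HTTP_HOST} directly in the target sidesteps that.

```apache
RewriteEngine On

# Capture the value of n from the query string as %1.
RewriteCond %{QUERY_STRING} ^n=(\w+)$

# Redirect to the same host that was requested; the trailing "?" drops the old query.
RewriteRule ^product$ https://%{HTTP_HOST}/product/%1? [R=301,L]
```

If the host does not need to change, a target that starts with "/" and has no hostname should behave the same way, since Apache fills in the requested host for the redirect.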

Add-on after seeing overlapping posts:
The [L] flag goes with RewriteRules. It means "stop here, go back to the beginning of the htaccess, and repeat until everything rinses clean". You should always put [L] at the end of each Rule unless you know exactly what you are doing. And if you are that good with Skips and Chains, you would be answering questions, not asking them ;) The only time you definitely don't need an [L] is when you've already got [F], which means "drop dead, do not pass Go, do not collect $200". Even there the [L] will do no harm, it just isn't needed.
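
A small illustration of the two flags, using a hypothetical /private/ directory and the product rule shape discussed in this thread:

```apache
RewriteEngine On

# [F] answers with 403 Forbidden; an [L] is implied, so none is written.
RewriteRule ^private/ - [F]

# An ordinary rule: [L] stops further rules from being applied in this pass.
RewriteRule ^product/([^/]+)/?$ /product.php?n=$1 [L]
```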

Sub_Seven




msg:4357531
 8:17 am on Sep 1, 2011 (gmt 0)

Ok Lucy and g1smd, it's 2:16 in the morning and for some reason I think I'm not making much progress in understanding this (I mean, the whole thing took a turn when Lucy told me I was going at it the wrong way (which I was already beginning to understand)). I may need to sleep and probably have regex nightmares so I can start fresh tomorrow. I'm not giving up until I remove from my URL (i?) those stupid ? and = characters.

I appreciate your willingness to pass over your knowledge, you guys are awesome!

lucy24




msg:4357723
 7:45 pm on Sep 1, 2011 (gmt 0)

Postscript: By the clear light of day, I suspect that everything I said about subdomains was talking through my hat and you don't really need to do anything. Or, possibly, you need to do something entirely different.

But if you've got a site you can experiment on without causing untold (or at least 500) disasters, go for it. (I'm currently in the same position myself, and boy is it reassuring.)


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved