Forum Library, Charter, Moderators: Ocean10000 & incrediBILL & phranque

Apache Web Server Forum

    
Stubborn Rewrite or Redirect?
Joomla 1.5 to 2.5 problem...
BillyS




msg:4566442
 6:51 pm on Apr 19, 2013 (gmt 0)

We just migrated from Joomla 1.5 to version 2.5. I've solved all of the migration problems except for one: RSS feeds. When upgrading from Joomla 1.0 to 1.5, I was able to wrangle all of the stray feeds with the following. Thanks Lucy :)

RewriteCond %{QUERY_STRING} option=com_rss [NC]
RewriteRule ^index2\.php$ /rss.feed? [R=301,L]

This allowed me to point the old URI to:

example.com/rss.feed

I'm trying to use the same approach in Joomla 2.5, but I'm either brain dead at this point or missing something very obvious. In Joomla 2.5 the RSS feeds can be found in this form:

example.com/?format=feed&type=rss/

So my solution is (the one that doesn't do anything):

RewriteCond %{QUERY_STRING} !format=feed [NC]
RewriteRule ^index2\.php$ /rss.feed? [R=301,L]

As always, help is very much appreciated.

 

lucy24




msg:4566460
 8:08 pm on Apr 19, 2013 (gmt 0)

Color me confused. Your RewriteCond says "If the query string does NOT contain 'format=feed' then strip the query and redirect to /rss.feed."

If you use the [R] flag then your rule creates a redirect whether or not you've included the protocol-plus-domain-- but it's a sloppy redirect because you're not canonicalizing the hostname. Conversely if you do include the protocol-plus-domain then the rule becomes a redirect even if you don't use the [R] flag. Unless you've instead used the [P] flag, but let's not make it more complicated than it needs to be.

example.com/?format=feed&type=rss/

Eeuw. Is that supposed to be the "before" or "after" form? I guess it has to be what the user sees, because a real file doesn't end in / alone.

Better backtrack a bit and explain in English what you want to do. What the user is supposed to see; where the content lives.

Any new rules you make need to come before the CMS rules of the same type. Your redirects come before their redirects; your rewrites come before their rewrites. Exceptions exist, but we'll deal with those if-and-when they arise.

Rules involving redirects and query strings almost always need a preceding RewriteCond looking at THE_REQUEST so you don't go around in circles. And, as always, make sure your own internal links point to the URL you want people to see. Redirects are for outdated links and bookmarks coming from outside. And for search engines with long memories.

Dideved




msg:4566467
 8:17 pm on Apr 19, 2013 (gmt 0)

Hi, BillyS. So to clarify, the old URL is "rss.feed" and the new URL is "index2.php?format=feed&type=rss/"? Then it seems like this should work.

RewriteRule ^rss\.feed$ /index2.php?format=feed&type=rss/ [R=301,L]

BillyS




msg:4566470
 9:00 pm on Apr 19, 2013 (gmt 0)

I am brain dead, my apologies... been working on this migration for two days. It's gone pretty flawlessly except for this RSS feed thing. Shows my weakness in these rules. :(

I took a hot shower, and I'm thinking I have two problems. The first is I want to make the new RSS feed look like the old one, which was in the form:

example.com/rss.feed

So I need to rewrite:
example.com/?format=feed&type=rss/ (Yes, that's Joomla's form for the feed)
as:
example.com/rss.feed

Getting this right would solve 80% of my problem... Most bots (and people) are looking for the RSS feed at the second location. I was playing with the code before and forgot to remove the NOT...

I'm still thinking the following should work :(

RewriteCond %{QUERY_STRING} format=feed [NC]
RewriteRule ^index2\.php$ /rss.feed? [R=301,L]

BillyS




msg:4566474
 9:03 pm on Apr 19, 2013 (gmt 0)

> Better backtrack a bit and explain in English what you want to do.

What the user is supposed to see:

www.example.com/rss.feed

where the content lives:

www.example.com/?format=feed&type=rss

lucy24




msg:4566508
 1:15 am on Apr 20, 2013 (gmt 0)

> I'm still thinking the following should work

Seems like it ought to, if we've now established that the ! in the first version was a typo. But you need a bit more.

#1 As with any rule that creates a redirect, give the full protocol-plus-domain in the target.

#2 When the redirect is part of a redirect-and-rewrite package, the redirect half of the rule needs a preceding RewriteCond looking at THE_REQUEST. Exact wording will depend on how many different requests are possible, and how many different forms. In a perfect world, all you'd need is a single condition that says in part

%{THE_REQUEST} \?format=feed

and then jump straight into your Rule. But if it is possible for the original request to have more stuff in the query string besides the part you're looking for, you may need to go to a two-part version:

RewriteCond %{THE_REQUEST} \?
RewriteCond %{QUERY_STRING} (^|&)format=feed($|&)

meaning "the original request included a query string" --a literal question mark in the request can't mean anything else-- "and one piece of that query is 'format=feed'".

And finally #3 you need to make the corresponding RewriteRule that takes a request for ^rss\.feed$ and rewrites to serve content from whatever-it-is. Presumably you've got this part already.
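A minimal sketch of how the three points above could fit together in one .htaccess, placed ahead of the CMS's own rules. It assumes the feed content lives at /index.php?format=feed&type=rss and that the canonical host is www.example.com, as stated elsewhere in this thread. Whether the CMS then serves the feed for the rewritten request depends on how its own router reads the request, which this sketch cannot guarantee.

```apache
RewriteEngine On

# Redirect half: fires only when the client's original request line
# carried a query string, so the internal rewrite below cannot loop.
RewriteCond %{THE_REQUEST} \?
RewriteCond %{QUERY_STRING} (^|&)format=feed($|&) [NC]
# The trailing "?" drops the old query string; the full protocol-plus-host
# sends the visitor to the canonical URL in a single hop.
RewriteRule ^(index\.php)?$ http://www.example.com/rss.feed? [R=301,L]

# Rewrite half: serve the pretty URL from the query-string location
# without changing what the browser's address bar shows.
RewriteRule ^rss\.feed$ /index.php?format=feed&type=rss [L]
```

The THE_REQUEST condition matters because the second rule adds a query string internally; without that condition, the rewritten request could match the first rule and bounce back out as a redirect.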

Dideved




msg:4566510
 1:40 am on Apr 20, 2013 (gmt 0)

> #1 As with any rule that creates a redirect, give the full
> protocol-plus-domain in the target.

This seems to be another of your rules that is contrary to both standard practice and the official documentation. Care to provide your reasoning?

BillyS




msg:4566511
 1:47 am on Apr 20, 2013 (gmt 0)

I've been working on this for the last hour or so, half the problems I have is I don't clear the browser cache enough...
This part actually works:
RewriteCond %{QUERY_STRING} format=feed [NC]
RewriteRule ^index2\.php$ /rss.feed? [R=301,L]

But it sounds like it could be improved using the full www.example.com

So right now, I can get this...

http://www.example.com/index.php?format=feed&type=rss/

and send it to:

http://www.example.com/rss.feed
That URL used to exist in the old system but doesn't exist anymore, because I didn't think about the third point you brought up.

I've had several thousand requests for http://www.example.com/rss.feed since the switch. They're all going to 404 for another 5 or 6 hours.

Going to bed and getting some rest is better than crying. I'm sure I'll wake up at 5:00 am trying to figure this out. I've about another hour to crack this one before it cracks me.

Thanks for trying... it might take a couple of days but I'll figure it out.

Dideved




msg:4566513
 1:58 am on Apr 20, 2013 (gmt 0)

The rewrite rule I posted earlier ought to have cleared up your 404 problems. I even tested it before I posted. Did it not work?

BillyS




msg:4566516
 2:13 am on Apr 20, 2013 (gmt 0)

Dideved -

I suspect you're a mind reader or you're familiar with Joomla 2.5 :)

After a bit of a cry, I came up with the following, which seems to use the same method you proposed.

Redirect 301 /rss.feed http://www.example.com/index.php?format=feed&type=rss

Which doesn't make the new URI look pretty, but it does provide a working URI to those looking at the old one.

I'm going to test your suggestion too.

RewriteRule ^rss\.feed$ /index2.php?format=feed&type=rss/ [R=301,L]

If they both work, I'm hoping Lucy will weigh in with a best practice. :)

Dideved




msg:4566517
 2:45 am on Apr 20, 2013 (gmt 0)

> I suspect you're a mind reader or you're familiar with Joomla 2.5 :)

Mind reader. ;)

> If they both work, I'm hoping Lucy will weigh in with a best practice.
> :)

For what it's worth, the Apache documentation can weigh in. httpd.apache.org/docs/trunk/rewrite/avoid.html#redirect

It says your way is better. :) mod_rewrite is actually supposed to be a last resort in favor of other directives designed to do a specific job (like redirect).

Optionally, you could simplify it just a touch by removing the domain name. Apache will use the current scheme and hostname.

Redirect 301 /rss.feed /index.php?format=feed&type=rss

lucy24




msg:4566527
 4:06 am on Apr 20, 2013 (gmt 0)

> half the problems I have is I don't clear the browser cache

Tip: If you have a test site or a pseudo-server (MAMP or equivalent), slip in this package:

ExpiresActive On

ExpiresByType text/html "access"
ExpiresByType text/php "access"

It means that if you have an obedient browser-- and an obedient ISP --you don't have to keep on refreshing pages and emptying the cache, because each access will result in a fresh request to the server. Obviously you can't do it on the real site that humans visit. At most, you could designate some hidden backwater of your real site as "testing only".

Apache says (here quoting from mod_rewrite docs, but mod_alias docs say essentially the same thing):
> Use of the [R] flag causes a HTTP redirect to be issued to the browser. If a fully-qualified URL is specified (that is, including http://servername/) then a redirect will be issued to that location. Otherwise, the current protocol, servername, and port number will be used to generate the URL sent with the redirect.

If the original request happened to use the exactly correct protocol and hostname, then it makes no difference whether the target of a redirect-- whether via mod_alias or mod_rewrite --begins in / alone or the full protocol-plus-host. But it would obviously be senseless to prefix each RewriteRule with a RewriteCond checking whether the HTTP_HOST is already correct, and running different rules depending on whether it is or isn't. Feed in the correct form each time, and everything will come out correct.


mod_alias vs. mod_rewrite
THIS IS IMPORTANT.

The Apache documentation is primarily written for the people who own the server. When it is your own server, you know which mods load in which order, and which rules execute when. On shared hosting, you may or may not know, and it is definitely out of your power to change it.

The one thing you can be sure of is that external redirects must come before internal rewrites. In mod_rewrite, you can attach a RewriteCond to every single rule to look at THE_REQUEST. In mod_alias, there are no conditions; it's like a string of conditionless RewriteRules.

Dideved




msg:4566528
 4:22 am on Apr 20, 2013 (gmt 0)

> ...In mod_alias, there are no conditions...

You can actually still write conditions with the <If> directive. See:

httpd.apache.org/docs/2.4/expr.html#examples
httpd.apache.org/docs/2.4/mod/core.html#if
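
As a rough illustration of the <If> approach, assuming Apache 2.4 or later and www.example.com as the canonical hostname, a mod_alias redirect can be wrapped in a condition along these lines:

```apache
# Apache 2.4+ only: <If> is not available in 2.2.
# Redirect any request whose Host header is not the canonical name.
<If "%{HTTP_HOST} != 'www.example.com'">
    Redirect 301 "/" "http://www.example.com/"
</If>
```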

phranque




msg:4566539
 5:55 am on Apr 20, 2013 (gmt 0)

> For what it's worth, the Apache documentation can weigh in. httpd.apache.org/docs/trunk/rewrite/avoid.html#redirect
>
> It says your way is better. :) mod_rewrite is actually supposed to be a last resort in favor of other directives designed to do a specific job (like redirect)


in the section of the apache doc to which you referred it shows examples using <VirtualHost> containers which only work in server config context.
based on what i have read here these past several years, i would hazard a guess that most WebmasterWorld visitors asking questions in this forum don't have access to their server config file.

http://httpd.apache.org/docs/current/rewrite/avoid.html:
> The most common situation in which mod_rewrite is the right tool is when the very best solution requires access to the server configuration files, and you don't have that access. Some configuration directives are only available in the server configuration file. So if you are in a hosting situation where you only have .htaccess files to work with, you may need to resort to mod_rewrite.



> Optionally, you could simplify it just a touch by removing the domain name. Apache will use the current scheme and hostname.

what if the requested scheme or hostname is non-canonical for the resource?
your suggestion implies multiple redirects before you request the intended resource at the canonical url.
this means more wait time for the user (and this is actual measurable clock time, not "nanotime") and gives the appearance of low technical quality to search engines.

http://www.youtube.com/watch?v=r1lVPrYoBkA#t=2m46s
> If you can do it in one hop, that's ideal.

lucy24




msg:4566546
 7:02 am on Apr 20, 2013 (gmt 0)

> You can actually still write conditions with the <If> directive.

I must say this is a usage of the word "still" which I had not previously encountered, since the <If> formulation was only introduced in Apache 2.4.

Edit:
Ooh, phranque, that's way cool. I never knew you could include exact time in a YouTube URL :)

Dideved




msg:4566558
 8:54 am on Apr 20, 2013 (gmt 0)

> i would hazard a guess that most WebmasterWorld visitors asking
> questions in this forum don't have access to their server config file.

Fortunately the redirect directive can be placed in htaccess files, so the OP is still good.

> I must say this is a usage of the word "still" which I had not
> previously encountered, since the <If> formulation was only introduced
> in Apache 2.4.

In this context, it means "in addition to." We can apply conditions to rewrite rules, and we can apply conditions to... anything else, really.

phranque




msg:4566565
 9:56 am on Apr 20, 2013 (gmt 0)

> Fortunately the redirect directive can be placed in htaccess files, so the OP is still good.


as far as BillyS is concerned, there's some joomla involved which means the .htaccess file necessarily contains mod_rewrite directives for the internal rewrite to the joomla script.

http://httpd.apache.org/docs/2.2/rewrite/avoid.html#redirect
> The use of RewriteRule to perform this task may be appropriate if there are other RewriteRule directives in the same scope. This is because, when there are Redirect and RewriteRule directives in the same scope, the RewriteRule directives will run first, regardless of the order of appearance in the configuration file.

Dideved




msg:4566567
 10:09 am on Apr 20, 2013 (gmt 0)

That's a good point. :)

BillyS




msg:4568594
 11:59 pm on Apr 27, 2013 (gmt 0)

Okay, I'm rested up from troubleshooting the migration and I'm still trying to crack this problem.

I can use:

Redirect 301 /rss.feed http://www.example.com/index.php?format=feed&type=rss


To redirect the old RSS feed URL to the new location. This works, because the old RSS feed location was in the format:

www.example.com/rss.feed

Unfortunately the new location of the actual data is in the format:

http://www.example.com/index.php?format=feed&type=rss

This format doesn't validate and I'm crazy about that stuff. I'm thinking I really should have just pointed the URL with the actual rss information to the old RSS URL. So I'm thinking the below should have done that:

RewriteRule ^rss\.feed$ /index.php?format=feed&type=rss/ [L]

But this just gives me a 404. What am I missing?

Bill

BillyS




msg:4568595
 12:01 am on Apr 28, 2013 (gmt 0)

Once again, your help and expertise are very much appreciated.

Dideved




msg:4568603
 12:25 am on Apr 28, 2013 (gmt 0)

> Unfortunately the new location of the actual data is in the format:
>
> http://www.example.com/index.php?format=feed&type=rss
>
> This format doesn't validate and I'm crazy about that stuff.


Doesn't validate? Do you mean the HTML validator complains about this URL in links? If that's the case, then I'd guess the most likely reason is that the & needs to be escaped as &amp;

lucy24




msg:4568607
 12:44 am on Apr 28, 2013 (gmt 0)

> RewriteRule ^rss\.feed$ /index.php?format=feed&type=rss/ [L]
>
> But this just gives me a 404. What am I missing?

The final directory slash? Or was that just a typo?

> I'm thinking I really should have just pointed the URL with the actual rss information to the old RSS URL.

Other way around, you mean, right? The old URL should point to (= serve content from) the new location unless you've got a compelling reason to change the URL.

BillyS




msg:4568616
 2:16 am on Apr 28, 2013 (gmt 0)

> If that's the case, then I'd guess the most likely reason is that the & needs to be escaped as &amp;

Thank you very much, the site validates once again. :)

BillyS




msg:4568617
 2:21 am on Apr 28, 2013 (gmt 0)

> The final directory slash? Or was that just a typo?


Yes, it was a typo... :(

> Other way around, you mean, right? The old URL should point to (= serve content from) the new location unless you've got a compelling reason to change the URL.


Yes, that's the ideal situation. I don't want to change the URL.

The old URL was:

http://www.example.com/rss.feed

The content exists here:

www.example.com/index.php?format=feed&type=rss

The following does not work:

RewriteRule ^rss\.feed$ /index.php?format=feed&type=rss [L]

lucy24




msg:4568629
 3:35 am on Apr 28, 2013 (gmt 0)

Uh-oh, it's the dreaded "does not work" ;) What happens with your existing rule? A plain 404, with no visible change to the browser's address bar?

Are you on shared hosting? You probably said at some point, but I forget. Presumably yes, since your pattern doesn't begin with a directory slash. If yes, then a RewriteLog is out. Another quick test is to temporarily change your rewrite into a redirect, either by shoving in the [R] flag or by adding the full protocol-plus-domain to the target. You will now see the browser's address bar changing-- or not changing, as the case may be.

Can you navigate to the long icky URL by entering its address manually? You may need to comment-out any existing redirects that were intended to intercept this kind of request.
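
The quick test described above might look like the following sketch. The 302 status is an assumption chosen here so that browsers are less likely to cache the test than they would a 301; the rule should be put back to a plain rewrite once the address bar behaviour has been observed.

```apache
# Temporary diagnostic only: turn the internal rewrite into a visible redirect.
# If the address bar changes to the index.php URL, the pattern matched and the
# problem lies in what happens after the rewrite; if it does not change, the
# rule never ran at all.
RewriteRule ^rss\.feed$ /index.php?format=feed&type=rss [R=302,L]
```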

BillyS




msg:4568675
 12:38 pm on Apr 28, 2013 (gmt 0)

Right now I have this simple redirect:

Redirect 301 /rss.feed http://www.example.com/index.php?format=feed&type=rss

Which does redirect properly. (Although it's not what I want).

When I substitute (commenting out the above redirect too) this line:

RewriteRule ^rss\.feed$ /index.php?format=feed&type=rss [L]

I get a 404 when I go to www.example.com/rss.feed

I just tried this form:

RewriteRule ^www.example.com/rss.feed$ www.example.com/index.php?format=feed&type=rss/ [L]

But still get a 404. I tried moving this rule higher in the htaccess, still a 404.

[edited by: bill at 2:56 am (utc) on Apr 29, 2013]
[edit reason] fixed [/edit]

phranque




msg:4568679
 1:06 pm on Apr 28, 2013 (gmt 0)

make sure your RewriteRule is preceded by this:

RewriteEngine on

BillyS




msg:4568685
 1:23 pm on Apr 28, 2013 (gmt 0)

Thanks phranque, I have that one covered.

Not on shared hosting... This is a pretty standard Joomla 2.5 set up. I've read what g1smd has posted here and elsewhere, including the importance of the order of these rules. The strange thing is I had a very similar setup in Joomla 1.5 and it worked (same server...). I'm wondering if there is some kind of conflicting rules in the core too. I'm going to investigate that possibility.

phranque




msg:4568739
 7:47 pm on Apr 28, 2013 (gmt 0)

if you have access to the server config file you can try rewrite logging.
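
a rough sketch of what rewrite logging can look like, assuming access to the server config or a <VirtualHost> block. the log path and the level shown are examples only, and the right directive depends on the Apache version:

```apache
# Server config or <VirtualHost> only; neither form works in .htaccess.

# Apache 2.2: dedicated rewrite log (the path is an example).
RewriteLog "/var/log/apache2/rewrite.log"
RewriteLogLevel 3

# Apache 2.4: RewriteLog was removed; raise the module's log level instead,
# and the trace lines appear in the regular error log.
# LogLevel alert rewrite:trace3
```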

lucy24




msg:4568749
 9:55 pm on Apr 28, 2013 (gmt 0)

> Not on shared hosting.
>
> I tried moving this rule higher in the htaccess

If it's your own server, what are you doing in htaccess at all? Is this simply for testing?

> I just tried this form:
>
> RewriteRule ^www.example.com/rss.feed$ www.example.com/index.php?format=feed&type=rss/ [L]

URK!

:: quick detour to beginning of thread ::

Whew. You're not the person whose URL paths included domain-name info for some arcane technical reason that now escapes me. So get those domain names outta there.

> I'm wondering if there is some kind of conflicting rules in the core too.


:: further detour ::

Nope, this also isn't one of the recent threads where we talked about the horrors of RewriteOptions.

Short version: By default, mod_rewrite activity is not inherited. This means that if a request meets more than one package of RewriteRules, results of any earlier rules affecting the new directory-- up to and including [F] --are abandoned as if they had never happened. If options are set to "inherit"-- again, not the default-- RewriteRules are remembered.

:: final detour to test site to double-check some variations ::

The "inherit" option is not itself inherited. Is your head spinning yet?
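
For illustration, a subdirectory .htaccess that opts in to the parent's rules could look like this sketch. The /blog/ directory and the page names are hypothetical, not anything from this thread.

```apache
# /blog/.htaccess (hypothetical subdirectory)
RewriteEngine On

# Without this line, the RewriteRules in the parent directory's .htaccess
# are not applied once this file has rewrite directives of its own.
RewriteOptions Inherit

# This directory's own rules are applied before the inherited ones.
RewriteRule ^old-page\.html$ http://www.example.com/blog/new-page.html [R=301,L]
```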

[edited by: bill at 2:57 am (utc) on Apr 29, 2013]
[edit reason] fixed [/edit]

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved