Redirect 301 Fancy Apostrophe

stemc

6:17 pm on Aug 23, 2010 (gmt 0)

10+ Year Member



Hi there,

I'm using a PHP/MySQL CMS on a Linux web server that is running Apache 2.0.

I've just moved a website from one CMS to another, and was trying to set up some 301 redirects in my .htaccess file as the URLs are slightly different in the new CMS.

I'm trying to redirect this: /blog/2010/06/28/david’s-favourites/ to this: [mywebsite.com...] but I can't seem to get it working.

Here's the code I'm trying in the .htaccess file:


RewriteEngine On

# This removes index.php from the URL, for cleaner URLs with the CMS
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php/$1 [L]

# Old blog redirect
Redirect 301 /blog/2010/06/28/david’s-favourites/ http://www.mywebsite.com/blog/david’s-favourites/


It doesn't seem to be working due to that apostrophe in the URL from the old WordPress site. I've tried replacing the fancy apostrophe with %e2%80%99 but that doesn't work either.

Any ideas on how I can get this redirect working?

Thanks,

Stephen

jdMorgan

2:09 am on Aug 24, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You could use a RewriteCond to look at the undecoded URL as requested by the client, but how about just redirecting /blog/2010/06/28/david<one or more characters not equal to "s">s-favourites/, in order to keep things simple?

And while we're here, we can prevent thousands of unnecessary and slow disk checks per day by modifying the other routine as well...

RewriteEngine on
#
# Externally redirect old blog URL to new blog URL
RewriteRule ^blog/2010/06/28/david[^s]+s-favourites/$ http://www.mywebsite.com/blog/david’s-favourites/ [R=301,L]
#
# Internally rewrite all requests for URLs which do not resolve to physical files or directories to
# the CMS script, as long as that script can generate content for those URLs. Note: This is a major
# efficiency tweak because it avoids unnecessary and very resource-intensive "exists" checks.
RewriteCond $1 !(^index\.php|(\.(gif|jpe?g|png|css|js)))$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php/$1 [L]

Jim

[edit] Corrected as noted below. [/edit]

[edited by: jdMorgan at 2:39 pm (utc) on Aug 24, 2010]

wildbest

4:46 am on Aug 24, 2010 (gmt 0)

10+ Year Member



RewriteCond $1 !(^index\.php|(\.(gif|jpe?g|png|css|js))$

There is a missing ')' at the end of this rewrite condition.

Jim, I always wanted to ask you, what is the content of the '$1' variable when used in a rewrite condition? Is this the URI?

g1smd

7:23 am on Aug 24, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The $1 value used in the RewriteCond comes from the first pattern enclosed in parentheses found in the RewriteRule.

The RewriteRule pattern is evaluated before the RewriteCond(s).
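
As a rough sketch of that order (a simplified variant of the routine above, not a drop-in replacement), the capture group in the RewriteRule pattern fills $1 before the condition that tests it is evaluated:

```apache
RewriteEngine on
# Step 1: the RewriteRule pattern ^(.*)$ is matched first and captures the
#         requested URL-path into $1.
# Step 2: only then is this condition checked, using that $1 value.
RewriteCond $1 !^index\.php
# Step 3: if the condition passed, the substitution is applied.
RewriteRule ^(.*)$ /index.php/$1 [L]
```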

wildbest

8:59 am on Aug 24, 2010 (gmt 0)

10+ Year Member



Thank you very much, g1smd. I didn't know the RewriteRule patterns are evaluated before the RewriteCond(s). But then, how is it evaluated first if we have %1, %2, %3 etc. values in the RewriteRule?

And if there are 2 or more RewriteRule(s) in one RewriteRule/RewriteCond set, which rule is related to $1 value if there are parentheses in all of them?

I apologize for those very basic questions, but in mod_rewrite manuals I've read so far they are not explained very well.

jdMorgan

2:48 pm on Aug 24, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Wildbest,

Well spotted on the missing parenthesis. I corrected the code above to prevent further "propagation" of bad code.

As documented in the "Rule processing" section (worth a read) of the mod_rewrite documentation, the RewriteRule pattern is evaluated first. If the pattern matches the requested localized URL-path, then the RewriteConds are processed until any [OR]ed RewriteCond matches, or until all (AND)ed RewriteConds match. The RewriteConds may refer to the $1 through $9 back-references already created by the RewriteRule pattern matching, and in turn, the last-matched RewriteCond's back-references then become available for use in the RewriteRule substitution, or for use by subsequent (AND)ed RewriteConds.
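
To illustrate the other direction, here is a minimal sketch in which a RewriteCond capture feeds the RewriteRule substitution. The article.php script, the id parameter and the /articles/ path are hypothetical and only serve the example:

```apache
RewriteEngine on
# The RewriteRule pattern is matched first (it needs no captures here).
# The condition then captures the numeric id from the query string into %1.
RewriteCond %{QUERY_STRING} ^id=([0-9]+)$
# %1 comes from the last-matched RewriteCond; the trailing "?" drops the
# original query string from the redirect target.
RewriteRule ^article\.php$ http://www.mywebsite.com/articles/%1? [R=301,L]
```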

In all cases, you cannot back-reference the contents of a negative-match RewriteCond or RewriteRule pattern, because a negated pattern succeeds only when the regular expression does *not* match, so nothing is captured and the back-reference will be empty.
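
For instance, in this sketch of a canonical-hostname redirect (an assumed use case, not something from the original question), the negated condition contributes no back-reference, while $1 is still available from the RewriteRule pattern:

```apache
RewriteEngine on
# Negated condition: it succeeds only when the regex does NOT match,
# so there is no captured text and %1 would be empty here.
RewriteCond %{HTTP_HOST} !^www\.mywebsite\.com$
# $1 still works because it comes from the non-negated RewriteRule pattern.
RewriteRule ^(.*)$ http://www.mywebsite.com/$1 [R=301,L]
```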

I don't understand your question about "if there are 2 or more RewriteRule(s) in one RewriteRule/RewriteCond set" because RewriteConds cannot be shared between two (or more) RewriteRules. Each RewriteRule "owns" all of its preceding RewriteConds, and those RewriteConds can have no effect on subsequent rules (except indirectly due to logical precedence or the use of chained RewriteRules).
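
A small sketch of that ownership rule, using hypothetical paths and script names (old/, archive/, /new.php and /archive.php are made up for illustration):

```apache
RewriteEngine on
# This condition belongs only to the RewriteRule that immediately follows it.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^old/(.*)$ /new.php/$1 [L]
#
# No condition applies here: this rule runs for every matching request,
# whether or not the file exists. Repeat the RewriteCond if it is needed.
RewriteRule ^archive/(.*)$ /archive.php/$1 [L]
```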

Jim

wildbest

3:41 pm on Aug 24, 2010 (gmt 0)

10+ Year Member



RewriteConds cannot be shared between two (or more) RewriteRules. Each RewriteRule "owns" all of its preceding RewriteConds, and those RewriteConds can have no effect on subsequent rules

This answers my question. Thank you, Jim.

stemc

3:54 pm on Aug 24, 2010 (gmt 0)

10+ Year Member



Hi Jim,

Thanks for your reply and help with this.

The reason I'm trying to specify individual redirects is that some of the blog titles have changed too, as the old CMS also had longer URL titles than the new one. I managed to redirect about 80 of the blog posts using this method (manually editing the target where the URL became shorter), but I was stuck with just half a dozen that had these annoying symbols in them.

Is there a way to deal with these on an individual basis at all?

By the way, would that revised routine for eliminating the unnecessary requests also work in conjunction with the Redirect 301 syntax I was using in my original post? I'm asking before I break the site! :-)

Thanks,

Stephen

jdMorgan

6:40 pm on Aug 24, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> Is there a way to deal with these on an individual basis at all?

I don't know what you mean. The posted rule redirects only the URL as described in the comments. It simply 'does not care' what character or characters appear in the position where the encoded apostrophe now appears.

That is, the rule will redirect requests for "blog/2010/06/28/david<something here, we don't care what>s-favourites/".

Therefore, unless you also have URLs like "blog/2010/06/28/davidoff's-favourites/" or "blog/2010/06/28/david-and-tom's-favourites/" (that is, anything with one or more non-"s" characters between "david" and "s-favourites/"), you're unlikely to have any "collisions" between URL-strings that result in unwanted redirects.

If it worries you, then fully-specify the un-decoded string exactly as requested by the client using an additional RewriteCond per rule:

# Externally redirect old blog URL to new blog URL
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /blog/2010/06/28/david\%[Ee]2\%80\%99s-favourites/\ HTTP/
RewriteRule ^blog/2010/06/28/david[^s]+s-favourites/$ http://www.mywebsite.com/blog/david’s-favourites/ [R=301,L]

But note that the RewriteRule pattern remains as I wrote it above, for the sake of efficiency. Since the rule demonstrably cannot 'see' the un-decoded characters, we have to give it a *somewhat-generic* pattern to match.

---

> work in conjunction with the redirect 301 syntax I was using in my original post?

Do not use "Redirect 301 syntax" at all any more. Replace all current Redirect and RedirectMatch directives with equivalent RewriteRules. If you mix mod_alias and mod_rewrite directives, you may lose control of which directives execute first, because directives are processed on a per-module basis and *not* in the order that you write them in your .htaccess code. Only directives belonging to the same module will be executed in the order that you specify. All mod_alias directives will execute first, or all mod_rewrite directives will execute first, and nothing you can do in .htaccess can change that.
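
As a hedged sketch of such a conversion (the old-title and new-title paths are hypothetical placeholders, not URLs from this site), a mod_alias redirect and a roughly equivalent mod_rewrite rule might look like this:

```apache
# Before (mod_alias):
# Redirect 301 /blog/2010/06/28/old-title/ http://www.mywebsite.com/blog/new-title/
#
# After (mod_rewrite). In .htaccess the pattern has no leading slash.
RewriteRule ^blog/2010/06/28/old-title/$ http://www.mywebsite.com/blog/new-title/ [R=301,L]
```

Note that Redirect matches by URL-path prefix, while the anchored RewriteRule pattern above matches only that exact path. Place external redirects like this ahead of the internal rewrite to index.php so that they are seen first.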

See Proper Order for htaccess directives [webmasterworld.com] in our Apache Forum Library for more information.

Jim