newbie .htaccess problem - external redirect

problem with external redirect

         

kika

3:30 pm on Jan 23, 2009 (gmt 0)

10+ Year Member



I've been reading posts in the forum for the past two days, and while I learnt a lot about htaccess (I'm still a newbie), I still have a problem with my external redirects and I cannot find the solution in the previous posts (I hope I didn't miss the correct answer).

My problem is: I have a site that was previously hosted at www.domain.com, and everything must be redirected to the new domain, www.example.com.

The new site is managed with a different CMS, and therefore has a completely different url pattern. It's some 600 articles and 15 different sections, so I'm trying to find an easy solution to redirect people at least to the right section, if not the right page on the new website.

I need to do the following redirects (I don't need any internal rewrite):

1) Redirect a couple of specific articles to the exact page on the new domain:

/cms/article.asp?article=01 --> http://www.example.com/section/article/new-title

/cms/article.asp?article=02 --> http://www.example.com/section/article/new-title-2

2) Redirect all the old archive pages to the new ones:

/cms/archive.asp?m=xyz --> http://www.example.com/archive/xyz

3) Redirect every other URL to the new index page

/cms/article.asp?article=whatever --> http://www.example.com/new-index

After two days of trial and error here is my .htaccess file:


Options +FollowSymlinks
RewriteEngine on

RewriteCond %{REQUEST_URI} ^/cms/article\.asp$

RewriteCond %{QUERY_STRING} ^article=01$
RewriteRule ^(.*)$ http://www.example.com/magazine/my-page-01/?

RewriteCond %{QUERY_STRING} ^articolo=02$
RewriteRule ^(.*)$ http://www.example.com/magazine/my-page-02/? [L]

RewriteCond %{REQUEST_URI} ^/cms/archive\.asp(.*)
RewriteRule ^(.*)$ http://www.example\.com/cms/archive? [R,NC,L]

RewriteCond %{REQUEST_URI} ^(.*)
RewriteRule ^(.*)$ http://www.example\.com/magazine/new-index? [R,NC,L]

Following all the tips posted before and a couple of tutorials, I tried to put the most specific rules first and the least specific (sort of catch-all rules) at the end, but I cannot get the last rule to work. If I comment it out, everything works just fine; otherwise the previous rules are skipped and every page is redirected to the new index. What am I doing wrong? Is it a rule order mistake, or am I just doing it the wrong way? I'm not looking for a quick solution, I'm just trying to understand how it works!

Thanks to everyone who will answer this post.

--edit -- if I don't comment out the last two lines, the specific article rules are skipped, but the archive redirect works

[edited by: jdMorgan at 5:29 pm (utc) on Jan. 23, 2009]
[edit reason] exAmple.com [/edit]

jdMorgan

5:47 pm on Jan 23, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



What you are probably not taking into account is that an external redirect terminates the current HTTP request after sending a response back to the client that says, "That content has moved. Re-request it from this new URL."

So, that ends the current request, the server forgets all about it, and it is up to the client to take the new URL from the server's redirect response and ask for the content again.

So, when this new HTTP request comes in, if you redirect *all* URL requests (as your last rule does), then of course this new request will be redirected as well.
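To make that sequence concrete, here is a rough trace written as comments around an unguarded catch-all. It assumes, as the symptoms suggest, that requests for the new URLs are processed by this same .htaccess file. The request paths in the comments are illustrative, not taken from a server log.

```apache
# Sketch only: a specific redirect followed by an unguarded catch-all.
RewriteEngine on
#
RewriteCond %{QUERY_STRING} ^article=01$
RewriteRule ^cms/article\.asp$ http://www.example.com/magazine/my-page-01/? [R=301,L]
#
# Matches every path, including the targets of the redirects above and its own target.
RewriteRule (.*) http://www.example.com/magazine/new-index? [R=301,L]
#
# Request 1: /cms/article.asp?article=01 -> first rule -> redirect to /magazine/my-page-01/
# Request 2: /magazine/my-page-01/        -> catch-all  -> redirect to /magazine/new-index
# Request 3: /magazine/new-index          -> catch-all  -> redirect to /magazine/new-index (loop)
```

Each numbered request is a separate HTTP request from the client; the server keeps no memory of the earlier ones, so the rules are evaluated from the top every time.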

Let's fix this, and clean up the code a bit, too:


Options +FollowSymLinks
RewriteEngine on
#
RewriteCond %{QUERY_STRING} ^article=01$
RewriteRule ^cms/article\.asp$ http://www.example.com/magazine/my-page-01/? [R=301,L]
#
RewriteCond %{QUERY_STRING} ^articolo=02$
RewriteRule .* http://www.example.com/magazine/my-page-02/? [R=301,L]
#
RewriteRule ^cms/archive\.asp$ http://www.example.com/cms/archive? [NC,R=301,L]
#
RewriteCond $1 !^magazine/new-index$
RewriteCond $1 !^robots\.txt$
RewriteCond $1 !^sitemap\.xml$
RewriteRule (.*) http://www.example.com/magazine/new-index? [R=301,L]

The key here is that the last rule now makes sure that the request is not already asking for "/magazine/new-index" before it redirects, so the redirect looping is eliminated. I also took the liberty of excluding some URLs which usually *must not* be redirected, such as robots.txt and sitemap files. This list of exclusions is likely incomplete; be aware that as shown, *all* requests will be redirected to the "new index" page, including requests for images, CSS stylesheets, external JavaScripts, media and PDF files, etc., which is likely not what you want. You will need to mentally 'survey' all of your URLs and decide what kinds of URLs you do and do not want to redirect.
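As one possible way to extend the exclusion list, a further condition can skip requests by file extension. The extension list below is an assumption for illustration, not a survey of this site's actual files; adjust it after reviewing your own URLs.

```apache
# Sketch: catch-all that leaves common static resources alone.
# The extension list is illustrative only.
RewriteCond $1 !^magazine/new-index$
RewriteCond $1 !^robots\.txt$
RewriteCond $1 !^sitemap\.xml$
RewriteCond $1 !\.(gif|jpe?g|png|css|js|pdf)$ [NC]
RewriteRule (.*) http://www.example.com/magazine/new-index? [R=301,L]
```

The $1 in each RewriteCond refers to what the RewriteRule pattern captured, which works because mod_rewrite matches the rule pattern before it evaluates the conditions attached to it.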

Jim

kika

6:38 pm on Jan 23, 2009 (gmt 0)

10+ Year Member



First of all: thank you Jim for your quick reply and for the explanation, you're really making me understand what's going on.

I tried the solution you suggested, but unfortunately it still ignores the first 3 RewriteRules and only seems to accept the final one. Now even the "cms/archive" redirect doesn't work. Every URL simply redirects to the new index page.

I tried again and apparently I need to add the following conditional if I want the archive RewriteRule to work properly:


RewriteCond %{REQUEST_URI} ^/cms/archive\.asp(.*)

Probably I'm just tired and cannot see some self-evident mistake, but the first 2 conditionals still don't work...

Thanks also for the tips about which files I should/shouldn't include. As soon as I fix the redirect issue I'll focus on the rest. The problem is that I "inherited" this website, and the old one has been completely deleted from the old server (not my fault) without any notice, so I had to put the new site online without enough time to plan this sort of thing. However, now users who follow old links or bookmarks can only see the new site. I don't mind serving them a single new custom page where they can find the explanation, but I would like to learn how to handle this kind of situation in the future.

g1smd

7:08 pm on Jan 23, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Not that on the earlier examples, R is changed to R=301 and that every rule has [L] added to the end.

The most important part is redirects first, rewrites last.

.

For your last question, what prevents it looping? You do need [L] on the end of the rule here.
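The "redirects first, rewrites last" ordering can be sketched as follows. The file names old-page.html and index.php and the path parameter are hypothetical, chosen only to show the order; nothing like them appears in the site discussed here.

```apache
# Sketch with hypothetical names: external redirects come first...
RewriteEngine on
RewriteRule ^old-page\.html$ http://www.example.com/new-page [R=301,L]
#
# ...and internal rewrites come last.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php?path=$1 [QSA,L]
```

If the order were reversed, a request could be internally rewritten first and then hit a redirect rule, which would send the internal script path back to the client as a visible URL.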

kika

11:08 am on Jan 24, 2009 (gmt 0)

10+ Year Member



Well, after a good night's sleep I can now see the point.

I had to make some minor changes to the condition and rewrite rules for the single articles, but the final 3 lines suggested by Jim and the final [L]s added at the end of the rules fixed the problem. No more looping, and now my htaccess works like a charm.

Thanks again :-)

g1smd

7:05 pm on Jan 24, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



"Not" should have read "Note"...

jdMorgan

7:21 pm on Jan 24, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



kika,

If you mean you did this:


RewriteCond %{REQUEST_URI} ^/cms/archive\.asp(.*)
RewriteRule ^cms/archive\.asp$ http://www.example.com/cms/archive? [NC,R=301,L]

then be aware that RewriteRule directly tests a derivative of the REQUEST_URI (the leading slash is removed, since it is the path to the .htaccess file's directory). As a result, the RewriteCond here is redundant, and should not be needed at all.
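As a sketch of that point: in a .htaccess file at the document root, a request for /cms/archive.asp is presented to the rule pattern as cms/archive.asp, so the pattern alone pins down the path. A RewriteCond remains the right tool for parts of the request the pattern never sees, such as the query string. The first rule below is only an assumption about how the originally requested mapping of ?m=xyz to /archive/xyz might be written; it was not confirmed in this thread.

```apache
# Assumed mapping: capture the "m" query value in a RewriteCond and reuse it as %1.
RewriteCond %{QUERY_STRING} ^m=([^&]+)$
RewriteRule ^cms/archive\.asp$ http://www.example.com/archive/%1? [NC,R=301,L]
#
# Fallback when no "m" value is present: the pattern alone checks the path
# (no leading slash in .htaccess), so no REQUEST_URI condition is needed.
RewriteRule ^cms/archive\.asp$ http://www.example.com/cms/archive? [NC,R=301,L]
```

The trailing question mark on each target drops the original query string from the redirected URL.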

Jim