Forum Moderators: phranque

Redirecting a group of urls?

         

ken_b

10:46 pm on Sep 24, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Please note: I've looked in the library here, searched the web, and looked at apache.org (using links from the library here at WW), and can't find what I'm asking about.

I've done some redirects of single urls that work fine.

Now I have a situation where I want to redirect a "group" of urls from

example.com/somefolder/1a thru 1z.htm

to

example2.com/somefolder/1a thru 1z.htm

1a thru 1z represent 26 existing urls on the old site and will represent 26 urls on the new site.

So.... is there a way to do this with 1 redirect? Or do I need to write 26 redirects?

I'm under the impression I can't use a wildcard redirect because there are other pages in the originating folder that are either not being redirected/moved or are already redirected to example3.com

Just pointing me to a tutorial on this would be great.

ken_b

12:28 am on Sep 25, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



OK, digging around some more, I think I may have found the answer in this old post [webmasterworld.com] from 2007 by jdMorgan:
Well, for best portability, I'd recommend you type the directives exactly as they appear in the Apache mod_alias documentation, but otherwise they're correct.

Redirect 301 /the_old_page.htm http://example.com/the-new-page.htm
Redirect 301 /the_old_page2.htm http://example.com/the-new-page2.htm
Redirect 301 /the_old_page3.htm http://example.com/the-new-page3.htm

Now, I mentioned that you may not need 50 rules. Here's an example, treating your example urls as if they were literally the real urls:

RedirectMatch 301 ^/the_old_([^.]+)\.htm$ http://example.com/the-new-$1.htm

That will redirect *all* urls of the form "/the_old_<something>.htm" to "/the-new-<something>.htm" with only one directive instead of fifty.

If your old urls have such commonality, you can take advantage of it to reduce the number of redirect directives you need to write, test, and maintain.

Jim


So I think that means I could do it like this

RedirectMatch 301 ^/1([^.]+)\.htm$ http://example.com/1$1.htm

is that right?

Would I need to put RewriteEngine on above that in my htaccess?

And would this go in the main htaccess in the old site root folder or in an htaccess file in the old folder where the old pages reside now?

Or am I totally confused?

SevenCubed

12:57 am on Sep 25, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm going to step way out of my comfort zone and attempt to answer this but be forewarned that if lucy24 comes along and disagrees vigorously with my solution, she's right and I'm wrong, and if g1smd comes along and strongly disagrees then we're probably both wrong. The only one that can override g1smd is jdMorgan.

I have not tested this but I think this is how you would use it in .htaccess:

RewriteEngine on
RewriteCond %{HTTP_HOST} ^oldexample.com/$1 [NC]
RewriteRule ^(.*)$ [newexample.com...] [R=301,L]

Note that I added an "s" (https instead of http) to the target URL above just to stop the forum from auto-linking it: https://www.newexample.com/$1 -- do not add the "s" to your code. You would add this to the .htaccess in the root of the old site.

g1smd

1:12 am on Sep 25, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



HTTP_HOST can match only the host name. Remove the /$1 here. Escape literal periods.

With the start anchor on the HTTP_HOST test, www. versions of the old URL will NOT be redirected. Perhaps change to
^(www\.)?example\.com
if you do want those redirected. In fact, the condition is probably not needed at all.

Make the pattern "on the left" of the rule match the URL path you want to capture. If "1a" and "1z" are literal, then
^folder/(1[a-z])\.htm$
is it.

There are literally thousands of prior posts here showing redirects with RegEx patterns. Do not use Redirect or RedirectMatch. Use only RewriteRule.
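
Putting those pieces together, a minimal sketch might look like the following. It assumes the old host is example.com, the new host is example2.com, and the folder is literally named "somefolder", as in the original question; adjust all three to the real names. The condition is optional and is only shown to illustrate the corrected HTTP_HOST pattern.

```apache
RewriteEngine On

# Optional: only useful if this .htaccess also answers for other hostnames.
# HTTP_HOST holds the host name only, so no path or $1 belongs here.
RewriteCond %{HTTP_HOST} ^(www\.)?example\.com$ [NC]

# Matches somefolder/1a.htm through somefolder/1z.htm and nothing else.
# The captured "1a" .. "1z" is reused as $1 in the target.
RewriteRule ^somefolder/(1[a-z])\.htm$ http://example2.com/somefolder/$1.htm [R=301,L]
```

Because the pattern names the 26 pages explicitly, other pages in the same folder are left alone.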

SevenCubed

1:29 am on Sep 25, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Ok I was warm, I'm going to play with this on my dev machine based on g1smd's feedback and come back and post my result.

phranque

1:33 am on Sep 25, 2012 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



in honor of your 5000th post i'll give you an answer instead of a tutorial.
=8)

something like this might do it for you:

RewriteRule ^(somefolder/1[a-z]\.htm)$ http://example2.com/$1 [R=301,L]

SevenCubed

2:14 am on Sep 25, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Alright, I see phranque posted an answer but I did get it working with something different than what he offered so I'll just throw it in as a variation:

RewriteRule ^(.*)$ https://www.example.com/$1 [R=301,L]

^ Same thing applies here: don't add the "s" for yours -- I added it to prevent the auto link.

lucy24

6:03 am on Sep 25, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I'm under the impression I can't use a wildcard redirect because there are other pages in the originating folder that are either not being redirected/moved or are already redirected to example3.com

Pages that have already been redirected are not a problem, because, well, they have already been redirected. So if their rule comes first, they'll never see the rule meant for the surviving URLs. The not-to-be-redirected pages are a bigger problem. Is there any kind of pattern to the naming? Are there anything like 26 pages staying behind, or only a few?

RedirectMatch 301 ^/1([^.]+)\.htm$ http://example.com/1$1.htm

is that right?

It is if you're matching a bunch of URLs that all have a literal number one at the beginning. And don't press me about the anchored slash, because I'd have to go look up the docs. I just know it would be wrong for htaccess and mod_rewrite.
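
On the slash: mod_alias matches its pattern against the full URL-path, which begins with a slash, while a RewriteRule in a root .htaccess sees the path with that leading slash already stripped. A rough side-by-side, assuming the folder and host names from the original question, would be:

```apache
# Alternative A (mod_alias): the pattern is tested against "/somefolder/1a.htm",
# so the anchored leading slash is correct here.
RedirectMatch 301 ^/somefolder/(1[a-z])\.htm$ http://example2.com/somefolder/$1.htm

# Alternative B (mod_rewrite in the root .htaccess): the pattern is tested
# against "somefolder/1a.htm", so there is no leading slash.
RewriteRule ^somefolder/(1[a-z])\.htm$ http://example2.com/somefolder/$1.htm [R=301,L]
```

These are two ways of writing the same redirect, shown together only for comparison; use one or the other, not both.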

Would I need to put RewriteEngine on above that in my htaccess?

Would have no effect whatsoever, because RewriteEngine On is a mod_rewrite directive, and your rule uses mod_alias. But that is OK, because you will deduce from all the preceding posts that you are going to use mod_rewrite, not only for this redirect but for any others you've already got. (The folks at apache dot org are free to mix mod_alias and mod_rewrite. Us ordinary civilians shouldn't try.)
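
As a sketch of what converting an existing single-page redirect could look like, assume a hypothetical page called widgets.htm that already redirects to example3.com (the file name and target path are made up for illustration):

```apache
# Existing mod_alias form (hypothetical example of a rule already in place):
#   Redirect 301 /somefolder/widgets.htm http://example3.com/widgets.htm

# Equivalent mod_rewrite form for a root .htaccess.
# Note the escaped period and the missing leading slash in the pattern.
RewriteEngine On
RewriteRule ^somefolder/widgets\.htm$ http://example3.com/widgets.htm [R=301,L]
```

One difference to keep in mind: Redirect matches by URL-path prefix, while the anchored RewriteRule pattern above matches that one path exactly.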

And would this go in the main htaccess in the old site root folder or in an htaccess file in the old folder where the old pages reside now?

They have to be in a place that will be "seen" by requests looking for the old, non-redirected URL. If you can put them in a place that will not be seen by the new, redirected versions, that's a bonus. Since you're dealing with entirely different domains, you should try to grab the requests as soon as possible. That means putting the htaccess in the root of the old domain. Presumably you've already got one there.

Or am I totally confused?

You're asking me?

ken_b

3:59 pm on Sep 25, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Thanks for the guidance folks.

I've got a couple of unused websites I'm going to use to try this out before I try it on the actual target sites.

I'll report back on the results.

ken_b

1:52 am on Sep 26, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Works great on my test sites. Now I'm just waiting for a domain transfer to finish up before I apply this to the actual target websites. One thing at a time.

Thanks a whole bunch folks.

The not-to-be-redirected pages are a bigger problem. Is there any kind of pattern to the naming? Are there anything like 26 pages staying behind, or only a few?

The "left behind" pages are only a couple, and are named completely differently, very content specific file names.

If "1a" and "1z"

Yes, there are 26 pages named 1a, 1b, 1c, on thru 1z.

ken_b

2:02 am on Sep 26, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Ok, this makes me wonder about the order of things in my htaccess file
Pages that have already been redirected are not a problem, because, well, they have already been redirected. So if their rule comes first,

Because these 26 pages all start with the number 1, they are the first in order in the directory folder, so....
should the redirect for them be listed first in the htaccess file also?

OK, I found a post [webmasterworld.com] that might apply
go from most specific to most general. For example, individual page redirects go before index.html redirects which in turn go before the final with/without www. redirect.


So in my case the index.htm is not being redirected.

But is the redirect for the 26 page group considered more or less specific than a redirect for an individual page?

phranque

6:20 am on Sep 26, 2012 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



the "26 page group" applies to precisely 26 pages.

i haven't seen your default directory index document redirect but that could potentially apply to an infinite number of directories - it should be more general than the 26er.
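
As a rough illustration of "most specific to most general" ordering, a root .htaccess might be laid out like this. The single-page rule and the hostname rule are hypothetical stand-ins (widgets.htm, the example3.com path, and www as the canonical host are all assumptions); only the 26-page rule comes from this thread.

```apache
RewriteEngine On

# 1. Most specific: individual pages (hypothetical example of an
#    existing single-page redirect to example3.com).
RewriteRule ^somefolder/widgets\.htm$ http://example3.com/widgets.htm [R=301,L]

# 2. The group rule: exactly the 26 pages 1a.htm through 1z.htm.
RewriteRule ^somefolder/(1[a-z])\.htm$ http://example2.com/somefolder/$1.htm [R=301,L]

# 3. Most general: hostname canonicalization for everything left over
#    (assumes www.example.com is the canonical host).
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

The alphabetical position of the files in the folder has no bearing on this; only the order of the rules in the file matters, since the first rule that matches and carries [L] ends processing for that request.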