.htaccess Rewriting Broken Inbound Permalinks

Moved Blog Now Rewriting Broken Inbound Permalinks in .htaccess File

         

pcdaugs

10:13 pm on Oct 20, 2011 (gmt 0)

10+ Year Member



I moved my WordPress blog from the root to a sub-directory /blog and created the .htaccess code below to redirect the broken inbound permalinks. Now I am moving it to a subdomain blog.theenchantedimage.com and I have two problems:
1) The code at the bottom works for the /blog directory under my root with the broken links, but it won't work on the subdomain blog.theenchantedimage.com
2) The code below is also redirecting anyone that goes to my root to the blog and I want to make sure in the future that won't happen as I will be hosting a website on the root.

RewriteEngine on
RewriteCond %{HTTP_HOST} ^theenchantedimage\.com$ [OR]
RewriteCond %{HTTP_HOST} ^www\.theenchantedimage\.com$ [NC]
RewriteCond %{REQUEST_URI} !^/\blog/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /\blog/$1
RewriteCond %{HTTP_HOST} ^(www.)theenchantedimage.com$ [NC]
RewriteRule ^(/)?$ blog/index.php [L]


I was thinking that I needed to use the
<IfModule mod_rewrite.c>
to avoid searches going to the website I will be hosting on the root directory, but I haven't been able to figure this out. Any help would be greatly appreciated, as I have been pulling my hair out over this one, and no matter how many articles I read I can't get it to work. Also, since my blog was originally hosted on my root, I have to have the .htaccess file in the root directory of my FTP to handle any of the broken links.

Thanks in advance!

lucy24

11:31 pm on Oct 20, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Slightly aside, but you can use the knowledge sooner or later:

RewriteCond %{HTTP_HOST} ^theenchantedimage\.com$ [OR]
RewriteCond %{HTTP_HOST} ^www\.theenchantedimage\.com$ [NC]


Read up on Duplicate Content in these Forums. Pick one form or the other and redirect to it. In any case the two lines can be combined to

RewriteCond %{HTTP_HOST} ^(www\.)?theenchantedimage\.com$ [NC]


Do escape literal periods \. Do not escape slashes / (It may not make any difference, but escaping slashes is only necessary in a few specific languages like JavaScript. In htaccess, try to shave every possible byte.)

In the second Rewrite, why is (www\.) in parentheses? You're not capturing it for later use. Did you forget a question mark? You probably meant (www\.)? as above.

The form
^(.*)$

with anchors is never necessary. A greedy .* already runs from the beginning of the string to the end, so (.*) on its own matches exactly the same thing. You only need anchors when you're looking for some specific beginning and/or ending text.

You almost never need the <IfModule blahblah> statements. That's boilerplate from generic htaccess. Once you're in some specific location, you either have the module or you don't.
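
As a rough illustration of "pick one form and redirect to it", the following sketch sends the bare hostname to the www form. It assumes the www form is the one chosen and uses example.com as a placeholder; neither choice comes from the thread.

```apache
# Sketch only: canonicalize the hostname with a single 301.
# example.com is a placeholder for the real domain.
RewriteEngine On
# Match the bare hostname only (literal period escaped, slashes left alone).
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
# No anchors are needed around (.*): it already captures the whole path.
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
```

Once every request arrives on a single hostname, later rules no longer have to allow for both forms.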

g1smd

11:56 pm on Oct 20, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Why are you escaping the b of blog as \blog here?

You cannot simply move the .htaccess code from one place to another. It has to be redesigned.

Root of old site needs to redirect to subdomain (for now) not to folder.

Folder URLs need to redirect to subdomain.

The rewrite to actually serve the content needs to be in the blog subdomain root.


The code below is also redirecting

There are no redirects there at all, only RewriteRule configured as an internal rewrite.

The [L] flag needs to be added to every rule.
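
As a sketch of the last piece of that outline, the internal rewrite that serves the content from the blog subdomain's own root could look like the stock WordPress front-controller rules. This assumes WordPress's index.php sits in the subdomain's document root; the thread does not show the rules actually in use.

```apache
# Hypothetical .htaccess in the document root of the blog subdomain.
RewriteEngine On
RewriteBase /
# Requests for index.php itself pass through untouched.
RewriteRule ^index\.php$ - [L]
# Anything that is not an existing file or directory is handed to WordPress.
# There is no R flag, so this is an internal rewrite: the address bar does not change.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
```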

lucy24

1:43 am on Oct 21, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Why are you escaping the b of blog as \blog here?

:: insert smart-aleck remarks ad lib about people who never, ever make typos ::

He may have typed /\ but he meant \/ (escaping the slash). You can see from my post above that in fact I misread it that way-- but a computer, not having a brain, never misreads. In .htaccess this can be lethal.

pcdaugs

3:25 am on Oct 21, 2011 (gmt 0)

10+ Year Member



Evening,

I appreciate your help in pointing out some of the issues with the way my code was written and I will certainly make use of your comments. Please understand that I am not a pro at this and I am relying on the Java and VBA programming knowledge I have learned in the past. So sometimes I don't use the correct language to describe what I am trying to do, thus the use of redirect vs rewrite.

What I need help with is getting the correct code into my .htaccess file to rewrite previously created blog posts whose inbound links are broken because of the move of my blog. I do not understand what the correct programming should be to get the broken inbound links to rewrite correctly to the subdomain. Any help you can provide on how to write the correct rewrite code would be greatly appreciated.

Thanks Again

lucy24

5:26 am on Oct 21, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Didn't I just say... Oops, no, sorry, different thread.

Before anything else, you must must must get a grip on the difference between a Rewrite and a Redirect. They can both be achieved using mod_rewrite and they look almost identical in your .htaccess, but they do very different things.

A Redirect means (in English) "You don't want to be here, you want to be over there" leading to a change in the user's address bar. And then carry on as if this is what they had asked for in the first place.

It looks like this:
RewriteRule {blahblah} http://www.example.com/{otherblahblah} [R=301,L]


A Rewrite means (in English) "You may think you're here, but you're really over here" with no change in the address bar.

It looks like this:
RewriteRule {blahblah} {otherblahblah} [L]


A special kind of rewrite that everyone has met is the error page: your browser's address bar will show the place you thought you were going to, but the screen says something like "Lissen, dimwit, there ain't no such page".

So before you start attacking your htaccess you need to be crystal clear on what you want to do.

#1 take incoming requests for nonexistent addresses, and REDIRECT them to the right place (using information from the URL and/or query in the original request).

OR

#2 take those same requests, let the user think they're going there, but then secretly REWRITE them by supplying content from some other place entirely. If you do this, you also have to fine-tooth-comb your target pages to make sure there are no relative links, because the user's browser will ask for stylesheets and so on based on where they think they are, not where they really are.

Once you've got that sorted, it's down to the mechanics. That means spelling out what your incoming request looks like-- using example.com so the links don't get obfuscated-- and where you want the user to end up-- ditto.
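
To make the Redirect/Rewrite distinction concrete, here is a minimal side-by-side sketch. The page names and www.example.com are made up for illustration.

```apache
RewriteEngine On

# Redirect: the browser is told to ask again for /new-page.html,
# so the address bar changes to the new URL.
RewriteRule ^old-page\.html$ http://www.example.com/new-page.html [R=301,L]

# Rewrite: the address bar keeps showing /shown-page.html,
# but the server quietly serves /real-page.html instead.
RewriteRule ^shown-page\.html$ /real-page.html [L]
```

The visible difference is the full URL plus the R=301 flag on the first rule; without the R flag the substitution stays internal to the server.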

pcdaugs

2:11 pm on Oct 21, 2011 (gmt 0)

10+ Year Member



Hello lucy24,

Thank you for all your help! Now I have a clear understanding of where to start with the coding. Your first option, taking incoming requests for nonexistent addresses and REDIRECTing them to the right place (using information from the URL and/or query in the original request), is what I am looking to do.

Here is an example of what the two different links look like that are going to be coming in.

1) www.example.com/YYYY/MM/DD/Post-Name/
2) www.example.com/blog/YYYY/MM/DD/Post-Name/

I need to REDIRECT these cases to the following construction:

www.blog.example/YYYY/MM/DD/Post-Name/

If I understood the previous concepts correctly, I think this is the general idea of what I am trying to do. Is that correct? Also, I am not sure what to put in for the "blahblah" parts of the redirect.

RewriteEngine on
RewriteCond %{HTTP_HOST} ^(www\.)?theenchantedimage\.com$ [NC]
RewriteCond %{REQUEST_URI} !^\/blog/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule {blahblah} http://www.blog.theenchantedimage.com/{otherblahblah} [R=301,L]

pcdaugs

3:40 pm on Oct 21, 2011 (gmt 0)

10+ Year Member



I just realized that the URL construction I need to REDIRECT to was missing the .com. Below you will find the correct version.

www.blog.example.com/YYYY/MM/DD/Post-Name/

Thanks Again

lucy24

8:51 pm on Oct 21, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Here is an example of what the two different links look like that are going to be coming in.

1) www.example.com/YYYY/MM/DD/Post-Name/
2) www.example.com/blog/YYYY/MM/DD/Post-Name/

I need to REDIRECT these cases to the following construction:

www.blog.example.com/YYYY/MM/DD/Post-Name/

...

RewriteCond %{REQUEST_URI} !^/blog/

But wait! You are redirecting some requests in this form. The REQUEST_URI is what comes after the domain name.

RewriteCond %{HTTP_HOST} ^(www\.)?olddomain\.com$ [NC]
RewriteRule ^(?:blog/)?(\d\d\d\d/\d\d/\d\d/postname/){morestuff} http://www.example.com/$1 [R=301,L]


Note the {morestuff}. Can anything come after the Post-Name-plus-slash? Will it always be the same? Do you need to keep it? mod_rewrite unlike mod_alias doesn't reattach the leftovers (the rest of the path). What you see is what you get.

The ?: before "blog" in the pattern means "don't capture this". (Disclaimer: It works in my htaccess using Apache 2.2.something. I don't know if it works in all Apache installations. It's pretty standard RegEx, though.) If you don't put it in, then "blog/" becomes $1-- even if it doesn't exist-- and then the target has to say $2.
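
The effect of ?: on the backreference numbering can be sketched with two alternative versions of the same rule (use one or the other, not both). Only the date part is captured here to keep the example short, and blog.example.com is a placeholder target.

```apache
# Version A, non-capturing group: "blog/" is matched but not stored,
# so the date is the first capture and the target uses $1.
RewriteRule ^(?:blog/)?(\d\d\d\d/\d\d/\d\d/) http://blog.example.com/$1 [R=301,L]

# Version B, plain parentheses: "blog/" takes $1 (empty when it is absent),
# which pushes the date into $2.
RewriteRule ^(blog/)?(\d\d\d\d/\d\d/\d\d/) http://blog.example.com/$2 [R=301,L]
```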

pcdaugs

9:50 pm on Oct 21, 2011 (gmt 0)

10+ Year Member



Hello lucy24,

Note the {morestuff}. Can anything come after the Post-Name-plus-slash?


No, there isn't any more stuff coming after the /postname/.

Will it always be the same?


Yes, this format will always be the same for inbound links that I need to redirect.

RewriteCond %{HTTP_HOST} ^(www\.)?olddomain\.com$ [NC]
RewriteRule ^(?:blog/)?(\d\d\d\d/\d\d/\d\d/postname/){morestuff} http://www.example.com/$1 [R=301,L]


Is the /postname/ in the RewriteRule a variable reference, or do I need to come up with a regular expression to represent the /postname/?

Also, does the RewriteRule you've constructed here need a change to the domain from
http://www.example.com/$1
to
http://www.blog.example.com/$1
?


Lastly, apart from the questions above I assume that below is still the construction of the entire code.

RewriteEngine on
RewriteCond %{HTTP_HOST} ^(www\.)?theenchantedimage\.com$ [NC]
RewriteCond %{REQUEST_URI} !^/blog/
RewriteCond %{HTTP_HOST} ^(www\.)?olddomain\.com$ [NC]
RewriteRule ^(?:blog/)?(\d\d\d\d/\d\d/\d\d/postname/){morestuff} http://www.example.com/$1 [R=301,L]


Thanks for all the help. I really appreciate all of this, and I have a feeling that I will lick this .htaccess stuff soon enough.

g1smd

10:33 pm on Oct 21, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



RewriteCond %{REQUEST_URI} !^/blog/ -- DO NOT redirect if path begins "/blog/"
RewriteRule ^(?:blog/) -- DO redirect if path begins "/blog/"


Which pattern is correct?

lucy24

1:05 am on Oct 22, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Which pattern is correct?

I decided that English trumps Apache ;)

Here is an example of what the two different links look like that are going to be coming in.

1) www.example.com/YYYY/MM/DD/Post-Name/
2) www.example.com/blog/YYYY/MM/DD/Post-Name/


What you're capturing is the whole string: date plus postname. Oops, sorry, I see where we went astray. I shouldn't have said "postname" as if it's literal text. Make that

(\d\d\d\d/\d\d/\d\d/[^/]+/)

Meaning: first the date (thankfully you're keeping it in the same format!) and then whatever the name of the post is. [^/]+ is the most all-encompassing form: it means anything and everything up to the next directory slash.

And then where I said http://www.example.com that's where you put in your new subdomain name.

pcdaugs

4:45 am on Oct 22, 2011 (gmt 0)

10+ Year Member



Thank You Thank You Thank You!

I finally have a working knowledge of how to read this programming, and the .htaccess file works great! I have one last question for you in light of g1smd's post tonight.

RewriteCond %{REQUEST_URI} !^/blog/ -- DO NOT redirect if path begins "/blog/"
RewriteRule ^(?:blog/) -- DO redirect if path begins "/blog/"


I saw the English here stating that the {REQUEST_URI} means DO NOT redirect if path begins with /blog/. Will this RewriteCond work with each of the following paths?

1) www.theenchantedimage.com/YYYY/MM/DD/PostName/
2) www.theenchantedimage.com/blog/YYYY/MM/DD/PostName/

I have tested the domain root without the /blog/ but I have not been able to test it with the /blog/ included. Your thoughts?

Otherwise, you both have been very helpful and I will see what I can do to contribute to the forum myself.

Thanks Again!

lucy24

4:55 am on Oct 22, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I saw the English here stating that the {REQUEST_URI} means DO NOT redirect if path begins with /blog/. Will this RewriteCond work with each of the following paths?

1) www.theenchantedimage.com/YYYY/MM/DD/PostName/
2) www.theenchantedimage.com/blog/YYYY/MM/DD/PostName/

You don't need the Cond. In fact I couldn't figure out what it was doing there unless you misunderstood the URI part and thought you needed to exclude requests for the subdomain blog.blahblah.

The (?:blog/)? element is designed to make a single Rule work both with and without the leading blog/ component.

Remember that you don't have to wait for real links. You can type anything into your browser's address bar and the htaccess will do its stuff. It doesn't matter if it points you to a nonexistent page. Your address bar will show you the results of any redirects, even if the browser window (including title) shows your 404 page.

pcdaugs

5:12 am on Oct 22, 2011 (gmt 0)

10+ Year Member



lucy24, you have been awesome to work with. I consider this case closed. I will use the URI part once I move my blog tomorrow from /blog/ to blog.theenchantedimage.com, as I won't want any redirects to be sent to that directory. Thanks again for all of the help!

pcdaugs

3:53 am on Oct 23, 2011 (gmt 0)

10+ Year Member



lucy24,

Well I changed over my blog to the subdomain and most of the inbound links work with the htaccess file below but not all of them. The path structure it is having trouble with is the following:

www.theenchantedimage.com/blog/YYYY/MM/DD/PostName/

None of these links seem to be matching the RewriteConds in the htaccess file below. Could you take a look and see what you think?

RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{HTTP_HOST} ^(www\.)?theenchantedimage\.com$ [NC]
RewriteRule ^(?:blog/)?(\d\d\d\d/\d\d/\d\d/[^/]+/) http://www.blog.theenchantedimage.com/$1 [R=301,L]


I thought I might need to add another {HTTP_HOST} ^(www\.)?theenchantedimage\.com/blog$ but that didn't do anything.

Your help would be appreciated.

Thanks Again

lucy24

5:04 am on Oct 23, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



No, you can't put anything after a slash in the hostname. %{HTTP_HOST} holds only the hostname, never any part of the path.

:: racking brains because I'm overlooking something obvious ::

:: looking vaguely around for g1 ::

! idea !

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d


This pair of lines constrains the Rewrite to files or directories that do not exist. When you moved to the subdomain, did you move the files, or did you duplicate them (keeping the old ones in place in case of disaster)?

Safest and easiest test: comment-out both lines and see if anything changes.

The !-f and !-d conditions are rarely necessary. And they put a big extra load on the server: it's the computer equivalent of making the receptionist run upstairs and verify that the boss is really in a meeting before coming back to tell the visitor where to go. A carefully designed Rewrite will bypass this step because it will only affect files that you already know don't exist in their requested location.

pcdaugs

12:22 am on Oct 24, 2011 (gmt 0)

10+ Year Member



lucy24,

I did what you suggested and took out the following code.

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d


The rest of the code still doesn't redirect the path that we discussed.

www.theenchantedimage.com/blog/YYYY/MM/DD/PostName/

Here is the code that I was running with still no results.

RewriteCond %{HTTP_HOST} ^(www\.)?theenchantedimage\.com$ [NC]
RewriteRule ^(?:blog/)?(\d\d\d\d/\d\d/\d\d/[^/]+/) http://www.blog.theenchantedimage.com/$1 [R=301,L]

pcdaugs

12:26 am on Oct 24, 2011 (gmt 0)

10+ Year Member



It just occurred to me that my subdomain is pointing at a sub folder of my public.html folder, which is where the root directory is hosted. Does that make a difference?

Thanks,

lucy24

12:45 am on Oct 24, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If you mean that files whose names contain /blog/ live in a different place from files whose names do not contain /blog/ then yes, it would definitely make a difference. I'd assumed that the "real" location corresponded to the url, so it's a simple matter of putting an optional directory in the search.

That is, the existing htaccess will work if the "real" directory for olddomain files contains
both
#1 the "real" directory for /blog/ files
and
#2 the htaccess file

if the "real" /blog/ directory lives somewhere else, then we need more information.

The location of the subdomain shouldn't matter, because that's where requests go after they have been redirected.
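
One possible setup, not confirmed in the thread, is that the /blog/ directory carries its own .htaccess with rewrite rules. By default mod_rewrite does not inherit the parent directory's rules in that case, so the redirect would have to be repeated there. A sketch of such a file, with example.com and blog.example.com as placeholder hostnames:

```apache
# Hypothetical /blog/.htaccess on the old host.
RewriteEngine On
# Only act on the old hostname. If the subdomain is served from this same
# folder, this condition keeps requests for blog.example.com from looping.
RewriteCond %{HTTP_HOST} ^(www\.)?example\.com$ [NC]
# In per-directory context the "blog/" prefix is stripped before the pattern
# is tested, so the pattern starts straight at the date.
RewriteRule ^(\d\d\d\d/\d\d/\d\d/[^/]+/) http://blog.example.com/$1 [R=301,L]
```

Note that the optional (?:blog/)? group from the root-level rule is not needed here, because the path seen by a rule in /blog/.htaccess never begins with blog/.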

pcdaugs

2:48 am on Oct 24, 2011 (gmt 0)

10+ Year Member



lucy24,

Your solution was the answer. I put the following code into the htaccess file in the /blog directory as well as the root directory, and it all works well.

Thanks Again!

g1smd

5:41 pm on Oct 24, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Make sure there isn't a double redirect for any requests.

Try www and non-www requests, both with and without appended junk and/or parameters. Use the Live HTTP Headers extension for Firefox to check the results.

You do not need the
<ifModule>
container.