Redirect from file to folder with same name

Redirect, 301, htaccess

         

slot

3:04 pm on Sep 21, 2011 (gmt 0)

10+ Year Member



Hi, I need something (301, good for search engines) to make this kinda redirect:
All files in the site are going to be folders, and in another directory. So:

From www.site.com/folder/file.htm
To www.site.com/#*$!/folder/file/

And also
From www.site.com/folder1/folder2/file.htm
To www.site.com/#*$!/folder1/folder2/file/

Every file in every folder is going to www.site.com/#*$!/path.../file/

Is it possible to do that without redirecting every single URL?

Hope I've been clear, it's not so easy to explain. :-)

Thank you

g1smd

3:21 pm on Sep 21, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Yes, dreadfully easy. One line of code for each redirect set.

Pick any previous thread in this forum that has a "RewriteRule" and "[R=301,L]" in some post and substitute your URL fragments and you're good to go.

One line of code can take care of every file in a folder. You'll need two lines of code.

Post your code here for discussion.

Use example.com to prevent forum auto-linking.

slot

5:07 pm on Sep 21, 2011 (gmt 0)

10+ Year Member



Well, I'm new to this, and I've been looking for days, finding several different codes (I know I have to remove the extension, add the slash and move to the new directory), but I'm gonna try anyway. ;-)

Meanwhile, since there are fewer than 100 files, I could redirect them one by one, but how can I set the home page (www.site.com) to be redirected without that rule also redirecting all the other files?

This wouldn't work, right?

Redirect permanent / [site.com...]
Redirect permanent /old.htm [site.com...]
Redirect permanent /folder/old.htm [site.com...]

Thanks a lot

g1smd

5:48 pm on Sep 21, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You need RewriteRule, not Redirect. Redirect reappends path data. You don't want that.
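
To illustrate the path reappending, here is a minimal sketch of what the first Redirect line from the previous post would likely do. It assumes the target is a URL under /something/ on the same host; the forum truncated the real target, so that path is only a placeholder.

```apache
# mod_alias Redirect matches by URL-path prefix, not by exact URL.
# Whatever follows the matched prefix is appended to the target.
Redirect permanent / http://www.example.com/something/

# Request:  /folder/old.htm
# Sent to:  http://www.example.com/something/folder/old.htm
# The new URL also begins with "/", so it matches again and the redirect loops.
```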

Your RegEx pattern will be
^folder/([^/.]+)\.htm
without end anchor (as it will then also redirect requests for .html too).

The rule target will then be
http://www.example.com/something/folder/$1/
reusing the backreference created from the RegEx pattern.

Finish with the
[R=301,L]
flags.

Your original example had three levels of folder, the next post only two. You need to be EXACT about the format.

You'll need a rule for three levels, a rule for two levels, and a rule for root.
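
As a rough sketch only, the three rules could look something like this in the root .htaccess. The "something" directory is the placeholder used above. The RewriteCond guard is an assumption added here, not part of the advice above: it stops the rules from firing on requests that are already inside the new directory. Index files are not handled and would need their own rule listed before these.

```apache
RewriteEngine On

# Two folder levels: /folder1/folder2/file.htm -> /something/folder1/folder2/file/
RewriteCond %{REQUEST_URI} !^/something/
RewriteRule ^([^/.]+)/([^/.]+)/([^/.]+)\.htm http://www.example.com/something/$1/$2/$3/ [R=301,L]

# One folder level: /folder/file.htm -> /something/folder/file/
RewriteCond %{REQUEST_URI} !^/something/
RewriteRule ^([^/.]+)/([^/.]+)\.htm http://www.example.com/something/$1/$2/ [R=301,L]

# Root: /file.htm -> /something/file/
RewriteCond %{REQUEST_URI} !^/something/
RewriteRule ^([^/.]+)\.htm http://www.example.com/something/$1/ [R=301,L]
```

The patterns have no end anchor, so requests ending in .html are redirected as well.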

Use example.com in this forum to prevent URL auto-linking. We need to see the CODE, not a link.

slot

2:04 pm on Sep 25, 2011 (gmt 0)

10+ Year Member



Thanks a lot, I'm reading now from the library, trying to learn it ;-)

Here is a summary of the structure, with every instance:

www.example.com/ index and another single htm file

www.example.com/folder1/ index and htm files, no subfolders

www.example.com/folder2/ index, htm files and subfolders
www.example.com/folder2/sub-a/ only index
www.example.com/folder2/sub-b/ index and htm files
www.example.com/folder2/sub-c/ index, htm files and subfolders
www.example.com/folder2/sub-c/sub-sub-c/ index and htm files

Since I don’t want to put everything on the main root, I decided to put 3 htaccess files: one in root, one in folder1 and one in folder2.

Let’s start with first two, this is what I thought:

Main root:
RewriteEngine on
RewriteRule ^$ http://www.example.com/something/ [R=301,L]
redirect 301 /file.htm http://www.example.com/something/file/

(do I need something here for the index.htm file?)

Folder1:
RewriteEngine on
RewriteRule ^(.+)\.htm http://www.example.com/something/$1/ [R=301,L]

I was not sure whether to use * or + after the dot.
Then I think I should exclude the index.htm file so it isn't redirected to /index/, maybe with ^ but I don't know where to put it. And maybe I should add another RewriteRule ^$.

Thanks for your help, and sorry for using site dot com before

g1smd

4:15 pm on Sep 25, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Do not use Redirect or RedirectMatch at all.

Use RewriteRule for all of your rules.

You need to be aware that the root .htaccess file is processed first, and any URL that matches the pattern will trigger the rules there.

This means that if you decide to have some rules in htaccess files in folders the rules in the root htaccess file will need to be "aware" of the folder URLs and not process those URL requests.

In most cases it is much easier to simply put all the rules in one htaccess file in the root. Order those rules from most specific (affects the fewest URLs) to most general (affects the most URLs). The last redirect is usually the standard non-www to www canonicalisation ruleset.

If multiple rules are invoked you will end up with an unwanted redirection chain.

slot

5:09 pm on Sep 25, 2011 (gmt 0)

10+ Year Member



Yes, I know that, but I'd prefer to put the rules in subfolders in case I want to add some .htm files there again in the future.

g1smd

5:33 pm on Sep 25, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Every RewriteRule in the root will need a preceding negative match RewriteCond that excludes requests for folder1 or folder2 from matching any of the rules in the root htaccess file.

This is usually more work than putting all the rules in the root htaccess to begin with.

Remember, the rules are matching "URL requests" so you tailor the patterns in the root htaccess to match each type of requested URL.

Rather than ^pattern in the root and ^pattern in the folder, use ^folder/pattern and ^pattern in the root.

slot

5:48 pm on Sep 25, 2011 (gmt 0)

10+ Year Member



But the rule I posted before for root:

RewriteEngine on
RewriteRule ^$ http://www.example.com/something/ [R=301,L]
redirect 301 /file.htm http://www.example.com/something/file/

wouldn't just affect the home and that single file, letting me free to work on the 2 folders?

slot

6:06 pm on Sep 25, 2011 (gmt 0)

10+ Year Member



Sorry,

RewriteEngine on
RewriteRule ^$ http://www.example.com/something/ [R=301,L]
RewriteRule ^filename.htm$ http://www.example.com/something/newfilename/ [R=301,L]

g1smd

6:07 pm on Sep 25, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Do not use Redirect or RedirectMatch at all. Use RewriteRule for all of your rules.

Those two rules affect only single URLs. The danger is that you make one small change next year and introduce a redirection chain. That hampers spidering, which leads to indexing problems and then erodes your ranking, without giving much of a clue as to what is going on.

Make sure you escape all literal periods in RegEx patterns.

slot

6:53 pm on Sep 25, 2011 (gmt 0)

10+ Year Member



Sorry, I edited it in the following post.

I just thought it was better, because requests for the new destination subdirectory wouldn't have to be tested against those rules every time, and in the future I can always exclude folder1 and folder2 from new rules. Am I wrong?

g1smd

7:56 pm on Sep 25, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If you put the folder rules in the root, a non-match will abort the processing of that rule after one character of the folder name has been compared to the requested URL. That would be quicker than excluding both folders from each rule.

slot

11:36 pm on Sep 25, 2011 (gmt 0)

10+ Year Member



Ok, I'm trying.

RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !^[.*]index\.htm
RewriteRule ^(.*)\.htm http://www.example.com/newdir/$1/ [R=301,L]
RewriteRule ^(.*)index\.htm http://www.example.com/newdir/$1 [R=301,L]

Am I on the right track?

Thank you again for your patience :-)

g1smd

11:58 pm on Sep 25, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The (.*) pattern says "grab the entire requested URL right to the very end". The parser does that, but then you're telling it there are more characters after the end with "(.*)index". It has to initiate hundreds of "back off and retry" trial matches to find out what you really meant. The pattern fragment (.*) should never be used at the beginning or in the middle of a pattern.

As for your rules, list "most specific" first and "most general" last. This ensures the right rule runs for each specific request.

The pattern ^([^/.]+)\.html is useful to find the filename in the root.

The pattern ^(([^/]+/)*) is useful to find nested folder levels and ^(([^/]+/)*[^/.]+)\.html to find nested folders plus filename at any level (including root).

A RewriteCond applies to the single RewriteRule that follows. If a condition applies to multiple rules, you must replicate that condition for each and every rule that it applies to.

Add a blank line after each RewriteRule for clarity.

lucy24

12:46 am on Sep 26, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



!
That reminds me. Does the htaccess dialect of RegEx recognize the (.*?) or (.+?) formula? That is, "Take as little as you possibly can, keeping one eye open for the next piece of the pattern."

So that if you have {ab}{xyz}{cd}{xyz} and your pattern asks for (.+?){xyz}, you'll end up with
({ab})
rather than the default
({ab}{xyz}{cd})
?

slot

1:38 am on Sep 26, 2011 (gmt 0)

10+ Year Member



I'm getting lost.. :-(

RewriteEngine on

RewriteRule ^index\.htm http://www.example.com/newdir/ [R=301,L]

RewriteRule ^(([^/]+/)+)index\.htm http://www.example.com/newdir/$1/ [R=301,L]

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !^(([^/]+/)*)index\.htm
RewriteRule (([^/]+/)*[^/.]+)\.htm http://www.example.com/newdir/$1/ [R=301,L]

lucy24

2:17 am on Sep 26, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Your first two rules are really the same, and can be collapsed to

RewriteRule ^(([^/.]+/)*)index\.htm http://www.example.com/newdir/$1 [R=301,L]

Leave off the final trailing slash; it's included in the capture and you would otherwise end up with //. Include . in your brackets [^/.] to save a few nanoseconds. At the end of each search, it's the difference between picking up all of "index.htm" before having to backtrack, and stopping after "index".

RewriteCond %{REQUEST_URI} !^(([^/]+/)*)index\.htm

Here you don't need the beginning anchor with following stuff, since it's optional (the * says so) and you're not capturing it. All you're looking for is

RewriteCond %{REQUEST_URI} !index\.htm

which you don't need anyway, because the previous Rule picked up all requests that are for index.htm. So you're left with

RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule (([^/]+/)*[^/.]+)\.htm http://www.example.com/newdir/$1/ [R=301,L]

Are you sure? It says "take any request ending in .htm that isn't a valid filename, and redirect to the folder of the same name". How do you know that the named folder will exist?
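
Pulling the pieces of this post together, one possible shape for the whole ruleset is sketched below. Two things here are assumptions rather than anything settled in this thread: the !-f condition is left out, on the guess that the old .htm files should redirect whether or not they still exist on disk, and a guard on /newdir/ is added so that requests already inside the new directory are never redirected again.

```apache
RewriteEngine On

# Most specific first: index.htm at any depth goes to its folder URL.
# $1 already ends in a slash (or is empty for the root), so none is added.
RewriteCond %{REQUEST_URI} !^/newdir/
RewriteRule ^(([^/.]+/)*)index\.htm http://www.example.com/newdir/$1 [R=301,L]

# More general: any other .htm file at any depth becomes a folder.
# A RewriteCond covers only the next rule, so the guard is repeated.
RewriteCond %{REQUEST_URI} !^/newdir/
RewriteRule ^(([^/.]+/)*[^/.]+)\.htm http://www.example.com/newdir/$1/ [R=301,L]
```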

Take a break, anyway. g1 is in a different time zone and has a day job, so he won't be around for at least a few hours ;)

slot

2:43 am on Sep 26, 2011 (gmt 0)

10+ Year Member



Thanks lucy, yeah, I'm definitely taking a break. :-)
With the final cond, I wanted (only wanted it seems) to exclude index files.

By the way, why is it better to use RewriteRule instead of Redirect 301? (Since I don't have hundreds of files, it would be quicker for me to set a redirect for each one while I study rewrite rules.)

lucy24

3:57 am on Sep 26, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



With the final cond, I wanted (only wanted it seems) to exclude index files.

Ooh, right, you don't know whether the previously redirected files will be named index.htm when they pass through .htaccess again. It may depend on your server. I should know this, because I have learned from direct personal experience that mine appends the "index.html"-- if there is one-- to directory names at some earlier stage. You're right. Keep the index.htm

why is it better to use rewriterule instead of Redirect 301?

Because g1smd says so. Want to make something out of it? :)

Apache themselves have a list of times when you would not use mod_rewrite

:: shuffling papers ::

[httpd.apache.org...]

They list four situations, but the other three only apply at levels above .htaccess. So the one you can fight about is whether to say Redirect(Match) or do it all via mod_rewrite.

When a request moves through htaccess, it hits each module separately, no matter what order you've written your .htaccess in. They're done in the reverse order of installation, which is out of your control unless you have your own server. Generally this means reverse alphabetical order.

You can do some simple experiments to make sure. On mine, mod_setenvif runs before mod_rewrite which in turn runs before mod_alias. If you're on a truly antiquated setup, mod_access will probably run last of all-- which is handy, since its functionality has now been incorporated into the core.

You know how you're always being told to put the specific directives before the more general ones? If you use Redirect by that name in addition to Rewrite, you're in mod_alias. Since this normally runs after mod_rewrite, you're then going back to specific directives after you've done the general ones.

Sometimes it can work. I have a cluster of redirects that apply only to files within a particular directory. Unique names, no chance of ambiguity. To reduce chaos, I gave the directory its own htaccess rather than make all other incoming requests plow through rules that will never apply to them. For this htaccess, everything is done with Redirect or RedirectMatch. It saves time, and there aren't any before-or-after issues because the whole htaccess will run separately, after the top-level one.

Or rather: It saves time up until the moment I forget that plain Redirect, unlike Rewrite, does not use Regular Expressions, and the server flies into a rage looking for files with literal backslashes \ in their names.

g1smd

8:16 am on Sep 26, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If you have an external redirect and an internal rewrite in your .htaccess file, you should use RewriteRule for both rules and list the external redirect before the internal rewrite.

If you list them in the reverse order you will expose the previously rewritten internal path back out on to the web as a new URL.

If you use Redirect instead of RewriteRule for the redirect, and mod_alias runs after mod_rewrite this will also expose the previously rewritten internal path back out on to the web as a new URL.

Example:
1. RewriteRule to redirect non-www URL requests to www.
2. RewriteRule to internally rewrite www.example.com/11/22 to /index.php?a=11&b=22.

If you were to reverse the rule order, and you request the non-www URL example.com/11/22, the rewrite rule internally rewrites /11/22 to /index.php?a=11&b=22 and then the redirect redirects to www.example.com/index.php?a=11&b=22.

What you actually wanted was to redirect example.com/11/22 to www.example.com/11/22 and for the rewrite to silently fetch the content from /index.php?a=11&b=22 without revealing what that internal location was.

This is why the rule order when using RewriteRule is important.

Similar problems occur if you use both Redirect and RewriteRule and Redirect is run after RewriteRule. You have no control over this. Apache runs each module (mod_rewrite, mod_alias, mod_auth, etc) for each incoming request in an order specified deep in the Apache core configuration.

Use RewriteRule for all of your rules.
List rules that redirect before rules that rewrite.
List redirects in order from most specific to most general.
List rewrites in order from most specific to most general.

If you have any RewriteRules that deny access, list those first as there is no point redirecting a request only to then block it later on.
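
A minimal sketch of that ordering, using the non-www and /11/22 example from above. The "BadBot" user-agent string and the digits-only pattern are illustrative assumptions, not rules taken from a real site.

```apache
RewriteEngine On

# 1. Access blocks first (hypothetical user-agent, for illustration only).
RewriteCond %{HTTP_USER_AGENT} BadBot [NC]
RewriteRule ^ - [F]

# 2. External redirects next: non-www to www.
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

# 3. Internal rewrites last: /11/22 is served by /index.php?a=11&b=22.
RewriteRule ^([0-9]+)/([0-9]+)$ /index.php?a=$1&b=$2 [L]
```

With this order, a request for example.com/11/22 is redirected to www.example.com/11/22 before the rewrite ever runs, so the index.php path stays internal.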

g1smd

8:51 am on Sep 26, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



To reduce chaos, I gave the directory its own htaccess rather than make all other incoming requests plow through rules that will never apply to them. For this htaccess, everything is done with Redirect or RedirectMatch. It saves time, and there aren't any before-or-after issues because the whole htaccess will run separately, after the top-level one.

The incoming URL request will be serviced by one Apache module after another. If mod_rewrite runs before mod_alias you have not avoided the problem. mod_rewrite will service all of the .htaccess files, root first, then deeper folders, and mod_alias will then come along and do the same. If there is a mod_rewrite rule in the root htaccess that matches the current request then you're sunk. Using only one .htaccess file and sticking only to mod_rewrite is the only way to be absolutely sure there will be no problems.

lucy24

9:59 am on Sep 26, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Ouch. I didn't realize that. Can I fall back on "to reduce my own personal confusion"? ;)

Currently I've only got one true rewrite (as opposed to redirect via mod_rewrite), and that's for hotlinking. It only recently occurred to me to wonder why it doesn't create an infinite loop, since it rewrites all gif/jpg/png requests to a png file. I can only assume that somewhere along the line it picks up either my domain or "nothing" as referer, allowing it to bypass the rule the next time around.

I also started wondering why everyone is able to see robots.txt (via <Files> ) even if they've already been ordered to drop dead [F]. Possibly this small group of robots has never asked to see robots.txt, so the situation doesn't arise. (There aren't many. Most people are locked out with core-level Deny from.)

It may or may not be better not to wonder about these things.

Oh, and I do get the 301-plus-403 package. But only because I've allowed my host to do the with-www redirecting, so it happens before they reach my htaccess. One less thing to think about.