how to preserve the directory path

         

gingo

7:31 pm on Jan 22, 2008 (gmt 0)

10+ Year Member



Hi,

I have to rename all the files in a directory and its subdirectories as follows:

cont_page.html -> page.html

In short, I have to remove the string "cont_" from each file name.

In order to preserve PR, links, and so on, I'm doing a 301 redirect. The problem is that I cannot redirect to the right directory.

Let's call "mydir" the directory where all the files to rename are stored. I put the following htaccess in /mydir:

-----------
RewriteEngine on
RewriteRule ^cont_(.*).html$ $1.html [R=301,L]
-----------

It works fine if the file is in /mydir, but not for files in its subdirectories, such as

/mydir/sub1
/mydir/sub2

If I test the rewrite on a URL under /mydir/sub1, the rewrite engine redirects to /mydir and not to /mydir/sub1. For example,

www.mytesstssite.com/mydir/sub1/cont_page.html

is rewritten to

www.mytesstssite.com/mydir/page.html

and not to www.mytesstssite.com/mydir/sub1/page.html

I have tried everything, but with no result.
It only works if I set RewriteBase /mydir/sub1/, but that is not a solution, because then what about /mydir/sub2?

Any suggestions? Thank you.

jdMorgan

10:18 pm on Jan 22, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Create and use another back-reference to the (optional) directory-path:

RewriteRule ^(([^/]+/)*)cont_([^.]+)\.html$ http://www.example.com/$1$3.html [R=301,L]

Note that I tweaked your original ".*" pattern. Don't use ".*" unless you have no other choice; it is an ambiguous, promiscuous, greedy and inefficient pattern, and often causes unexpected problems. The two new subpatterns read "Match everything up until you find a slash (and as many of those sequences as you like, including zero)," and "Match everything until you find a period."

Two levels of parentheses are needed for the first subpattern, otherwise $1 would contain only the last directory-path-part matched.

To determine back-reference numbers, count left parentheses.
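
As a rough illustration of the numbering, here is the same pattern run through Python's re module. Python is only a stand-in for mod_rewrite's regex engine (an assumption of this sketch), and the helper name back_references is made up for the example. It shows why the substitution uses $1 and $3 and skips $2: the inner group only holds the last directory part matched.

```python
import re

# Same pattern as in the RewriteRule; Python's `re` stands in for mod_rewrite here.
PATTERN = re.compile(r"^(([^/]+/)*)cont_([^.]+)\.html$")

def back_references(path):
    """Return the groups that correspond to $1, $2, $3, or None if there is no match."""
    m = PATTERN.match(path)
    if m is None:
        return None
    return m.group(1), m.group(2), m.group(3)

# Group 1 holds the whole directory path, group 2 only its last part.
print(back_references("sub1/sub2/cont_page.html"))  # ('sub1/sub2/', 'sub2/', 'page')

# With no directory part, group 1 is empty and group 2 never takes part in the match.
print(back_references("cont_page.html"))            # ('', None, 'page')
```

Python reports a group that never matched as None; in a RewriteRule substitution such a back-reference would simply expand to nothing.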

Jim

gingo

11:03 pm on Jan 22, 2008 (gmt 0)

10+ Year Member



Thank you for your help, but it does not seem to work.

When I type this address in the browser:

www.example.com/mydir/cont_page.html

I get redirected to

www.example.com/page.html

which gives a 404 error.

The correct target would have been: www.example.com/mydir/page.html

I do not fully understand the syntax $1$3 (why is $2 missing?)

jdMorgan

12:20 am on Jan 23, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Completely flush your browser cache and re-test.

Jim

gingo

7:40 am on Jan 23, 2008 (gmt 0)

10+ Year Member



Sorry, I still get the same 404 error.
I cleared the cache, and I tried different browsers too (IE, Firefox, Opera).

gingo

8:44 am on Jan 23, 2008 (gmt 0)

10+ Year Member



I ran a new test with your rewrite rule and noticed that the rule redirects the URL one level up, not to the same level as I am trying to do.

if I type in the browser

www.example.com/mydir/dir1/cont_page.html

I am redirected to

www.example.com/mydir/page.html

(i.e. "cont_" is removed, but I am one level up)

what I would like to get is

www.example.com/mydir/dir1/page.html

Again if I type:
www.example.com/mydir/dir1/dir2/cont_page.html

I get:
www.example.com/mydir/dir1/page.html
i.e. always one level up, never at the same level.

even if I do

RewriteRule ^(.*)cont_(.*).html$ http://www.example.com/$1$2.html [R=301,L]

I get the same result as with your rule, i.e. I am always redirected one level up, never to the same level...

(I know you said not to use (.*), but I just wanted to try other ways)

jdMorgan

1:57 pm on Jan 23, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Where did you install the code?

It has to be installed in the top-level .htaccess file, otherwise it won't 'see' any directory levels above the directory where it is installed.

Alternatively, your server may be configured in such a way that the top-level path info is not accessible. In these cases, the RewriteBase directive is often used, but I don't think it will help in this case.

Jim

gingo

2:57 pm on Jan 23, 2008 (gmt 0)

10+ Year Member



I have installed the htaccess file in the directory mydir, not at the top level, because I want the redirect to take place only for mydir and its subdirectories, not for my whole website.

If I put the code at the top level, then the redirect will be active for all pages.

Would it be possible to set a condition so that the redirect is executed only in /mydir and its subdirectories?

Or, alternatively, would it be possible to set a condition so that the redirect is executed only if the file cont_anypage.html is missing?

Thanks

jdMorgan

3:02 pm on Jan 23, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



In .htaccess, the path to the current directory is 'stripped off' from the URL 'seen' by mod_rewrite. So if the code is in /mydir, and you want to redirect to /mydir, then that path-part must appear in the substitution URL:

RewriteRule ^(([^/]+/)*)cont_([^.]+)\.html$ http://www.example.com/mydir/$1$3.html [R=301,L]
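
The prefix stripping can be modelled roughly like this. It is a simplified Python sketch, not how Apache is implemented: the function redirect_target and its parameters are hypothetical names for the illustration, and Python's re module again stands in for mod_rewrite's regex engine.

```python
import re

RULE = re.compile(r"^(([^/]+/)*)cont_([^.]+)\.html$")

def redirect_target(url_path, htaccess_dir, substitution_prefix):
    """Model a per-directory rule: strip the .htaccess directory, match, rebuild the target."""
    d = htaccess_dir.strip("/")
    prefix = f"/{d}/" if d else "/"
    if not url_path.startswith(prefix):
        return None  # a .htaccess file in htaccess_dir never handles this request
    local_path = url_path[len(prefix):]  # the part the RewriteRule pattern is tested against
    m = RULE.match(local_path)
    if m is None:
        return None
    return substitution_prefix + m.group(1) + m.group(3) + ".html"

# Without "mydir/" in the substitution, the stripped prefix is lost:
print(redirect_target("/mydir/cont_page.html", "mydir", "http://www.example.com/"))
# http://www.example.com/page.html

# With "mydir/" put back in the substitution, subdirectories are preserved:
print(redirect_target("/mydir/dir1/cont_page.html", "mydir", "http://www.example.com/mydir/"))
# http://www.example.com/mydir/dir1/page.html
```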

Jim

gingo

3:20 pm on Jan 23, 2008 (gmt 0)

10+ Year Member



Wow! It works! Thank you so much!

This is what I was missing: "In .htaccess, the path to the current directory is 'stripped off' from the URL 'seen' by mod_rewrite."
I have just learned an important point about mod_rewrite.

Thank you again.

Just one last question, a bit different; I do not know if I should open a new thread.

Would it be possible to apply the URL rewrite only to non-existing files?
I am renaming all my files from "cont_page.html" to "page.html".
So at the moment I have a mix of pages with and without the prefix "cont_".
I would like to apply the rewrite only to the missing files (i.e. the ones already renamed).

I'm just studying RewriteCond and think I should use something like:

RewriteCond %{REQUEST_URI} "is missing"

but I do not know how to write the condition "is missing".

Any suggestions?

jdMorgan

4:46 pm on Jan 23, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I would strongly recommend reviewing the mod_rewrite documentation [httpd.apache.org] and the URL rewriting guide on the Apache Web site. See specifically the RewriteCond directive, and note that URLs always "exist" -- You can type anything you like into your browser, after all. So you want to check %{REQUEST_FILENAME} to be sure that the requested URL resolves to an existing file (or that it does not resolve to an existing file, depending on how you want to code it).

Jim

gingo

5:21 pm on Jan 23, 2008 (gmt 0)

10+ Year Member



Yes, I was just reading the mod_rewrite documentation you linked when you replied. I noticed my mistake and wanted to edit my previous message to use %{REQUEST_FILENAME}, but then I found your reply :-)
Thank you for the help.

As I understand it, I have to use the condition:

RewriteCond %{REQUEST_FILENAME} !-f
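
The logic of that condition, sketched in Python for illustration only: the helper should_redirect and the file names are hypothetical, and os.path.isfile is used as an approximation of the -f test. The redirect should fire only when the requested "cont_" file is no longer on disk.

```python
import os
import tempfile

def should_redirect(request_filename):
    """Approximate `RewriteCond %{REQUEST_FILENAME} !-f`: true only when no regular file exists."""
    return not os.path.isfile(request_filename)

with tempfile.TemporaryDirectory() as docroot:
    old_page = os.path.join(docroot, "cont_old.html")  # not renamed yet, still on disk
    open(old_page, "w").close()
    renamed = os.path.join(docroot, "cont_page.html")  # already renamed, so this path is gone

    print(should_redirect(old_page))  # False: the file exists, serve it as usual
    print(should_redirect(renamed))   # True: the file is missing, so redirect
```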