Rewrite directory and some files

         

wilderness

6:46 pm on Aug 22, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I realize it's absolutely PITIFUL (!) that, after more than a decade of dealing with htaccess, I still don't grasp rewrites.

I have multiple former directories that have been eliminated. A few pages from some of those directories were retained, but moved UP one folder in the path.

The following directory contained 31 files. 29 have been eliminated.

What I want is a redirect for the two retained pages (to the new path) and a 410 [G] for the other 29.

Is this correct?

RewriteCond %{REQUEST_URI} ^MySub/MySubSub2/SubSubSub3/SubSubSubSub4/PageOne.html [OR]
RewriteCond %{REQUEST_URI} ^MySub/MySubSub2/SubSubSub3/SubSubSubSub4/PageTwo.html
RewriteRule ^(.*)$ http://example.com/MySub/MySubSub2/SubSubSub3/$1 [R=301,L]

RewriteCond %{REQUEST_URI} ^/MySub/MySubSub/SubSubSub/
RewriteCond %{REQUEST_URI} !^MySub/MySubSub/SubSubSub/PageOne.html
RewriteCond %{REQUEST_URI} !^MySub/MySubSub/SubSubSub/PageTwo.html
RewriteRule - [G]

lucy24

7:55 pm on Aug 22, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Are "MySub" "MySubSub2" etcetera actually the same? Then you can collapse the conditions:

MySub/MySubSub2/SubSubSub3/SubSubSubSub4/(PageOne|PageTwo)\.html

You, ahem, know perfectly well you're supposed to escape the literal period ;) though in this situation it's not a critical error.

But it doesn't need to be a condition at all. Just say

RewriteRule ^sub1/sub2/sub3/(page1|page2)\.html http://example.com et cetera 


You're also correct that in this specific case, the [R] rule needs to come before the [G] rule -- opposite the usual "list in order of severity" principle -- so you can intercept the pages that are to be redirected. But since they have already been intercepted, you don't need the !Pageone etc. exceptions in the [G] rule. Those requests will never get as far as this rule.

You forgot one part of the second rule (I suspect a cut-and-paste error when posting, but I'll spell it out for future reference). It goes

RewriteRule ^dir/subdir/subdir2/ - [G]


This rule, too, does not need any Conditions. As a general principle, never put anything in a Condition that can go in the body of the rule. This applies most often to positive (i.e. no leading !) %{REQUEST_URI} statements.
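
Putting the advice above together, the pair of rules might look like the sketch below. It assumes the .htaccess file sits in the site root, that mod_rewrite is enabled, and that the directory in both rules is the same one (the original post left that unclear). One extra point worth knowing: %{REQUEST_URI} always begins with a slash, so a condition written as ^MySub/... can never match, whereas a RewriteRule pattern in .htaccess sees the path without the leading slash.

```apache
RewriteEngine On

# 1. Redirect the two retained pages up one folder.
#    This rule comes first so these requests never reach the [G] rule.
RewriteRule ^MySub/MySubSub2/SubSubSub3/SubSubSubSub4/(PageOne|PageTwo)\.html http://example.com/MySub/MySubSub2/SubSubSub3/$1.html [R=301,L]

# 2. Anything else requested under the old directory is Gone (410).
#    "-" means "no substitution"; no RewriteCond is needed.
RewriteRule ^MySub/MySubSub2/SubSubSub3/SubSubSubSub4/ - [G]
```

The order matters here: swapping the two rules would send a 410 for PageOne and PageTwo as well, because the directory pattern matches them too.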

wilderness

8:04 pm on Aug 22, 2014 (gmt 0)

Many thanks lucy.

I'm still puzzled in my comprehension of the anchor to catch the trailing name.

RewriteRule ^sub1/sub2/sub3/(page1|page2)\.html http://example.com et cetera


page1|page2 are two different pages, so they need to redirect to two different pages.

wilderness

8:37 pm on Aug 22, 2014 (gmt 0)

Many thanks.

Had a simple syntax error that was giving me fits, however got it working.

wilderness

9:06 pm on Aug 22, 2014 (gmt 0)

lucy,
This part fails: it doesn't pick up the requested file name and extension and add them to the end of the URL.

RewriteRule ^sub1/sub2/sub3/(page1|page2)\.html http://example.com et cetera


This portion is resolved by simply separating the lines into two, however I've quite a few similar 404's to correct and was hoping to find a way to grab the requested file name and extension with an anchor.

Does it require a ? to catch the string?

lucy24

1:00 am on Aug 23, 2014 (gmt 0)

I hope you are not using the literal text "et cetera" in your rule :) I just put that in because I couldn't see your original post while I was typing.

RewriteRule ^MySub/MySubSub2/SubSubSub3/SubSubSubSub4/(PageOne|PageTwo)\.html http://example.com/MySub/MySubSub2/SubSubSub3/$1.html [R=301,L]


In fact looking at it again, I'm not totally clear on how much of the request you want to capture for reuse. But only the page name varies, right? There's no point in capturing and reusing something that will always be the same; just write it out in full.

Since the whole thing ends in ".html", you may as well leave off the closing anchor. That way, if any extraneous ### sneaks into the path, you can make it go away at the same time, without extra work for the server. But definitely keep the opening anchor.
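
To make the capture concrete, here is a hedged walk-through of the rule above, again assuming the .htaccess file is in the site root. Whatever the parentheses in the pattern match is stored, and $1 in the target pastes it back in. The second rule is a hypothetical alternative (the [^/]+ pattern is not from this thread) that captures the file name and extension together, for a folder where every page moved up a level. Use one rule or the other, not both.

```apache
# Request: /MySub/MySubSub2/SubSubSub3/SubSubSubSub4/PageTwo.html
# The group (PageOne|PageTwo) matches "PageTwo", so $1 = "PageTwo"
# and the target ends in /SubSubSub3/PageTwo.html
RewriteRule ^MySub/MySubSub2/SubSubSub3/SubSubSubSub4/(PageOne|PageTwo)\.html http://example.com/MySub/MySubSub2/SubSubSub3/$1.html [R=301,L]

# Hypothetical alternative: capture name and extension in one group.
# [^/]+ means "one or more characters that are not a slash",
# so here $1 = "PageTwo.html" and the target does not add ".html".
RewriteRule ^MySub/MySubSub2/SubSubSub3/SubSubSubSub4/([^/]+\.html)$ http://example.com/MySub/MySubSub2/SubSubSub3/$1 [R=301,L]
```

Note that the alternative redirects every .html request in that folder, so it only suits directories where no page should return 410.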

wilderness

1:15 am on Aug 23, 2014 (gmt 0)

lucy,
My apologies for making this so complicated.

I'm not comprehending the capture part at all, and I've hordes of other pages and directories that I intended to rewrite.

The bots have been eating 404's for nearly five years, and they still request the pages.

Due to my lack of comprehension of capture, the 404's will simply continue.

Thanks again.