mod_rewrite & subdirectories

         

compooter

8:26 am on Jan 29, 2005 (gmt 0)

10+ Year Member



I am working with a large website, ~900 pages. The root-level static pages were used as the templates for the search-engine-friendly dynamic pages (which exist in pseudo-directories), and inadvertently the main navigation was left relative to the page rather than to the root. The result is that all the core navigation pages (glossary.html, resources.html, etc.) were linked from a gajillion different "directories", and consequently search engines indexed a ton of non-existent files.

Like this:

/index.html
/glossary.html
/resources.html
/archives.html
/dynamic/page/file-links-to-pages-below
/dynamic/page/glossary.html
/dynamic/page/resources.html
/dynamic/page/archives.html

So I know basically what I want to do: set rules for each root-level static page that if the filename is one of those files and not requested from root, then permanently redirect to the /fromroot.html version. I've also considered checking to see if the request actually exists as a file, and if not, redirect -- but that seemed like it might have unforeseen consequences.

Make sense? I've tried a few regular expressions unsuccessfully. I know what I'll be trying is something like this:

RewriteRule ^([a-zA-Z0-9-_ ]+)/glossary.html$ /glossary.html [R=301,L]

One for each page, 5 total. That was my last wild guess at a solution. Am I on the right track? Basically it just needs to redirect any request for that file that is not at the root back to the root.

johnt

12:29 pm on Jan 29, 2005 (gmt 0)

10+ Year Member



I think that the code below should do what you're looking for.

RewriteCond %{REQUEST_URI} !^/glossary\.html$
RewriteRule (.*)/glossary\.html$ /glossary.html [R=301,L]

I think that should catch any number of subdirectories.

Hope it helps

John

jdMorgan

4:41 pm on Jan 29, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Seems to me that your code should have worked if placed in .htaccess. It's just a little more complex than actually required.

For .htaccess, I'd recommend:


Options +FollowSymLinks
RewriteEngine on
RewriteRule ^.+/glossary\.html$ /glossary.html [R=301,L]

For httpd.conf:

Options +FollowSymLinks
RewriteEngine on
RewriteRule ^/.+/glossary\.html$ /glossary.html [R=301,L]

However, you'll need to correct those relative links if they are still there. Otherwise the search engines will find them every time they spider, and will have to follow the 301s to sort them out every time.

Flush your browser cache before testing any change to your access-control code.
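
If the same redirect is wanted for all of the root-level pages, the per-page rules could probably be folded into one by using an alternation. A sketch for .htaccess, assuming only the page names mentioned in this thread (glossary, resources, archives); extend the list with your other pages, and treat it as untested:

```apache
# Sketch for a root-level .htaccess; the names in the alternation are the
# pages mentioned in this thread, so add your remaining pages to the list.
Options +FollowSymLinks
RewriteEngine on
# One or more leading path segments, then one of the root-level page names.
# $1 holds the matched name, so each request goes to its own root-level copy.
RewriteRule ^.+/(glossary|resources|archives)\.html$ /$1.html [R=301,L]
```

In httpd.conf the pattern would need the leading slash, i.e. ^/.+/(glossary|resources|archives)\.html$, as in the single-page version.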

Jim

compooter

5:32 pm on Jan 29, 2005 (gmt 0)

10+ Year Member



Very nice! Thanks to you both; that worked out the rest of the kinks. And yes, the links have been switched. Time to whip those search engines into shape.

compooter

7:21 pm on Jan 31, 2005 (gmt 0)

10+ Year Member



Ok, one more question. Say I have about 25-30 static pages. I'd like to not have to write 25-30 rules. I know that all pages that are legitimate and in a subdirectory have one of three filenames. Would it be possible to write the converse of this rule that would say "If the file is in any subdirectory, isn't -f or -d, and doesn't have the filename a.html, b.html, or c.html, redirect to the same filename relative to root"?

Just a stab in the dark: (untested)


RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !(a|b|c)\.html
RewriteRule ^(.*)$ /$1 [R=301,L]

jdMorgan

9:52 pm on Jan 31, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Sure, try it and experiment with it... much faster in most cases than waiting for a reply here!

Actually I'm not exactly sure whether you are proposing that code as a solution, or whether you're looking for the opposite of that code... English conversational usage conventions often conflict with those of logic.
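
For what it's worth, if the goal is the rule as you described it in words, a sketch along these lines might be a starting point. It is untested and assumes a root-level .htaccess, that a, b, and c stand for your three legitimate filenames, and that only .html requests should be redirected:

```apache
# Untested sketch for a root-level .htaccess. The names a, b, and c are the
# placeholders from the question, not real filenames.
Options +FollowSymLinks
RewriteEngine on
# Leave requests alone if they map to an existing file or directory.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Leave the three legitimate subdirectory filenames alone.
RewriteCond %{REQUEST_URI} !/(a|b|c)\.html$
# Require at least one directory level, capture only the final path segment,
# and redirect to that filename at the root.
RewriteRule ^.+/([^/]+\.html)$ /$1 [R=301,L]
```

Note that the pattern captures just the final path segment. Capturing the whole path with ^(.*)$ and redirecting to /$1 would send each request straight back to its own URL. Where these rules sit relative to the rules that serve the dynamic pages will likely matter as well.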

Jim