Mod rewrite: subdomains and various issues (duplicate content, etc)

         

wesleyh

8:41 pm on Feb 29, 2008 (gmt 0)



Currently I have the following .htaccess (Apache 1.3):


Options +FollowSymlinks
RewriteEngine on

RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

RewriteCond %{HTTP_HOST} ^www\.example\.com [NC]
RewriteRule ^([^\.]*)$ /index.php?arguments=$1 [QSA,L]

RewriteCond %{HTTP_HOST} (.*)\.example\.com [NC]
RewriteRule ^([^\.]*)$ /front/index.php?subdomain=%1&arguments=$1 [QSA,L]

RewriteCond %{HTTP_HOST} (.*)\.example\.com/robots\.txt [NC]
RewriteRule ^(.*)$ /front/index.php?subdomain=%1&arguments=robots [QSA,L]

The first rule simply redirects example.com to www.example.com.

The second rule handles all rewrites for www.example.com, such as www.example.com/sample/url/ or www.example.com/go/to/.

The third rule handles subdomains in a similar fashion. This works because rule processing for www.example.com stops after the second rule, so only subdomain requests go further.

As you can see, the patterns in these two rules don't accept ".", so they match only paths and never files. That is how my CMS works: /module/content/ or /module/action/what/, etc.

When I change the rewrite rule to:


RewriteCond %{HTTP_HOST} (.*)\.example\.com [NC]
RewriteRule ^(.*)$ /front/index.php?subdomain=%1&arguments=$1 [QSA,L]

I immediately get an error saying that too many redirects have happened. Why is that?

Anyway, on to my other questions.

The fourth rewrite rule doesn't work at all; it never fires.

When I access sub.example.com/robots.txt, it simply displays the robots.txt from www.example.com.

Ideally, files would be accessible only from www.example.com, not from subdomains. Is that possible?

I also want sub.example.com/robots.txt to go to the URL I specified in the fourth rule.

When I go to bla.example.com/robots.txt, it shows the robots.txt from www.example.com/robots.txt. If I remove robots.txt from the document root, I get a 404 error instead.

Here are my rewrite logs:

For test.example.com/path


127.0.0.1 - - [29/Feb/2008:09:08:33 +0100] [test.example.com/sid#807df8][rid#8f4e50/initial] (3) [per-dir /Applications/MAMP/htdocs/] strip per-dir prefix: /Applications/MAMP/htdocs/path -> path
127.0.0.1 - - [29/Feb/2008:09:08:33 +0100] [test.example.com/sid#807df8][rid#8f4e50/initial] (3) [per-dir /Applications/MAMP/htdocs/] applying pattern '^(.*)$' to uri 'path'
127.0.0.1 - - [29/Feb/2008:09:08:33 +0100] [test.example.com/sid#807df8][rid#8f4e50/initial] (3) [per-dir /Applications/MAMP/htdocs/] strip per-dir prefix: /Applications/MAMP/htdocs/path -> path
127.0.0.1 - - [29/Feb/2008:09:08:33 +0100] [test.example.com/sid#807df8][rid#8f4e50/initial] (3) [per-dir /Applications/MAMP/htdocs/] applying pattern '^([^\.]*)$' to uri 'path'
127.0.0.1 - - [29/Feb/2008:09:08:33 +0100] [test.example.com/sid#807df8][rid#8f4e50/initial] (3) [per-dir /Applications/MAMP/htdocs/] strip per-dir prefix: /Applications/MAMP/htdocs/path -> path
127.0.0.1 - - [29/Feb/2008:09:08:33 +0100] [test.example.com/sid#807df8][rid#8f4e50/initial] (3) [per-dir /Applications/MAMP/htdocs/] applying pattern '^(.*)$' to uri 'path'
127.0.0.1 - - [29/Feb/2008:09:08:33 +0100] [test.example.com/sid#807df8][rid#8f4e50/initial] (3) [per-dir /Applications/MAMP/htdocs/] strip per-dir prefix: /Applications/MAMP/htdocs/path -> path
127.0.0.1 - - [29/Feb/2008:09:08:33 +0100] [test.example.com/sid#807df8][rid#8f4e50/initial] (3) [per-dir /Applications/MAMP/htdocs/] applying pattern '^([^\.]*)$' to uri 'path'
127.0.0.1 - - [29/Feb/2008:09:08:33 +0100] [test.example.com/sid#807df8][rid#8f4e50/initial] (2) [per-dir /Applications/MAMP/htdocs/] rewrite path -> /front/index.php?subdomain=test&arguments=path
127.0.0.1 - - [29/Feb/2008:09:08:33 +0100] [test.example.com/sid#807df8][rid#8f4e50/initial] (3) split uri=/front/index.php?subdomain=test&arguments=path -> uri=/front/index.php, args=subdomain=test&arguments=path
127.0.0.1 - - [29/Feb/2008:09:08:33 +0100] [test.example.com/sid#807df8][rid#8f4e50/initial] (1) [per-dir /Applications/MAMP/htdocs/] internal redirect with /front/index.php [INTERNAL REDIRECT]
127.0.0.1 - - [29/Feb/2008:09:08:33 +0100] [test.example.com/sid#807df8][rid#8e5b60/initial/redir#1] (3) [per-dir /Applications/MAMP/htdocs/] strip per-dir prefix: /Applications/MAMP/htdocs/front/index.php -> front/index.php
127.0.0.1 - - [29/Feb/2008:09:08:33 +0100] [test.example.com/sid#807df8][rid#8e5b60/initial/redir#1] (3) [per-dir /Applications/MAMP/htdocs/] applying pattern '^(.*)$' to uri 'front/index.php'
127.0.0.1 - - [29/Feb/2008:09:08:33 +0100] [test.example.com/sid#807df8][rid#8e5b60/initial/redir#1] (3) [per-dir /Applications/MAMP/htdocs/] strip per-dir prefix: /Applications/MAMP/htdocs/front/index.php -> front/index.php
127.0.0.1 - - [29/Feb/2008:09:08:33 +0100] [test.example.com/sid#807df8][rid#8e5b60/initial/redir#1] (3) [per-dir /Applications/MAMP/htdocs/] applying pattern '^([^\.]*)$' to uri 'front/index.php'
127.0.0.1 - - [29/Feb/2008:09:08:33 +0100] [test.example.com/sid#807df8][rid#8e5b60/initial/redir#1] (3) [per-dir /Applications/MAMP/htdocs/] strip per-dir prefix: /Applications/MAMP/htdocs/front/index.php -> front/index.php
127.0.0.1 - - [29/Feb/2008:09:08:33 +0100] [test.example.com/sid#807df8][rid#8e5b60/initial/redir#1] (3) [per-dir /Applications/MAMP/htdocs/] applying pattern '^(.*)$' to uri 'front/index.php'
127.0.0.1 - - [29/Feb/2008:09:08:33 +0100] [test.example.com/sid#807df8][rid#8e5b60/initial/redir#1] (3) [per-dir /Applications/MAMP/htdocs/] strip per-dir prefix: /Applications/MAMP/htdocs/front/index.php -> front/index.php
127.0.0.1 - - [29/Feb/2008:09:08:33 +0100] [test.example.com/sid#807df8][rid#8e5b60/initial/redir#1] (3) [per-dir /Applications/MAMP/htdocs/] applying pattern '^([^\.]*)$' to uri 'front/index.php'
127.0.0.1 - - [29/Feb/2008:09:08:33 +0100] [test.example.com/sid#807df8][rid#8e5b60/initial/redir#1] (1) [per-dir /Applications/MAMP/htdocs/] pass through /Applications/MAMP/htdocs/front/index.php

Is it normal that this seems to test every RewriteRule pattern? (First four lines == first four rules?) Shouldn't it stop after the second?

Also, once an internal redirect has been done, it applies the rewrite rules again (but passes through)? Is it possible to stop these checks from even occurring?

For test.example.com/robots.txt


127.0.0.1 - - [29/Feb/2008:09:10:32 +0100] [test.example.com/sid#807df8][rid#8f4e50/initial] (3) [per-dir /Applications/MAMP/htdocs/] strip per-dir prefix: /Applications/MAMP/htdocs/robots.txt -> robots.txt
127.0.0.1 - - [29/Feb/2008:09:10:32 +0100] [test.example.com/sid#807df8][rid#8f4e50/initial] (3) [per-dir /Applications/MAMP/htdocs/] applying pattern '^(.*)$' to uri 'robots.txt'
127.0.0.1 - - [29/Feb/2008:09:10:32 +0100] [test.example.com/sid#807df8][rid#8f4e50/initial] (3) [per-dir /Applications/MAMP/htdocs/] strip per-dir prefix: /Applications/MAMP/htdocs/robots.txt -> robots.txt
127.0.0.1 - - [29/Feb/2008:09:10:32 +0100] [test.example.com/sid#807df8][rid#8f4e50/initial] (3) [per-dir /Applications/MAMP/htdocs/] applying pattern '^([^\.]*)$' to uri 'robots.txt'
127.0.0.1 - - [29/Feb/2008:09:10:32 +0100] [test.example.com/sid#807df8][rid#8f4e50/initial] (3) [per-dir /Applications/MAMP/htdocs/] strip per-dir prefix: /Applications/MAMP/htdocs/robots.txt -> robots.txt
127.0.0.1 - - [29/Feb/2008:09:10:32 +0100] [test.example.com/sid#807df8][rid#8f4e50/initial] (3) [per-dir /Applications/MAMP/htdocs/] applying pattern '^(.*)$' to uri 'robots.txt'
127.0.0.1 - - [29/Feb/2008:09:10:32 +0100] [test.example.com/sid#807df8][rid#8f4e50/initial] (3) [per-dir /Applications/MAMP/htdocs/] strip per-dir prefix: /Applications/MAMP/htdocs/robots.txt -> robots.txt
127.0.0.1 - - [29/Feb/2008:09:10:32 +0100] [test.example.com/sid#807df8][rid#8f4e50/initial] (3) [per-dir /Applications/MAMP/htdocs/] applying pattern '^([^\.]*)$' to uri 'robots.txt'
127.0.0.1 - - [29/Feb/2008:09:10:32 +0100] [test.example.com/sid#807df8][rid#8f4e50/initial] (1) [per-dir /Applications/MAMP/htdocs/] pass through /Applications/MAMP/htdocs/robots.txt

I've switched the third and fourth rewrite rules, but it doesn't seem to have any effect. The request still passes through.

For www.example.com/robots.txt


nothing

Regards,
Wesley

jdMorgan

5:06 am on Mar 1, 2008 (gmt 0)



Let's try to sort out the first question, and then proceed to the others, unless fixing the first problem also changes those later problems:

With:
RewriteCond %{HTTP_HOST} (.*)\.example\.com [NC]
RewriteRule ^(.*)$ /front/index.php?subdomain=%1&arguments=$1 [QSA,L]

"I immediately get an error that too many redirects have happened.. Why is that?"


Because you've told it to rewrite *any* URL... including the new one. So it rewrites once, mod_rewrite processing restarts (as it does in .htaccess whenever a rule has rewritten the URL), and then the rule matches /front/index.php and rewrites it to /front/index.php again, but with arguments now set to "front/index.php".

This process then repeats, merrily rewriting /front/index.php to itself while [QSA] tacks the previous query string onto the new one each time, until the server gives up and issues the "too many internal redirects" or "URL-path too long" error warning.

See if this improves things...


RewriteCond $1 !^front/index\.php$
RewriteCond %{HTTP_HOST} (.*)\.example\.com [NC]
RewriteRule (.*) /front/index.php?subdomain=%1&arguments=$1 [QSA,L]

The above assumes that the code is in example.com/.htaccess and not in a subdirectory.
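
Another way to get the same effect, offered as an untested sketch rather than a drop-in fix, is a pass-through rule placed ahead of the catch-all rules. The "-" substitution leaves the URL unchanged and [L] ends rule processing for that pass, so the already-rewritten /front/index.php request is never tested against the later patterns. The placement described in the comments (after the canonical-host redirect, before the www and subdomain rules) is my assumption about how your file is laid out.

```apache
# Sketch only: assumes this sits in the document-root .htaccess,
# after the example.com -> www.example.com redirect and before
# the www and subdomain rules.

# A request that already points at the subdomain front controller
# is left as-is. "-" means "no substitution"; [L] stops rule
# processing for this pass, so nothing below is evaluated for it.
RewriteRule ^front/index\.php$ - [L]

# ... the existing www and subdomain rules follow here unchanged ...
```

Because nothing is rewritten on that second pass, there is no further internal redirect and the request is simply served.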

If that helps, then we can get on with the robots.txt issues.
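
To get a head start on robots.txt: %{HTTP_HOST} holds only the value of the request's Host header (the host name, plus a port if the client sent one), never the URL-path. A condition ending in /robots\.txt therefore cannot match, which is why the fourth rule never fires. The path has to be tested in the RewriteRule pattern instead. Here is a minimal sketch, untested on your setup, assuming single-label subdomains and that it is placed before the generic subdomain rule:

```apache
# Sketch only: place before the generic subdomain rule.

# Skip www so that its real robots.txt file is still served.
RewriteCond %{HTTP_HOST} !^www\. [NC]
# Capture the subdomain label as %1 (assumes one label, e.g. "test").
RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com [NC]
# The URL-path is matched here, in the rule pattern, not in HTTP_HOST.
RewriteRule ^robots\.txt$ /front/index.php?subdomain=%1&arguments=robots [L]
```

The rewritten path, front/index.php, does not match ^robots\.txt$, so this rule cannot loop on itself when processing restarts.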

Jim