I have a domain that I use for domain parking, let's say parking.com. In the root of this site I have a folder "domains", and I keep all content of my parked domains inside this folder. The actual directory structure on disk is:
parking.com/domains/somesite.com/
parking.com/domains/someothersite.com/
I am doing this with .htaccess, so that accessing somesite.com shows the content of parking.com/domains/somesite.com
I have a trailing slash problem with my code:
If I enter somesite.com/sample/ in the browser it works fine, but if I enter somesite.com/sample the address bar is redirected incorrectly to http://somesite.com/domains/somesite.com/sample/, although the correct content is shown.
Is there a way to redirect somesite.com/sample to somesite.com/sample/?
Any advice or fixes are welcome.
Let me know if I was unclear.
Thank you.
=================================================
Options +FollowSymLinks
RewriteEngine On
# Rewrite requests for <anything>.<domain>.<tld> to /domains/<domain>.<tld> subdirectory
# except for domain name parking.com, this will load from root
#
RewriteCond %{HTTP_HOST} !parking.com
RewriteCond %{HTTP_HOST} !208.xyz.209.186
RewriteCond $1 !^domains
# with or without www redirect to a folder without www
RewriteCond %{HTTP_HOST} ^(www.)?([^.]*)\.(.*)$ [NC]
RewriteRule ^(.*)$ /domains/%2.%3/$1 [L,QSA]
# redirect to index without any "if file exists" exception, otherwise root "/" exists and directory list will be printed
RewriteRule ^domains/([^/]*)/$ /domains/?domain=$1
# allow for existent files, redirect otherwise
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^domains/([^/]*)/(.*)$ /domains/?domain=$1&page=$2 [L,QSA]
[edited by: jdMorgan at 2:41 pm (utc) on Nov. 23, 2009]
[edit reason] Obscured specifics [/edit]
Options +FollowSymLinks -Indexes -MultiViews
RewriteEngine on
#
# If the requested parked domain URL-path resolves to an existing file
# or directory, rewrite the request to the parked domain's subfolder
RewriteCond $1 !^domains/
RewriteCond %{HTTP_HOST} !parking\.com [NC]
RewriteCond %{HTTP_HOST} ^(www\.)?([^.]+(\.[a-z]+)+)\.?(:[0-9]+)?$ [NC]
RewriteCond %{DOCUMENT_ROOT}/%2/$1 -f [OR]
RewriteCond %{DOCUMENT_ROOT}/%2/$1 -d
RewriteRule ^(.*)$ /domains/%2/$1 [L]
#
# Else if the requested parked domain URL-path does not resolve to an existing file or
# directory, rewrite the request to the index script in the parked domain subfolder
RewriteCond $1 !^domains/
RewriteCond %{HTTP_HOST} !parking\.com [NC]
RewriteCond %{HTTP_HOST} ^(www\.)?([^.]+(\.[a-z]+)+)\.?(:[0-9]+)?$ [NC]
RewriteRule ^(.*)$ /domains/?domain=%2&page=$1 [QSA,L]
The hostname pattern has been modified to accept additional hostname variations such as www.parked.co.uk, FQDN-format hostnames (with a trailing period, as in "example.com."), and hostnames with appended port numbers (as in "www.example.com:80"), all of which are valid.
Note that the "file exists" and "directory-exists" tests must look into the parked domain's subfolder itself, so this filepath is "constructed" using DocumentRoot + parked-domain-name + requested-filepath. Also, because they are highly-inefficient (very slow, resource-intensive) functions, these 'exists' checks are done last -- All other RewriteConds must match first before we call the OS filesystem to go check the disk.
Be careful with this code; No adjustments to variables, patterns, anchoring, RewriteCond order, or flags should be necessary. Everything here was done "on purpose" for correctness, efficiency, and robustness.
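For readers following along, here is an annotated copy of the hostname pattern and the first rewrite from the code above. The comments are an editorial reading of the regular expression, not part of the original post. They show which capture group ends up in each %N back-reference, and how that differs from $1, which comes from the RewriteRule pattern.

```apache
# Annotated sketch of the hostname condition (comments are editorial).
#   ^(www\.)?             -> %1  optional "www." prefix
#   ([^.]+(\.[a-z]+)+)    -> %2  parked domain without "www.", e.g. parked.co.uk
#                            %3  last ".label" matched inside %2, e.g. ".uk"
#   \.?                   ->     optional trailing period of an FQDN
#   (:[0-9]+)?$           -> %4  optional port, e.g. ":80"
#
# %N refers to groups of the last matched RewriteCond pattern;
# $1 refers to the group in the RewriteRule pattern (the requested path).
RewriteCond $1 !^domains/
RewriteCond %{HTTP_HOST} ^(www\.)?([^.]+(\.[a-z]+)+)\.?(:[0-9]+)?$ [NC]
RewriteRule ^(.*)$ /domains/%2/$1 [L]
```

Negated conditions and the -f/-d tests do not capture anything, so %2 still holds the parked domain when it is used after them.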
Jim
I am thinking it has to do with that %{DOCUMENT_ROOT} check. Should I (or could I) replace it with something else to find an existing file or folder? I guess it should check for ^/domains/%domain%/%path%, but I don't know how to write it.
RewriteRule ^(.*)$ http://parking.com/domains/?domain=%2&page=$1&folderpath=%{DOCUMENT_ROOT}/%2/$1 [QSA,R=302,L]
Jim
/var/www/html/somedomain.com/sample.html
I have to check in my server files but I think actual file on server would be
/var/www/html/PARKING.COM/domains/somedomain.com/sample.html
Is that the path that should be tested for existence in the .htaccess rule?
That is, if what you say is correct, then %{DOCUMENT_ROOT}/%2/$1 should be %{DOCUMENT_ROOT}/parking.com/%2/%1
I encourage you to experiment, as it's likely going to be a lot faster than waiting for a reply here.
Jim
I checked: the www root for parking.com is directly in /var/www/html/, so all other domains are in /var/www/html/domains/somedomain.com. This means the condition would be changed to
RewriteCond %{DOCUMENT_ROOT}/domains/%2/%1
That seems to be correct but now I am back where I started.
[somedomain.com...] opens fine, [somedomain.com...] goes to [somedomain.com...]
With this code:
Options +FollowSymLinks -Indexes -MultiViews
RewriteEngine on
#
# If the requested parked domain URL-path resolves to an existing file
# or directory, rewrite the request to the parked domain's subfolder
RewriteCond $1 !^domains/
RewriteCond %{HTTP_HOST} !parking\.com [NC]
RewriteCond %{HTTP_HOST} ^(www\.)?([^.]+(\.[a-z]+)+)\.?(:[0-9]+)?$ [NC]
RewriteCond %{DOCUMENT_ROOT}/domains/%2/%1 -f [OR]
RewriteCond %{DOCUMENT_ROOT}/domains/%2/%1 -d
RewriteRule ^(.*)$ /domains/%2/$1 [L]
#
# Else if the requested parked domain URL-path does not resolve to an existing file or
# directory, rewrite the request to the index script in the parked domain subfolder
RewriteCond $1 !^domains/
RewriteCond %{HTTP_HOST} !parking\.com [NC]
RewriteCond %{HTTP_HOST} ^(www\.)?([^.]+(\.[a-z]+)+)\.?(:[0-9]+)?$ [NC]
RewriteRule ^(.*)$ /domains/?domain=%2&page=$1 [QSA,L]
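One detail in the two 'exists' conditions above may be worth double-checking. They test %{DOCUMENT_ROOT}/domains/%2/%1, but %1 is the optional "www." group of the hostname condition, not the requested path. The requested path is $1, captured by the RewriteRule pattern. With %1 the test string may reduce to the parked domain's folder itself, which always exists. The sketch below is the same rule with only that back-reference changed. It assumes the parked sites live under DocumentRoot/domains/ as described in this post.

```apache
# Same rule as above, with the requested path taken from $1.
# %2 = parked domain (from the HTTP_HOST condition)
# $1 = requested URL-path (from the RewriteRule pattern)
RewriteCond $1 !^domains/
RewriteCond %{HTTP_HOST} !parking\.com [NC]
RewriteCond %{HTTP_HOST} ^(www\.)?([^.]+(\.[a-z]+)+)\.?(:[0-9]+)?$ [NC]
RewriteCond %{DOCUMENT_ROOT}/domains/%2/$1 -f [OR]
RewriteCond %{DOCUMENT_ROOT}/domains/%2/$1 -d
RewriteRule ^(.*)$ /domains/%2/$1 [L]
```

This alone is unlikely to remove the extra redirect for a real directory such as "sample", since the rewritten path still lacks its trailing slash.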
I assume that this URL is supposed to be rewritten to /domains/?domain=somedomain.com&page=sample, since "sample" has no extension and therefore cannot exist as a 'real' file. So again, a good test would be to temporarily change the last rule back to an external redirect, and then see if you can 'catch' the server issuing two or more redirects -- either redirecting first to parking.com/domains/?domain=somedomain.com&page=sample and then to parking.com/domains/somedomain/sample/, or the other way around...
You will need to use a server headers checker to see this though, because the redirection will likely be so fast that you'll otherwise only see the last URL in your address bar. I use the Live HTTP Headers add-on for Firefox/Mozilla, although there are several other good ones.
Having seen what order the redirects take place in, it may be easier to figure out what agent is doing these redirects; Among the possibilities are mod_alias, mod_dir, mod_negotiation, mod_speling, or other mod-rewrite rules in your server configuration files, this .htaccess file, or another .htaccess file in /domains or /domains/somedomain.com. It's also possible that your script itself is doing the redirect.
Anyway, since this .htaccess code does not generate external redirects, the problem isn't confined to this code.
Jim
Forbidden
You don't have permission to access /domains/somedomain.com/ on this server
Then, after adding DirectorySlash Off, it shows this Forbidden error even for somedomain.com/sample, instead of the duplication in the address bar.
Forbidden
You don't have permission to access /domains/somedomain/sample on this server.
Adding a trailing slash in address bar opens correct content (existing folder).
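The 403 is consistent with how mod_dir and mod_autoindex normally interact, assuming a stock setup with both modules loaded. The fragment below only restates the directives already mentioned in this thread, with comments describing the expected behaviour. The index file name in the comments is a hypothetical example.

```apache
# Directory listings are not allowed.
Options -Indexes
# mod_dir no longer issues the redirect from /sample to /sample/.
DirectorySlash Off
#
# Request /sample/ : mod_dir serves the DirectoryIndex file (for example
#                    index.php), if one exists in that directory.
# Request /sample  : no slash redirect and no index lookup happen, so the
#                    directory itself is requested; a listing is forbidden
#                    by -Indexes and the server answers 403.
```

This would also suggest that the earlier extra redirect to the /domains/... URL came from mod_dir adding the slash to the already-rewritten path.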
Any other tips? Sorry to waste your time, but for me it might take 20-30 hours of experiments to fix this myself.
And with slash at end it works ok:
somesite.com/sample/ > shows correct content (existent index in that folder)
And I get the same Forbidden error if I enter somesite.com in the address bar.
Maybe there is a way to just force site.com/sample to site.com/sample/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !(\.[a-zA-Z0-9]{1,5}|/)$
RewriteRule (.*)$ [%{HTTP_HOST}...] [R=301,L]
That would force a redirect to a URL with a trailing slash if the address appears to be a directory but does not end in a slash.
Would this solution cause any problems? With Google maybe?
It does what I want but I don't know if it is correct.
This would be much more efficient:
RewriteCond %{REQUEST_URI} !(\.[a-z0-9]{1,5}|/)$ [NC]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ http://%{HTTP_HOST}/$1/ [R=301,L]
But there's a problem with this at a much higher level, because you are rewriting URLs. Specifically, *no* URL in the filespace tested by %{REQUEST_FILENAME} will ever 'exist' unless it belongs to your parking.com site, and is hosted in the 'root' directory... Any of the pages in your parked domains will fail this test, even if they do exist, because at the point where these 'exists' checks are made, the URL-path has not yet been rewritten to the parked domain's folder.
So, you have four choices:
1) Do this check only after re-mapping to the parked domain filespace (and, unfortunately, it appears that this did not work).
2) Check using the -U flag instead of -d and -f, to see if the URL will eventually resolve to a file after all rewriting is completed. This is *extremely* inefficient, even compared to the very-inefficient -f and -d checks, because it essentially causes mod_rewrite to 'call' your server again, watch itself translate the URL to a filepath, and then check to see if that filepath leads to an existing file or directory.
3) Forgo checking for 'exists' completely, and simply add a trailing slash to all extensionless URLs requested from your server.
4) Call your team and call your host, and find out what strange configuration or script is causing this problem. This is much better than just putting a band-aid on the problem and forgetting it. The problem that is causing this trouble may re-surface in another form later, and cause you even more grief. A normal server should not do what you are seeing here.
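As a rough illustration of the first choice, the 'exists' test can be pointed at the parked domain's folder while the URL is still the one the client requested. This is a minimal sketch, not a tested fix. It assumes the parked sites live under DocumentRoot/domains/ and that the block is placed above the internal rewrite rules in the same .htaccess file.

```apache
# Sketch: externally redirect /sample to /sample/ only when "sample" is a
# real directory inside the parked domain's folder.
# The pattern ^(.*[^/])$ matches only paths that do not end in a slash.
RewriteCond $1 !^domains/
RewriteCond %{HTTP_HOST} !parking\.com [NC]
RewriteCond %{HTTP_HOST} ^(www\.)?([^.]+(\.[a-z]+)+)\.?(:[0-9]+)?$ [NC]
RewriteCond %{DOCUMENT_ROOT}/domains/%2/$1 -d
RewriteRule ^(.*[^/])$ http://%{HTTP_HOST}/$1/ [R=301,L]
```

Because the redirect is issued before any internal rewrite, the client only ever sees the parked domain's own URL with the slash appended. Files and non-existent paths are left for the later rules.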
Jim
I'm not sure about jdMorgan's thoughts on the matter, but personally, I just strip the / and /index.ext if present, because it's such an easy check / rewrite to make happen when compared to what you seem to be going through. I haven't dug all the way into the thread to know exactly where the two of you are in the discussion, but thought I'd throw the idea in for those who may read this thread in the future.
Stripping the slashes and index.ext also eliminates the possibility of issues with duplicate content at /dir/ and /dir/index.ext and can be accomplished with a single rule.
Are the files being accessed dynamic and is there a way they could be initiating the redirect, rather than it being done by mod_rewrite or some mysterious server configuration? I sometimes use PHP rather than mod_rewrite for redirects because I can eliminate rules which are generally unnecessary from my .htaccess and once in a while I forget I did, which usually gives me a headache and throws me for a loop for a bit...
I checked the PHP scripts to make sure nothing else is redirecting and producing that double URL. The only thing that could do that would be /domains/index.php. I removed this file from the server, and without forcing the trailing slash in .htaccess I still got the double URL, so it must be something in my .htaccess rules that produces it.
@jdMorgan: I am adding the trailing slash using the last rules you recommended. Either I didn't really understand the problems you mentioned ("at a much higher level...") or I did understand them but they are not happening. All the files inside my main domain (e.g. parking.com) and the files inside the parked domains seem to open fine, so I guess I found my solution.
My only worry would be if anything I did would look fishy to Google, but I hope a 301 redirect would be ok with G :)
So what I use now is adding this in front of my initial code and seems to work:
RewriteCond %{REQUEST_URI} !(\.[a-z0-9]{1,5}|/)$ [NC]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ [%{HTTP_HOST}...] [R=301,L]
IOW: Either comment out all rules and uncomment them one at a time until it breaks, OR comment out one rule at a time until it stops breaking. This should help you determine where the disconnect is happening in the file. It basically has to be either the rule you just toggled or something above it affecting that rule. So if it's not that rule itself, you can keep moving it up the file one ruleset at a time until it stops breaking; then you know which ruleset is negatively affecting the one that causes the break.
I hope this makes a bit of sense, but an example might be better:
RewriteRule #1 - Does Not Cause the Faulty Rewrite
RewriteRule #2 - Does Not Cause the Faulty Rewrite
RewriteRule #3 - Does Not Cause the Faulty Rewrite
RewriteRule #4 - Does Not Cause the Faulty Rewrite
RewriteRule #5 - Causes the Faulty Rewrite
# RewriteRule #6 - Does Not Cause the Faulty Rewrite
# RewriteRule #7 - Does Not Cause the Faulty Rewrite
# RewriteRule #8 - Does Not Cause the Faulty Rewrite
You now know the issue is either with Rule 5 itself (most likely) or with Rule 1 thru 4... If it's not Rule 5 itself, move it above Rule 4... If it doesn't stop 'breaking' move Rule 5 above Rule 3 and so on, until you find the rule or combination of rules causing the error.
If you cannot find the error this way, it must be in the server configuration... I've had an issue with detecting paths on one host when a PHP file is involved, which forces me to remove the start anchor from rules I want to affect the PHP files; it has to do with PHP being parsed as CGI. So it may be something small that needs to be edited in your ruleset to get it to work correctly. I would definitely check the server headers to see which kind of redirect it is, because hosts are notorious for using 302 redirects where the status is left undefined. If it is a 302, you may be able to narrow the issue to the server configuration by simply making sure all external redirects in your file are explicitly defined as 301. If they are, you know the error is most likely not in your file.
You might also try removing all rules from the file and seeing if the redirect happens... If it does, then it's obviously not your .htaccess.
Anyway, just some thoughts I was having about an interesting issue.
RewriteCond %{HTTP_HOST} !parking.com
RewriteCond $1 !^domains
RewriteCond %{HTTP_HOST} ^(www.)?([^.]*)\.(.*)$ [NC]
RewriteRule ^(.*)$ /domains/%2.%3/$1 [L,QSA]
It might be related to the ^domains right above, I am not sure.
I'm thinking this through on-the-fly, so go with the idea more than exactly what I say if it's not exactly technically correct, but:
When the redirect to the trailing / sits at the beginning of the file, the problem does not happen. So my guess is that when you remove that line, the rewrite happens before the redirect to the trailing /, and the redirect is based on a variable that is updated 'in real time'. That would cause the external redirect to happen after the rewrite and to point at the rewritten location, if the directories all run off the same .htaccess.
Here's a bit easier way of saying it:
Request to no-trailing/slash
Rewrite to /domains/example.com/no-trailing/slash
Redirect to trailing / kicks in based on a variable updated to the path of the Rewrite...
End Result: /domains/example.com/no-trailing/slash/
If this is the case, I would probably add the no-trailing-slash redirect to the canonicalization ruleset and make sure all rewrites happen after the canonicalization.
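A minimal sketch of that ordering, built only from rules already posted in this thread: the external trailing-slash redirect comes first and works on the URL the client requested, and the internal rewrite to the parked domain's folder comes after it. The 'exists' checks are left out here for brevity, which is an assumption of this sketch and means every extensionless URL gets a slash.

```apache
RewriteEngine on
#
# 1) Canonicalization first: external redirect on the requested URL.
#    Adds a trailing slash to any URL-path that has no extension and
#    does not already end in a slash.
RewriteCond %{REQUEST_URI} !(\.[a-z0-9]{1,5}|/)$ [NC]
RewriteRule ^(.*)$ http://%{HTTP_HOST}/$1/ [R=301,L]
#
# 2) Internal rewrite afterwards: map the parked domain to its folder.
#    By now the path already ends in a slash or has an extension.
RewriteCond $1 !^domains/
RewriteCond %{HTTP_HOST} !parking\.com [NC]
RewriteCond %{HTTP_HOST} ^(www\.)?([^.]+(\.[a-z]+)+)\.?(:[0-9]+)?$ [NC]
RewriteRule ^(.*)$ /domains/%2/$1 [L]
```

With this order the slash is added while the path is still the public one, so the /domains/... path should not leak into the address bar.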