I am new here, and I apologize if my problem seems too simple.
I am on virtual hosting, so .htaccess is my only option for rewriting URLs. My hosting space is bigger than I need, so I am trying to use it to host my other domain names.
I have tried a few scripts, and this is what I currently use:
---
Options +FollowSymlinks
RewriteEngine On
RewriteBase /
#RewriteLog rewrite.log
#RewriteLogLevel 9
RewriteCond %{HTTP_HOST} ^(www\.)?domain\.org [NC]
RewriteCond %{REQUEST_URI} !^/computer/
RewriteRule (.*) /computer/$1 [L]
# Prevent direct 'computer' type-in or link access
rewritecond %{THE_REQUEST} ^[A-Z]+\ (http://(www\.)?domain\.(org)(:[0-9]+)?)/(computer)/(.*)\ HTTP
rewriterule .* %1/%6 [R=301,L]
---
Yes, I copied it from somewhere on this forum, but I have tried other variants and got almost the same result.
First, I cannot see any rewrite log on my server to track the problem, so I've commented those lines out.
Second, the links are rewritten with the results below. (Everything is hosted on domain.com; requests for domain.org should be rewritten to the domain.com/computer folder, which contains HTML pages, including index.html.)
1) [domain.org...] : browser shows the 404 page not found, address bar remains unchanged (http://www.domain.org)
2) [domain.org...] same as above.
3) [domain.org...] shows the content of /computer folder correctly, as I desire.
4) [domain.org...] shows the content of domain.com, images are not shown, and the URL in the address bar remains unchanged. (This might be due to my ISP's cache server, as another HTML file displays correctly.)
I have read the mod_rewrite guide and documentation, hundreds of this forum's posts, regex tutorials, etc., but I am confused because I cannot log what is happening (any help with the rewrite log?). I would appreciate help understanding the problem (most important to me) and then resolving it.
Paymaan.
First, start with a simple rewrite:
#RewriteEngine ON
#RewriteRule ^.*$ /computer/index.html [L]
Then request any page except /computer/index.html.
If you get a 404 and both pages exist, the engine is working but your path is incorrect. If you can't get your log to work, you might use [R,L]: the browser's address bar will then show the address you are actually being sent to, so you can see where your path is wrong.
Then add a simple condition:
#RewriteEngine ON
#RewriteCond %{REQUEST_URI} !^(.*)/computer/(.*)
#RewriteRule ^.*$ /computer/index.html [R,L]
Then request any file that is not in /computer.
Once the 'blanket' rules and conditions work, you can add specifics one at a time. Not only will this let you see where it breaks, so you can find the correct syntax, path, etc. for each area of your file, it will also teach you how to create your own rules.
Remember that the first part of the RewriteRule (the pattern) is matched relative to the directory where your .htaccess file is, while the second part (the substitution) is the full path.
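As a rough illustration of that last point, here is a minimal sketch. It assumes the .htaccess file sits in the document root, and page.html is a hypothetical file name; neither is taken from the actual setup in this thread.

```apache
# Assumption: this file is /.htaccess in the document root.
RewriteEngine On
# For a request to http://example.org/page.html, the pattern sees
# "page.html" (no leading slash), because the directory prefix is stripped.
# The substitution starts with "/", so it is given as a full URL-path.
RewriteRule ^page\.html$ /computer/page.html [L]
```

If the same rule lived in an .htaccess file inside a subdirectory, the pattern would see the path relative to that subdirectory instead.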
Justin
RewriteCond %{HTTP_HOST} ^(www\.)?domain\.org [NC]
RewriteCond %{REQUEST_URI} !^/computer/
RewriteRule (.*) /computer/$1 [L]
#
# Prevent direct 'computer' type-in or link access
RewriteCond %{HTTP_HOST} ^(www\.)?domain\.org
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /computer/([^\ ]*)\ HTTP
RewriteRule . http://%{HTTP_HOST}/%1 [R=301,L]
# Prevent direct 'computer' type-in or link access
RewriteCond %{HTTP_HOST}<->%{THE_REQUEST} ^((www\.)?domain\.org[^<]*)<->[A-Z]+\ /computer/([^\ ]*)\ HTTP
RewriteRule . http://%1/%3 [R=301,L]
Jim
Rewrites to HTML pages take a long time and finally give a 'page not available' error.
[*.com...] is available, along with all subfolders and files.
[*.org...] gives a 'page not available' error.
Every other HTML page on the *.org version gives the same error!
The current script is this:
----
RewriteCond %{HTTP_HOST} ^(www\.)?example\.org [NC]
RewriteCond %{REQUEST_URI} !^/computer/
RewriteRule (.*) /computer/$1 [R,L]
#
# Prevent direct 'computer' type-in or link access
RewriteCond %{HTTP_HOST}<->%{THE_REQUEST} ^((www\.)?example\.org[^<]*)<->[A-Z]+\ /computer/([^\ ]*)\ HTTP
RewriteRule . [%1...] [R=301,L]
----
[edited by: jdMorgan at 12:57 am (utc) on April 13, 2005]
[edit reason] Removed specifics. [/edit]
RewriteCond %{HTTP_HOST} ^(www\.)?example\.org [NC]
RewriteCond %{REQUEST_URI} !^/computer/
RewriteRule (.*) /computer/$1 [L]
#
# Prevent direct 'computer' type-in or link access
RewriteCond %{HTTP_HOST}<->%{THE_REQUEST} ^((www\.)?example\.org[^<]*)<->[A-Z]+\ /computer/([^\ ]*)\ HTTP
RewriteRule . http://%1/%3 [R=301,L]
Removing R lets me see the HTML files again, but [example.org...] and [example.org...] still give a 404 'page not found' error. Subfolders give the same result, except that if I include the trailing slash, I get a different type of 404 error.
Any hints? Does it have anything to do with IndexAuto (AutoIndex?!) etc.? If so, I might be able to change it through my website's control panel.
If not, then it's likely that either your DNS is not set up to point these requests to your server, or that the server is not set up to recognize them, or both. The DNS is set up using the 'zone file' for the example.org domain, and the server is set up either by modifying httpd.conf, or possibly by using your 'control panel.'
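For reference, the server-side part usually looks something like the sketch below in httpd.conf. The document root path and the exact list of aliases are assumptions for illustration only; on shared hosting, only the host (or the control panel) can change this.

```apache
# Hypothetical httpd.conf fragment; this cannot be set from .htaccess.
<VirtualHost *:80>
    ServerName example.com
    # Every extra host name that should be served by this account
    ServerAlias www.example.com example.org www.example.org
    # Placeholder path, not the real one
    DocumentRoot /path/to/example.com/htdocs
</VirtualHost>
```

If example.org is missing from the alias list, requests for it may be answered by a different virtual host, and the .htaccess rules discussed here would never run.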
Jim
<Let's not -- It's against the WebmasterWorld Terms of Service [webmasterworld.com]>
My main site is example.com. There are a few domain pointers set to the same virtual hosting account, and I am experimenting on one of the least-used pointers (example.org) to see whether I can host my other low-traffic websites there, as I am tired of using low-quality free hosting. After all, why use free hosting or pay more for hosting when my main site has 900MB of unused space and lots of free bandwidth?
So, to answer you: yes, the .org domain is set up correctly. Before I played with mod_rewrite, it was pointing to the .com version. There are no problems with the .com version; it works as usual and shows indexes and other files correctly (with or without trailing slashes, with or without www., etc.).
[edited by: jdMorgan at 4:12 pm (utc) on April 13, 2005]
[edit reason] Removed specifics per TOS. [/edit]
RewriteCond %{HTTP_HOST} ^(www\.)?example\.org [NC]
RewriteCond %{REQUEST_URI} !^/computer/
RewriteRule (.*) /computer/$1 [L]
I doubt this problem has to do with autoindex, unless the alternate domain subdirectory does not contain an 'index.html' or other index file.
When you get the 404 error, what does your access log file say? And what does your error log say?
jd01 supplied a useful debugging tip above: Temporarily using an external redirect should make the rewritten URL visible in your browser address bar. However, this can't be tested with the second ruleset in place, because as previously mentioned, the two rules working against each other will put your server into a loop if an external redirect is used in the first rule.
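A minimal sketch of that debugging setup, assuming the rules live in the document-root .htaccess. The R=302 flag is there only for testing, and the second ruleset is commented out to avoid the loop described above:

```apache
# Debugging sketch only: remove the R flag again once the target path looks right.
RewriteEngine On
RewriteCond %{HTTP_HOST} ^(www\.)?example\.org [NC]
RewriteCond %{REQUEST_URI} !^/computer/
# The external redirect makes the rewritten URL visible in the address bar.
RewriteRule (.*) /computer/$1 [R=302,L]

# The "prevent direct access" ruleset stays disabled while testing,
# because it would redirect /computer/... straight back again.
#RewriteCond %{HTTP_HOST}<->%{THE_REQUEST} ^((www\.)?example\.org[^<]*)<->[A-Z]+\ /computer/([^\ ]*)\ HTTP
#RewriteRule . http://%1/%3 [R=301,L]
```

Clearing the browser cache between tests is probably wise, since cached redirects can hide the effect of a change.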
Jim
example.org/index.html (any *.html file) works OK both with and without the R flag.
So does www.example.org/index.html.
But when I remove the R flag, example.org and www.example.org give 404 errors.
With the R flag, both www.example.org and example.org are correctly redirected to their index.html files: the address bar shows the correct path, www.example.org/computer/ or example.org/computer/, and the page shows the index.html.
What's wrong with internal rewrites?!
Other than that, code in httpd.conf added by your host or by a "control panel" may be interfering with your new rules.
Jim
I noticed that my .htaccess also has two Redirect lines before the mod_rewrite code. Could those matter for this problem?
Here are those lines:
redirect /wwwboard [example.com...]
redirect /example/ [example.com...]
Third, the .htaccess I am using is the one in the main www directory. Is that OK? I do not use mod_rewrite in any subfolders (yet).
Any hints anybody?
Some of the worst mod_rewrite code I've ever seen is generated by cPanel.
In order to find this problem, you may have to fully evaluate all of the server config files and all of the .htaccess files in the directory path used to reach the pages that are having problems.
Jim
A mod_rewrite question: how can I redirect pages with [example.org...] or [example.org...] to [example.org...], and how can I combine that with the following? (Each subdirectory should likewise be directed to its own index.html; say, example.org/dir1/ would be redirected to example.org/dir1/index.html.)
---
RewriteCond %{HTTP_HOST} ^(www\.)?example\.org [NC]
RewriteCond %{REQUEST_URI} !^/computer/
RewriteRule (.*) /computer/$1 [L]
---
I guess that if I combine such rulesets, my problem will be resolved completely: HTML pages will be rewritten correctly, and when the URL is a folder, whether the root or a subfolder, index.html will be added to it.
In order to do this, you'd need to use RewriteCond with the -d flag to check that the requested URL is a directory. This means an extra filesystem access for each and every HTTP request, and is inefficient.
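For illustration only, a rough sketch of what such a -d check might look like. It assumes the rule is placed in the document-root .htaccess ahead of the general rewrite into /computer/, and that %{DOCUMENT_ROOT} really points at this account's web root, which is not guaranteed on every shared host:

```apache
RewriteEngine On
RewriteCond %{HTTP_HOST} ^(www\.)?example\.org [NC]
RewriteCond %{REQUEST_URI} !^/computer/
# -d is true only if the tested filesystem path is an existing directory.
# $1 is the group captured by the RewriteRule pattern below.
RewriteCond %{DOCUMENT_ROOT}/computer/$1 -d
# Matches the empty path (site root) or any path ending in a slash.
RewriteRule ^(.*/)?$ /computer/$1index.html [L]
```

The filesystem lookup in the third condition is the per-request cost mentioned above.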
You'd be far better off to figure out why accesses to any_dir/ don't automatically return the index file, as they should, even without mod_rewrite. If you add more complication to something that is broken, you often get more complication but still broken.
Try putting the following in your .htaccess file:
DirectoryIndex index.html
Options -Indexes
Having done that, any request for / should return /index.html and any request for directory/ should return directory/index.html
If not, you need to have your host fix your account.
SEO note: For purposes of SEO and usability, you want your main index page to be example.com/ and not example.com/index.html. Otherwise, you will split the PageRank between the / and /index.html URLs. In addition to that, what if you later decide to change your index page to index.php? Then you'd lose all the PageRank for the main page of your site, until Google got around to fully indexing it again... :(
If you have UseCanonicalName off, and Apache mod_dir is correctly installed, then any request for directory_name should be redirected to directory_name/ without you having to add any code anywhere. It should just work (see mod_dir documentation).
If not, you need to have your host fix your account.
You may also wish to try adding
Options -MultiViews if you don't use content negotiation.
Please try these various tests and let me know what happens in detail. Otherwise, it will be up to your hosting company to help you unless someone else here has any more ideas. On Apache, .htaccess is subordinate to httpd.conf, and it is rare that you can 'fix' a problem in httpd.conf by using .htaccess. The power of .htaccess is intentionally limited for security and server configuration control reasons, and that means that errors in configuration cannot usually be fixed by individual server users.
Jim
Everything remains OK on example.com and the other domain pointers in both cases, so something must be wrong with internal rewrites.
With external redirects, a request for [example.org...]
is redirected to [example.org...] and it works correctly; I mean it shows index.html.
I am quite confused: such a big hosting account, and such a stupid problem :)
I think Jim's idea of contacting your hosting co. to determine the current configuration of the httpd.conf file is very wise, otherwise you may be compounding a problem, which could create undesired results.
My advice: contact your host for the httpd.conf file, then generalize it and post it along with the generalized portion of your .htaccess file that deals with the directory vs. directory/index.html situation.
In giving people the opportunity to look at both simultaneously, you should be able to get some solid feedback/direction for your situation.
Justin
I am still curious whether there is a mod_rewrite solution that picks out URLs containing no file name and adds index.html to them. I know it may be inefficient, but it is still interesting to me.
Thanks.
Paymaan.
Finally, somebody at my hosting company worked on the problem and changed the script to the one below, and it works almost OK. I would like to thank everyone who gave me hints. After studying this working version, please let me know what the problem was, and whether we all missed something in the old non-working script.
Here it is; to make Jim's life easier, I have changed the actual domain name to example.org:
Options +FollowSymlinks
RewriteEngine On
#RewriteBase /
## prevent direct '/computer' requests on example.org
RewriteCond %{HTTP_HOST} ^(www\.)?example\.org [NC]
RewriteCond %{THE_REQUEST} ^GET\ /computer
RewriteRule ^computer/(.*) [%{HTTP_HOST}...] external redirect
## rewrite everything else into /computer/
RewriteCond %{HTTP_HOST} ^(www\.)?example\.org [NC]
RewriteCond %{REQUEST_URI} !^/$
RewriteCond %{REQUEST_URI} !^/computer/
## internal redirect
RewriteRule ^(.*)$ /computer/$1 [L]
Paymaan.
[edited by: jdMorgan at 7:08 pm (utc) on April 28, 2005]
[edit reason] Examplified. [/edit]
It looks like Jim's example will fit your needs
RewriteCond $1 ![^.]+\.[^/]+$
RewriteRule (.*)/?$ [example.com...] [R=301,L]
This externally rewrites everything to the /file/index.html file...
Or, could be adjusted to something like this, depending on exactly what you need:
RewriteCond $1 ![^.]+\.[^/]+$
RewriteRule (.*)/?$ /computer/index.html [L]
This is a silent version that takes any request without a file name (a directory), e.g. /file or /file/, and serves the contents of /computer/index.html for it...
You might need to adjust it a little more depending on your specific purposes, but I think the idea should give you some direction.
Your complete file would look like this:
Options +FollowSymlinks
RewriteEngine On
#RewriteBase /
## prevent direct '/computer' requests on example.org
RewriteCond %{HTTP_HOST} ^(www\.)?example\.org [NC]
RewriteCond %{THE_REQUEST} ^GET\ /computer
RewriteRule ^computer/(.*) [%{HTTP_HOST}...] external redirect
RewriteCond $1 ![^.]+\.[^/]+$
RewriteRule (.*)/?$ [example.com...] [R=301,L]
## rewrite everything else into /computer/
RewriteCond %{HTTP_HOST} ^(www\.)?example\.org [NC]
RewriteCond %{REQUEST_URI} !^/$
RewriteCond %{REQUEST_URI} !^/computer/
## internal redirect
RewriteRule ^(.*)$ /computer/$1 [L]
I would also recommend changing this:
[%{HTTP_HOST}...]
to this:
[%{HTTP_HOST}...]
The difference is a permanent move vs the temporary one you are using currently.
With all the 'drama' surrounding the 302's of late, I would be safe rather than sorry, unless you know you need to use a temporary move.
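To illustrate the difference between the two flag forms, here is a small sketch with hypothetical file names (old-page.html and new-page.html are not from this thread):

```apache
RewriteEngine On
# Temporary move: a bare R flag sends a 302 response.
#RewriteRule ^old-page\.html$ /new-page.html [R,L]
# Permanent move: R=301 sends a 301 response instead.
RewriteRule ^old-page\.html$ /new-page.html [R=301,L]
```

Only one of the two rules should be active at a time; the temporary variant is shown commented out.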
The new rules go in the middle of your current file so that requests are caught properly and in order.
The first 'set' checks to see if the original request is for /computer and responds accordingly.
Then the second 'set' (added) checks to make sure there is a file appended to the directory (/anything/anything.any), and if not rewrites to the correct format (/anything/index.html)
Finally, the third 'set' checks to see if /computer is already being used and if not serves (internally redirects) to /computer/anything.any.
Please note that the order is important, and a wrong order can be the reason a rule or 'set' fails...
If this does not work, you might try some different ordering, but I think this is the correct version of what you are looking for.
Hope it finally does the trick.
Justin
"Subdirectories do not work in this case because, in the sequence of processing steps that take place within the server for each web
request, mod_rewrite is invoked long after the module that turns subdirectory requests into requests for the "index.html" file within the requested subdirectory. The reasons for this are a little bit complicated, but are explained in the mod_rewrite documentation, which is available here: [httpd.apache.org...]"
I guess such a delay should not exist. Please let me know whether this is true, so that I can try to convince them to fix it.
Paymaan.