Forum Moderators: phranque
I'm new to htaccess and I'm using it for a Perl website in order to make the urls look nicer and more search engine friendly.
I have put in some rules that work fine, such as:
RewriteRule ^forums/(.*)$ index.cgi?action=forum&board=$1
The one issue I have is for the main homepage. I want all the following to go to the same address.
http://example.com
http://example.com/
http://www.example.com
http://www.example.com/
I have found lots of code that does this, but I have the added requirement that all of the above should end up displaying as http://example.com/ while actually serving (masking) the URL http://www.example.com/cgi-bin/index.cgi
I can't find anything that does this, although I've tried adapting lots of code I have found.
Note: I'm testing the file on a test subdomain of the website. Are there any issues with this (aside from adjusting the URLs in htaccess to match)?
Any help would be great,
thanks
P.S. A separate issue: I will be going through every Perl file to change the URLs to the new ones. Is there an easier way to do this (in htaccess), and how do I make sure the old URLs vanish from search engines (or will they eventually go when all links to them have gone)?
[edited by: jdMorgan at 8:40 pm (utc) on Sep. 18, 2009]
[edit reason] example.com [/edit]
Externally redirect all non-canonical URLs to the canonical URLs, and internally rewrite the canonical URL to a script. Note URL->URL, then URL->filepath... two utterly-different functions.
You *could* use a rewriterule to map example.com/ to /cgi-bin/index.cgi, but you might find it easier to simply declare that filepath as your 'index page' by using the DirectoryIndex directive of Apache mod_dir.
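As a rough sketch of the DirectoryIndex approach, assuming the .htaccess file sits in the document root, that the server permits this directive in .htaccess (AllowOverride Indexes), and that the script lives at /cgi-bin/index.cgi as in the original question (substitute your own script path on the test subdomain):

```apache
# Serve the script whenever a directory URL such as "/" is requested.
# index.html is listed first so that a real index file, if one exists,
# still takes priority; the script path is taken from the question and
# may differ on your server.
DirectoryIndex index.html /cgi-bin/index.cgi
```

Note that this applies to every directory covered by the .htaccess file, so a request for a subdirectory that has no index.html of its own would also be answered by the script.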
Jim
RewriteBase /cgi-bin/
RewriteRule ^(.*)$ poems.cgi$1 [L]
Note: this is the code for the test domain, and poems.cgi is basically my index.cgi.
I get a redirect loop error with this.
[edited by: jdMorgan at 12:32 am (utc) on Sep. 19, 2009]
[edit reason] example.com [/edit]
It's not clear how your script "gets" the client-requested URL-path, and rewriting *all* requests means that your script will have to serve up images, CSS, external JavaScript, robots.txt, sitemap.xml, etc. files as well as all of your 'pages,' but this should get you closer:
# Externally redirect requests for "/index.html" to "/"
RewriteRule ^index\.html$ http://test.example.com/ [R=301,L]
#
# Externally redirect all requests for non-canonical hostnames to canonical domain
RewriteCond %{HTTP_HOST} !^test\.example\.com$
RewriteRule ^(.*)$ http://test.example.com/$1 [R=301,L]
#
# Internally rewrite all requests to /cgi-bin/poems.cgi with requested URL-path
# appended as query string parameter "pathname" (unless already done)
RewriteCond $1 !^cgi-bin/poems\.cgi$
RewriteRule ^(.*)$ /cgi-bin/poems.cgi?pathname=$1 [L]
I no longer get the error and it does load the website. However, the page is completely unformatted, so I assume it's not calling the CSS file properly or something.
[test.example.com...]
And this is the main website, which is how it should look (except that the test site has a bright red background so I don't get confused):
http://www.example.com/cgi-bin/poems.cgi
The CSS file (and a number of other files) are in a folder in the root.
[edited by: jdMorgan at 11:14 pm (utc) on Sep. 19, 2009]
[edit reason] example.com [/edit]
I deal with the actions separately, so they will take the form [test.example.com...]
[edited by: jdMorgan at 11:15 pm (utc) on Sep. 19, 2009]
[edit reason] example.com [/edit]
Your formatting is broken because all requests are being rewritten to your script, as I warned about above, and as your original rule would have done, had it worked.
If you want only your 'home page' at "example.com/" to be rewritten to your script, then the last rule needs to change:
# Internally rewrite home page requests to /cgi-bin/poems.cgi
RewriteRule ^$ /cgi-bin/poems.cgi [L]
How do I ensure that URLs without a trailing slash are rewritten to have one, so that they work with rules like
RewriteRule ^forums/(.*)/$ poems.cgi?action=forum&board=$1
but not interfere with rules set up to work from .html URLs, like
RewriteRule ^stuff\.html$ poems.cgi?action=popular&read=stuff
(I had to make a new rule for each HTML page I wanted; otherwise you could go to anything.html and it would pass "anything" into the Perl script as input and display it on a page.)
None of the rules I posted care about slashes (or the lack thereof). That means the problem is in your script.
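If a trailing slash is wanted on the forum URLs anyway, one possible sketch is an external redirect placed above the internal rewrites. It assumes that only URLs under forums/ need the slash and that board names contain no dots or slashes; the hostname is the test domain used elsewhere in this thread:

```apache
# Externally redirect /forums/<board> to /forums/<board>/ (adds the slash).
# The [^/.]+ pattern is an assumption about what board names look like.
RewriteRule ^forums/([^/.]+)$ http://test.example.com/forums/$1/ [R=301,L]
```

Because the pattern is anchored to forums/, a request such as stuff.html never matches it and is left alone.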
Making one rule for each "page" is a non-scalable solution. A better approach is to rewrite all .html requests to the script, and have the script check to see if that's a valid "page." If so, generate the page, and if not, then output a 404-Not Found Status response header (or a 301-Moved Permanently redirect to the correct URL) and exit.
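A minimal sketch of the rewrite half of that approach, assuming page names consist only of letters, digits, hyphens and underscores, and reusing the action=popular and read parameters from the rule quoted earlier. The validity check and the 404 (or 301) response still have to be written in the script itself:

```apache
# Internally rewrite any top-level /<name>.html request to the script,
# passing <name> as the "read" parameter. The character class is an
# assumption; widen or narrow it to match your real page names.
RewriteRule ^([A-Za-z0-9_-]+)\.html$ /cgi-bin/poems.cgi?action=popular&read=$1 [L]
```

Restricting the character class keeps odd input (slashes, dots, encoded characters) from reaching the script through this rule, but it does not replace the check inside the script.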
Jim
See the regular-expressions tutorial cited in our Forum Charter. It would be very worthwhile spending a week studying it, even though it'll only take an hour... Regular-expressions are used in many Apache modules and in virtually all scripting languages. It's well worth the investment of your time to become conversant with them...
Jim