Forum Moderators: phranque


htaccess (non typical use, can't find a solution)


DeepUnderground

4:02 pm on Sep 18, 2009 (gmt 0)

10+ Year Member



Hi,

I'm new to htaccess and I'm using it for a Perl website in order to make the urls look nicer and more search engine friendly.

I have put in some nice rules which work fine like

RewriteRule ^forums/(.*)$ index.cgi?action=forum&board=$1

The one issue I have is for the main homepage. I want all the following to go to the same address.

http://example.com
http://example.com/
http://www.example.com
http://www.example.com/

I have found lots of code that does this but I have the added issue that I want all of the above to end up looking like http://example.com/ but actually (masking) the URL http://www.example.com/cgi-bin/index.cgi

I can't find anything to do this, I've tried adapting lots of stuff I have found.

Note: I'm testing the file on a test subdomain of the website. Are there any issues with this (aside from adjusting URLs in the htaccess to match)?

Any help would be great,

thanks

P.S. A separate issue: I will be going through every Perl file to change the URLs to the new ones. Is there an easier way to do this (in htaccess)? And how do I make sure the old URLs vanish from search engines (or will they eventually drop out once all links to them have gone)?

[edited by: jdMorgan at 8:40 pm (utc) on Sep. 18, 2009]
[edit reason] example.com [/edit]

jdMorgan

8:43 pm on Sep 18, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It sounds like you're expecting to do two different functions with one rule...

Externally redirect all non-canonical URLs to the canonical URLs, and internally rewrite the canonical URL to a script. Note URL->URL, then URL->filepath... two utterly-different functions.

You *could* use a RewriteRule to map example.com/ to /cgi-bin/index.cgi, but you might find it easier to simply declare that filepath as your 'index page' by using the DirectoryIndex directive of Apache mod_dir.
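As a sketch of that second option (the filepath is the one named in this thread; whether DirectoryIndex is allowed in .htaccess depends on the host's AllowOverride settings):

```apache
# .htaccess sketch: serve the script as the directory index, so that
# requests for "/" are handled by /cgi-bin/index.cgi without mod_rewrite.
DirectoryIndex /cgi-bin/index.cgi
```

If the host only enables mod_rewrite overrides, this directive may be ignored, in which case the rewrite-rule approach below is the fallback.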

Jim

DeepUnderground

9:46 pm on Sep 18, 2009 (gmt 0)

10+ Year Member



mod_rewrite is the only function I can access on my hosting.

Having it happen in two different rules is fine, but I can't seem to get it to work; I've tried lots of ways.

jdMorgan

11:27 pm on Sep 18, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Please post your best-effort code as a basis for discussion.

Thanks,
Jim

DeepUnderground

11:56 pm on Sep 18, 2009 (gmt 0)

10+ Year Member



RewriteCond %{HTTP_HOST} ^test.example.com
RewriteRule (.*) [test.example.com...] [R=301,L]
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html\ HTTP/
RewriteRule ^index\.html$ [test.example.com...] [R=301,L]

RewriteBase /cgi-bin/

RewriteRule ^(.*)$ poems.cgi$1 [L]

note, this is code for test domain and poems.cgi is basically my index.cgi

Redirect Loop error on this

[edited by: jdMorgan at 12:32 am (utc) on Sep. 19, 2009]
[edit reason] example.com [/edit]

jdMorgan

12:42 am on Sep 19, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes, since the 'target' path of your last rule matches its input pattern, it'll loop forever unless you exclude the target path from being rewritten.

It's not clear how your script "gets" the client-requested URL-path, and rewriting *all* requests means that your script will have to serve up images, CSS, external JavaScript, robots.txt, sitemap.xml, etc. files as well as all of your 'pages,' but this should get you closer:


# Externally redirect requests for "/index.html" to "/"
RewriteRule ^index\.html$ http://test.example.com/ [R=301,L]
#
# Externally redirect all requests for non-canonical hostnames to canonical domain
RewriteCond %{HTTP_HOST} !^test\.example\.com$
RewriteRule ^(.*)$ http://test.example.com/$1 [R=301,L]
#
# Internally rewrite all requests to /cgi-bin/poems.cgi with requested URL-path
# appended as query string parameter "pathname" (unless already done)
RewriteCond $1 !^cgi-bin/poems\.cgi$
RewriteRule ^(.*)$ /cgi-bin/poems.cgi?pathname=$1 [L]

Jim

DeepUnderground

8:55 am on Sep 19, 2009 (gmt 0)

10+ Year Member



Thanks Jim.

I no longer get the error and the website does load; however, it's completely unformatted, so I assume it's not calling the CSS file properly or something.

[test.example.com...]

and this is the main website (how it should look except for the test site has a bright red background so I don't get confused)..

http://www.example.com/cgi-bin/poems.cgi

The CSS file (and a number of other files) is in a folder in the root.

[edited by: jdMorgan at 11:14 pm (utc) on Sep. 19, 2009]
[edit reason] example.com [/edit]

DeepUnderground

8:54 pm on Sep 19, 2009 (gmt 0)

10+ Year Member



Also, I'm not trying to direct the actions (like poems.cgi?action=latest) to test.example.com. I only want the main home page, i.e. poems.cgi without any action after it, to go there.

The actions I deal with separately so they will take the form [test.example.com...]

[edited by: jdMorgan at 11:15 pm (utc) on Sep. 19, 2009]
[edit reason] example.com [/edit]

jdMorgan

11:19 pm on Sep 19, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Please do not post your domain -- we don't want to rank for your domain, and you won't like that result either, especially if your server isn't working properly yet...

Your formatting is broken because all requests are being rewritten to your script, as I warned about above, and as your original rule would have done, had it worked.

If you want only your 'home page' at "example.com/" to be rewritten to your script, then the last rule needs to change:


# Internally rewrite home page requests to /cgi-bin/poems.cgi
RewriteRule ^$ /cgi-bin/poems.cgi [L]

Jim

DeepUnderground

11:48 pm on Sep 19, 2009 (gmt 0)

10+ Year Member



thanks so much, it appears to work great now.

[edited by: DeepUnderground at 11:58 pm (utc) on Sep. 19, 2009]

DeepUnderground

11:52 pm on Sep 19, 2009 (gmt 0)

10+ Year Member



final thing (hopefully).

How do I ensure it rewrites URLs without a trailing slash to have one, in order to work with other URLs like

RewriteRule ^forums/(.*)/$ poems.cgi?action=forum&board=$1

but not interfere with ones set up to work from html like

RewriteRule ^stuff\.html$ poems.cgi?action=popular&read=stuff

(I had to make a new rule for each html page I wanted; otherwise you could go to anything.html and it would pass "anything" into the Perl file as an input and display it on a page.)

DeepUnderground

12:45 am on Sep 20, 2009 (gmt 0)

10+ Year Member



oddly
[test.example.com...]
works but
[test.example.com...]
doesn't. However
[test.example.com...]
works and so does
[test.example.com...]

jdMorgan

3:57 am on Sep 20, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Slashes and "one rule for each 'html' page":

None of the rules I posted care about slashes (or the lack thereof). That means the problem is in your script.

Making one rule for each "page" is a non-scalable solution. A better approach is to rewrite all .html requests to the script, and have the script check to see if that's a valid "page." If so, generate the page, and if not, then output a 404-Not Found Status response header (or a 301-Moved Permanently redirect to the correct URL) and exit.
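As a sketch of that approach, reusing the poems.cgi script and the parameter names already shown in this thread (adjust to taste):

```apache
# Sketch: route every one-segment ".html" request to the script and let
# the script decide whether the requested page actually exists.
# The script then serves the page, or emits a 404 (or a 301) and exits.
RewriteRule ^([^/.]+)\.html$ /cgi-bin/poems.cgi?action=popular&read=$1 [L]
```

This replaces the per-page rules with one rule, and moves the "is this a real page?" check into the script, where the list of valid pages already lives.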

Jim

DeepUnderground

9:59 am on Sep 20, 2009 (gmt 0)

10+ Year Member



Thanks

How do I change the following to be one rule

RewriteRule ^read/(.*)/$ poems.cgi?action=read&id=$1 [L]
RewriteRule ^read/(.*)$ poems.cgi?action=read&id=$1 [L]

regards

amy

jdMorgan

3:31 pm on Sep 20, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Make the trailing slash optional by following it with the regular-expressions "zero or one" quantifier "?".
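Applied to the two /read/ rules above, that collapses them into one; the "/?" makes the trailing slash optional (using [^/]+ instead of .* here is an extra assumption that keeps the id to a single path segment):

```apache
# One rule matching both "/read/123" and "/read/123/"
RewriteRule ^read/([^/]+)/?$ poems.cgi?action=read&id=$1 [L]
```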

See the regular-expressions tutorial cited in our Forum Charter. It would be very worthwhile spending a week studying it, even though it'll only take an hour... Regular-expressions are used in many Apache modules and in virtually all scripting languages. It's well worth the investment of your time to become conversant with them...

Jim

g1smd

2:00 am on Sep 21, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Note that allowing two different URLs to trigger the same rewrite means that you now have a Duplicate Content problem. You should redirect one URL, and rewrite the other.
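A sketch of that redirect-one/rewrite-the-other pattern, using the /read/ rule from earlier in the thread (test.example.com stands in for the real hostname; the slash form is assumed canonical here):

```apache
# Externally redirect the no-slash form to the canonical slash form...
RewriteRule ^read/([^/]+)$ http://test.example.com/read/$1/ [R=301,L]
#
# ...then internally rewrite only the canonical form to the script.
RewriteRule ^read/([^/]+)/$ poems.cgi?action=read&id=$1 [L]
```

The redirect must come before the rewrite so that a no-slash request is sent back to the client as a 301 rather than silently served as duplicate content.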

jdMorgan

2:20 am on Sep 21, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Not necessarily, because we've already discussed how the script must validate the request (four posts above)...

Jim