Unintentional site effects of friendly URI rewrites

Wayder

8:18 pm on Oct 26, 2007 (gmt 0)

10+ Year Member Top Contributors Of The Month



(Apache/1.3.37)

I have written the following to create friendly URIs. It works for the URIs that I intended to rewrite, and I am pretty happy with that. However, it also has unintended effects on other URIs.

My .htaccess is as follows:
AddHandler application/x-httpd-php .htm
ErrorDocument 404 /ErrorPage.htm

RewriteEngine on
Options +FollowSymLinks

## Rewrite index or subdirectory/index to domain.com/ or domain.com/subdirectory/
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.htm\ HTTP/
RewriteRule ^(([^/]+/)*)index\.htm$ http://www.example.com/$1 [R=301,L]

## Rewrite non www to www
RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# Rewrite incoming
RewriteRule ^prod/([^.]+).htm$ index.htm?product=$1 [L]
RewriteRule ^([^/]+)/index-([^/]+).htm$ index.htm?cat0=$1&page=$2 [L]
RewriteRule ^([^/]+)/index.htm$ index.htm?cat0=$1 [L]
RewriteRule ^([^/]+)/$ index.htm?cat0=$1 [L]

RewriteRule ^([^/]+)/([^/]+)/index-([^/]+).htm$ index.htm?cat0=$1&cat1=$2&page=$3 [L]
RewriteRule ^([^/]+)/([^_]+)_([^/]+).htm$ index.htm?cat0=$1&cat1=$2&page=$3 [L]
RewriteRule ^([^/]+)/([^/]+)/$ index.htm?cat0=$1&cat1=$2 [L]
RewriteRule ^([^/]+)/([^/]+).htm$ index.htm?cat0=$1&cat1=$2 [L]

RewriteRule ^privacy.htm$ /index.htm?privacy [L]
RewriteRule ^contact.htm$ /index.htm?contact [L]

I have tested it with the following URIs and get the following results.

The 'System 404' error is:
The requested URL [URI HERE] was not found on this server. Additionally, a 404 Not Found error was encountered while trying to use an ErrorDocument to handle the request.
so it's not finding my ErrorPage.htm either :(

Below is the list of URIs that I tested, showing what the result should be and what the result actually is.

The format is as follows:
URI ¦ Result should be ¦ Result is

http://www.example.com/ ¦ OK ¦ OK
http://www.example.com/privacy.htm ¦ OK ¦ OK
http://www.example.com/contact.htm ¦ OK ¦ OK
http://www.example.com/index.htm ¦ Redirect to / ¦ Redirect to /
http://www.example.com/index.php ¦ Custom 404 ¦ System 404
http://www.example.com/prod/ ¦ Custom 404 ¦ OK
http://www.example.com/prod/index.htm ¦ Custom 404 ¦ OK
http://www.example.com/prod/index.php ¦ Custom 404 ¦ System 404
http://www.example.com/prod/index-2.htm ¦ Custom 404 ¦ OK
http://www.example.com/prod/index-2.php ¦ Custom 404 ¦ System 404
http://www.example.com/prod/354.htm ¦ OK ¦ OK
http://www.example.com/prod/345.php ¦ Custom 404 ¦ System 404
http://www.example.com/cat0/ ¦ OK ¦ OK
http://www.example.com/cat0/index.htm ¦ Redirect to /cat0/ ¦ Redirect to /cat0/
http://www.example.com/cat0/index.php ¦ Custom 404 ¦ System 404
http://www.example.com/cat0/index-2.htm ¦ OK ¦ OK
http://www.example.com/cat0/index-2.php ¦ Custom 404 ¦ System 404
http://www.example.com/cat0/345.htm ¦ Custom 404 ¦ OK
http://www.example.com/cat0/345.php ¦ Custom 404 ¦ System 404
http://www.example.com/cat0/cat1/ ¦ OK ¦ OK
http://www.example.com/cat0/cat1/index.htm ¦ Redirect to /cat0/cat1/ ¦ Redirect to /cat0/cat1/
http://www.example.com/cat0/cat1/index.php ¦ Custom 404 ¦ System 404
http://www.example.com/cat0/cat1/index-2.htm ¦ OK ¦ OK
http://www.example.com/cat0/cat1/index-2.php ¦ Custom 404 ¦ System 404
http://www.example.com/cat0/cat1/345.htm ¦ Custom 404 ¦ System 404
http://www.example.com/cat0/cat1/345.php ¦ Custom 404 ¦ System 404

If I can get them all to work as intended using .htaccess, that would be perfect. If not, I could allow them through, catch them in my scripts, and return a 404 there.

Can anyone help me with this please?

Edit: cat0 and cat1 are variables, normally formatted like 'this-category'.

[edited by: Wayder at 8:21 pm (utc) on Oct. 26, 2007]

Wayder

9:06 pm on Oct 26, 2007 (gmt 0)


Darn it, I hate it when I do stupid stuff.

All the System 404 errors were because I didn't upload my ErrorPage.htm. Grrrrrrr.....

Only 4 errors are left to solve, and I think I can handle them in the scripts, although if anyone knows how to do it in .htaccess, I would really appreciate knowing about it.

Errors now are:

The format is as follows:
URI ¦ Result should be ¦ Result is

http://www.example.com/prod/ ¦ Custom 404 ¦ OK
http://www.example.com/prod/index.htm ¦ Custom 404 ¦ OK
http://www.example.com/prod/index-2.htm ¦ Custom 404 ¦ OK
http://www.example.com/cat0/345.htm ¦ Custom 404 ¦ OK

Thanks

jdMorgan

9:48 pm on Oct 26, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



/prod/ is rewritten by rule #6, and so will not 404.
/prod/index.htm is rewritten by rule #5, and so will not 404.
/prod/index-2.htm is rewritten by rule #4, and so will not 404.
/cat0/345.htm is rewritten by rule #10, and so will not 404.

Clearly the code is working correctly as it is written. It's just not written to do exactly what you want...

Also (unrelated to your problem) your patterns need the literal periods escaped. Example:


# Rewrite incoming
RewriteRule ^prod/([^.]+)\.htm$ index.htm?product=$1 [L]

Completely flush your browser cache before testing any change to this config code.

Jim
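
One way to stop the category rules from catching /prod/ requests, as described in the rule-by-rule breakdown above, is to put a negative RewriteCond in front of each category rule. This is only a sketch: it assumes the .htaccess file sits in the document root (so REQUEST_URI starts with /prod/), that no category is ever named "prod", and that no real /prod/ directory exists on disk. A RewriteCond applies only to the single RewriteRule that follows it, so the condition has to be repeated for each rule.

```apache
# Sketch only. Assumes .htaccess is in the document root and that
# "prod" is never a valid category name.

# Product pages keep their own rule.
RewriteRule ^prod/([^.]+)\.htm$ index.htm?product=$1 [L]

# Category rules: skip them when the request is under /prod/.
RewriteCond %{REQUEST_URI} !^/prod/
RewriteRule ^([^/]+)/index-([^/]+)\.htm$ index.htm?cat0=$1&page=$2 [L]

RewriteCond %{REQUEST_URI} !^/prod/
RewriteRule ^([^/]+)/$ index.htm?cat0=$1 [L]
```

With conditions like these, a request for /prod/ matches no rule and falls through to the normal 404 handling. A request such as /prod/index-2.htm still matches the product pattern itself, so it would need to be rejected by the script or by a stricter pattern.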

Wayder

10:51 pm on Oct 26, 2007 (gmt 0)



Thanks Jim,

Added escapes for literal periods.
Added trailing-slash code.
Rewrote and re-ordered the RewriteRules.

I am left with one "error". However, I don't think I can handle this with Apache, because the problematic URI is constructed exactly the same way as a valid URI, so I think I will have to deal with this in the scripts.

With that I will be quite satisfied, UNLESS you know that I have created a problem in my .htaccess code.

my .htaccess now:

AddHandler application/x-httpd-php .htm
ErrorDocument 404 /ErrorPage.htm

RewriteEngine on
Options +FollowSymLinks

## Rewrite index or subdirectory/index to domain.com/ or domain.com/subdirectory/
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.htm\ HTTP/
RewriteRule ^(([^/]+/)*)index\.htm$ http://www.example.com/$1 [R=301,L]

## Rewrite non www to www
RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

## Add trailing slash
# If the requested URI does not contain a period in the final path-part
RewriteCond %{REQUEST_URI} !\.[^./]+$
# If requested URI doesn't contain a trailing slash
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ http://www.example.com/$1/ [L,R=301]

# Rewrite all incoming links
RewriteRule ^prod/([^.]+)\.htm$ index.htm?product=$1 [L]
RewriteRule ^prod/$ ErrorPage.htm [R=404, L]
RewriteRule ^([^/]+)/([^/]+)/index-([^/]+)\.htm$ index.htm?cat0=$1&cat1=$2&page=$3 [L]
RewriteRule ^([^/]+)/index-([^/]+)\.htm$ index.htm?cat0=$1&page=$2 [L]
RewriteRule ^([^/]+)/index\.htm$ index.htm?cat0=$1 [L]
RewriteRule ^([^/]+)/([^/]+)/$ index.htm?cat0=$1&cat1=$2 [L]
RewriteRule ^([^/]+)/$ index.htm?cat0=$1 [L]

RewriteRule ^privacy\.htm$ /index.htm?privacy [L]
RewriteRule ^contact\.htm$ /index.htm?contact [L]

Can you see any problems in my code now?
Is there anything here that you think will bite me in the butt?

Thanks.
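
On the one remaining "error" mentioned above: if it is a URL such as /prod/index-2.htm, which has the same shape as a real product page, mod_rewrite can only tell the two apart if the product pattern is made stricter. The thread does not say which URL it is or what product identifiers look like, so the following is a sketch under the assumption that identifiers are purely numeric (as in the /prod/354.htm test URL). The name no-such-page.htm is made up and stands for any path with no file behind it.

```apache
# Sketch only. Assumes product identifiers consist of digits only.
RewriteRule ^prod/([0-9]+)\.htm$ index.htm?product=$1 [L]

# Anything else under /prod/ is sent to a path with no file behind it,
# so the ErrorDocument 404 page is served.
RewriteRule ^prod/ no-such-page.htm [L]
```

If identifiers can contain letters, this pattern would reject valid products, and the check belongs in the script instead.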

g1smd

11:19 pm on Oct 26, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Your first two comments do not accurately reflect what the following code block actually does:

## Rewrite index or subdirectory/index to domain.com/ or domain.com/subdirectory/

## Rewrite non www to www/

This is more accurate:

## Redirect index or subdirectory/index to www.domain.com/ or www.domain.com/subdirectory/

## Redirect non-www to www/

Wayder

11:32 pm on Oct 26, 2007 (gmt 0)



Thanks, Altered :)

I also moved the "add trailing slash" block above "rewrite non www to www/", because if a URL had both problems it would take two redirects to fix it in the old order but only one in the new.

I found an issue with trying to make
RewriteRule ^prod/$ ErrorPage.htm [L]

into a 404. I added the R=404 as an afterthought when I was posting.

The manual says that the target must be a "valid URL", which it is, and that I can use "codes in the range 300-400". So maybe I can't use a 404.

Does anyone know how to do this? That rewrite returns a 200 right now, and it shouldn't.

Thanks

jdMorgan

12:12 am on Oct 27, 2007 (gmt 0)



Simply rewrite it to a URL-path that does not exist, using the [L] flag (only).

Jim
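
The suggestion above can be sketched as follows. The R flag issues an external redirect, which is why only redirect status codes make sense with it; an internal rewrite to a path that has no file behind it lets Apache produce the 404 itself and serve the ErrorDocument page. The names no-such-page.htm and retired/ are hypothetical and used only for illustration.

```apache
# Internal rewrite to a path with no file behind it: Apache returns
# 404 and serves the ErrorDocument 404 page.
RewriteRule ^prod/$ no-such-page.htm [L]

# For comparison, two flags that set a status directly, with "-" as
# the substitution: [F] answers 403 Forbidden, [G] answers 410 Gone.
RewriteRule ^retired/ - [G]
```

Neither [F] nor [G] gives a 404, so they only fit cases where 403 or 410 is the status actually wanted.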

Wayder

6:39 pm on Oct 27, 2007 (gmt 0)



My final version FYI:-

AddHandler application/x-httpd-php .htm
ErrorDocument 404 /ErrorPage.htm

RewriteEngine on
Options +FollowSymLinks

## Rewrite index or subdirectory/index to www.domain.com/ or www.domain.com/subdirectory/
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.htm\ HTTP/
RewriteRule ^(([^/]+/)*)index\.htm$ http://www.example.com/$1 [R=301,L]

## Add trailing slash
# If the requested URI does not contain a period in the final path-part
RewriteCond %{REQUEST_URI} !\.[^./]+$
# If requested URI doesn't contain a trailing slash
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ http://www.example.com/$1/ [R=301,L]

## Rewrite non www to www/
RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

## Rewrite all incoming links
RewriteRule ^prod/([^.]+)\.htm$ index.htm?product=$1 [L]
RewriteRule ^prod/$ doesnotexist.htm [L]
RewriteRule ^([^/]+)/([^/]+)/index-([^/]+)\.htm$ index.htm?cat0=$1&cat1=$2&page=$3 [L]
RewriteRule ^([^/]+)/index-([^/]+)\.htm$ index.htm?cat0=$1&page=$2 [L]
RewriteRule ^([^/]+)/index\.htm$ index.htm?cat0=$1 [L]
RewriteRule ^([^/]+)/([^/]+)/$ index.htm?cat0=$1&cat1=$2 [L]
RewriteRule ^([^/]+)/$ index.htm?cat0=$1 [L]

RewriteRule ^privacy\.htm$ /index.htm?privacy [L]
RewriteRule ^contact\.htm$ /index.htm?contact [L]

Thanks for all your help.

g1smd

9:22 pm on Oct 27, 2007 (gmt 0)



You still missed the word Redirect in the comments.

Wayder

9:44 pm on Oct 27, 2007 (gmt 0)



True,

Thank you. Done :)