You need a 301 redirect that strips index.php from any request that includes the index filename.
You need a catch-all to 301 redirect all non-www requests to the www version.
Finally, you can do your rewrite:
RewriteRule ^subfolder/category/page1\.html$ /subfolder/index.php?g=category&page=1 [L]
If you skip the other steps, you leave your website open to Duplicate Content indexing.
The links on your website should also be updated to be in the /subfolder/category/page1.html format.
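Put together, the three steps might look like the sketch below. It assumes mod_rewrite is available, that the rules live in the .htaccess file in the site root, and that the site answers on www.example.com (a placeholder hostname). The generalised pattern in step 3, which accepts any category name and page number, is also an assumption; the rule quoted above covers only the single literal URL.

```apache
RewriteEngine On

# Step 1: 301 redirect direct client requests for index.php in any
# folder to the bare folder URL. Testing THE_REQUEST means the internal
# rewrite in step 3 does not trigger this redirect again.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]*/)*index\.php(\?[^\ ]*)?\ HTTP/
RewriteRule ^(([^/]*/)*)index\.php$ http://www.example.com/$1 [R=301,L]

# Step 2: catch-all 301 redirect from non-www to www (placeholder host).
RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

# Step 3: internal rewrite from the friendly URL to the script.
# Assumed generalisation: any category name and any numeric page.
RewriteRule ^subfolder/([^/]+)/page([0-9]+)\.html$ /subfolder/index.php?g=$1&page=$2 [L]
```

Because step 1 already names the www host, a request for example.com/subfolder/index.php should reach www.example.com/subfolder/ in a single redirect rather than two.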
[edited by: g1smd at 9:40 am (utc) on Sep. 28, 2008]
Take a simple case with two parameters, one with a three-digit value and one with a seven-digit value, plus some common extra fixes that are either necessary or highly desirable:
External URL Format: www.example.com/345/1234567
Internal Server Path: /index.php?cat=345&art=1234567
# Specify acceptable index/root file. You could have a static
# index.html or allow index.php without parameters for root:
DirectoryIndex index.html index.php

# Redirect to remove trailing period or comma from URL request
# with parameters, such as from forum with autolink, and force
# www to always be in the URL:
RewriteCond %{QUERY_STRING} ^(([^&]+&)*[^&]*)[.,]$
RewriteRule (.*) http://www.example.com/$1?%1 [R=301,L]

# Redirect to remove trailing period or comma from URL request
# with path, such as from forum with autolink, and force www:
RewriteRule ^(([^/]+/)*[^/]*)[.,]$ http://www.example.com/$1 [R=301,L]

# Redirect two-parameter-based index.php|html? or / URL request
# (with parameters in any order) to folder-based URL format, and
# force www to always be in URL:
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /(index\.(php|html?))?(\?[^\ ]*)\ HTTP/ [NC]
RewriteCond %{QUERY_STRING} &?cat=([0-9]{3})&?
RewriteCond %1>%{QUERY_STRING} ^([^>]+)>([^&]*&)*art=([0-9]{7})&?
RewriteRule ^(index\.(php|html?))?$ http://www.example.com/%1/%3? [R=301,L]

# Force all remaining requests for named index files to drop
# the index file filename, and force www:
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]*/)*index\.(html?|php)(\?[^\ ]*)?\ HTTP/
RewriteRule ^(([^/]*/)*)index\.(html?|php)$ http://www.example.com/$1 [R=301,L]

# General rule to force all non-www URLs to be www URLs.
# This rule must be the last one of the redirects:
RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

# Rewrite such that stray parameters or value names on any index.html?
# or any index.php URL or on / URL request always fail to the
# 404 page:
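# (THE_REQUEST is tested so that the internal rewrite further down
# is not caught when the .htaccess rules run again):
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /(index\.(php|html?))?\?[^\ ]+\ HTTP/ [NC]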
RewriteRule ^(index\.(php|html?))?$ /this.page.does.not.exist [L]

# Rewrite URL request: www.example.com/345/1234567 to internal
# path: /index.php?cat=345&art=1234567 to serve content:
RewriteRule ^([0-9]{3})/([0-9]{7})$ /index.php?cat=$1&art=$2 [L]

The website now responds directly with content as "200 OK" only for paths like / and /345/1234567. All other formats either redirect or, if a real file does not exist, fail to the 404 Error Page. It does this without having to do server-intensive !-d and !-f checks on the filesystem. Physical files like contact.html will still work.
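For contrast, the filesystem-check approach mentioned above usually looks something like the following. This is the common front-controller idiom, not code from this thread, and the path parameter name is made up for illustration.

```apache
# Every request that is not already a real file or directory is handed
# to the script, at the cost of filesystem checks on each request.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php?path=$1 [L,QSA]
```

The pattern-based rule avoids those checks because only URLs shaped like 345/1234567 are ever rewritten; everything else is left for the server to handle as usual.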
The site root can be implemented either as a static index.html page or by using index.php without any parameters. The script will also need to check that, when present, parameter values are acceptable, and fail to the internally script-generated 404 Error Page if not.
The rules in the .htaccess file stop certain URL requests from ever reaching the filesystem on the server. Such requests are redirected or fail to a 404 immediately.
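As a variation on rewriting to a page that does not exist, mod_rewrite can also return the 404 status directly. A minimal sketch, assuming a custom error page at /404.html (a hypothetical filename) and a server whose R flag accepts status codes outside the redirect range:

```apache
# Hypothetical error page location; adjust to the real one.
ErrorDocument 404 /404.html

# A client request for / or a named index file with any query string
# fails straight to 404. THE_REQUEST is tested rather than QUERY_STRING
# so that URLs already rewritten internally to index.php are not caught.
# With a status outside 300-399 the substitution is ignored, hence "-".
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /(index\.(php|html?))?\?[^\ ]+\ HTTP/ [NC]
RewriteRule ^(index\.(php|html?))?$ - [R=404,L]
```

If used, this would take the place of the rewrite to /this.page.does.not.exist, and like that rule it has to sit after the redirects and before the final content rewrite.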