AddType application/x-httpd-php5 .htm .html
RewriteEngine On
RewriteRule ^(guestbook_[0-9]+\.htm) http://www.example.com/reviews/$1 [R=301,L]
RewriteRule ^espanol/(libro_de_visitas_[0-9]+\.htm) http://www.example.com/espanol/opiniones/$1 [R=301,L]
RewriteRule ^svenska/(gastbok_[0-9]+\.htm) http://www.example.com/svenska/kommentarer/$1 [R=301,L]
RewriteRule ^(z3originalguestbook_[0-9]+\.htm) http://www.example.com/reviews/$1 [R=301,L]
RewriteRule ^espanol/(z3original_[0-9]+\.htm) http://www.example.com/espanol/opiniones/$1 [R=301,L]
RewriteRule ^svenska/(z3originalgastbok_[0-9]+\.htm) http://www.example.com/svenska/kommentarer/$1 [R=301,L]
# REDIRECT htm INDEX PAGES to index/
RewriteCond %{SERVER_PORT} !^443$
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.html?\ HTTP/
RewriteRule ^(([^/]+/)*)index\.html?$ http://www.example.com/$1 [R=301,L]
RewriteCond %{SERVER_PORT} ^443$
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.html?\ HTTP/
RewriteRule ^(([^/]+/)*)index\.html?$ https://www.example.com/$1 [R=301,L]
# Get rid of extra path info such as example.com/pagina1.htm/maps/ etc
RewriteRule ^((?:[^./]+/)*[^./]+\.(?:html?|php))/ http://www.example.com/$1 [R=301,L]
# Redirect non-canonical to www
RewriteCond %{SERVER_PORT} !^443$
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
RewriteCond %{SERVER_PORT} ^443$
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule (.*) https://www.example.com/$1 [R=301,L]
AddType 'text/css; charset=UTF-8' css
<Files ~ "\.(log)$">
order allow,deny
deny from all
</Files>
<FilesMatch "\.(pl|txt|htm|html|[sf]?cgi|spl)$">
Header set Cache-Control: "max-age=7200"
</FilesMatch>
<filesMatch "\.(htm|html|css|js)$">
What more to do?
[edited by: aakk9999 at 12:48 pm (utc) on Dec 22, 2013]
[edit reason] Exemplified [/edit]
How many such URLs are you getting?
I think you have two issues:
1) Find out where and how these URLs are created, and fix the problem at its source.
2) For URLs of this kind that have already been indexed, either redirect them to the canonical URL or return 404/410. That is the part your .htaccess should handle.
Currently no rule in your .htaccess deals with these URLs.
<off topic>
<Files ~ "\.(log)$">
</off>
I thought Regular Expressions only worked in FilesMatch :(
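For reference, Apache's documentation describes both spellings: <Files> accepts a regular expression when the pattern is preceded by a tilde, and <FilesMatch> is the form it recommends. A minimal sketch of the same block in the FilesMatch spelling, assuming the Apache 2.2-style access directives already used in the posted file:

```apache
# Deny direct access to any file whose name ends in .log
# (same effect as <Files ~ "\.(log)$"> in the posted .htaccess)
<FilesMatch "\.log$">
    Order allow,deny
    Deny from all
</FilesMatch>
```

On Apache 2.4 without mod_access_compat, the two access lines would be replaced by "Require all denied".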
One of the standard bits of advice is: First explain in English what you want to do.
RewriteCond %{QUERY_STRING} z=any&t=anytype&d=1&day=08&month=2014-06&day2=01&month2=2013-12&e=Search
RewriteRule ^sales/index\.htm$ /sales/? [L,R=301]
That seems very tightly constrained. Is that the exact, literal text of the only query string you ever get? Even if you replaced each value with \d+ or 20\d\d-\d\d and so on, how many matches would you get?
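A loosened version of that condition might look like the sketch below. The value patterns are guesses based only on the one sample query string quoted above (two-digit days, yyyy-mm months, free text for z and t), so adjust them to whatever the search form really sends.

```apache
# Sketch only: same parameter order as the sample query string,
# but each value is matched by shape instead of as one literal.
RewriteCond %{QUERY_STRING} ^z=[^&]*&t=[^&]*&d=\d+&day=\d\d&month=20\d\d-\d\d&day2=\d\d&month2=20\d\d-\d\d&e=Search$
# The trailing ? on the target drops the query string from the redirect.
RewriteRule ^sales/index\.htm$ /sales/? [R=301,L]
```

A pattern anchored like this still fails if the parameters arrive in a different order or one of them is missing, so it is worth checking the logs for the variations that really occur.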
Are you also trying to get rid of all the "document.body.etcetera" garbage? What happens if you try:
RewriteCond %{THE_REQUEST} document\.body
RewriteRule {URL-that-gets-this-parameter} - [R=404,L]
I have tried:
espanol/ventas/ espanol/ventas/
with no anchors should cover all possible forms. (Does it make your skin crawl to spell it without the tilde? It does to me!)
Try putting it at the very beginning of all RewriteRules, right after RewriteEngine On. It isn't the ideal location for a 404, but it means no other rule has a chance to get involved.
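Put together, that placement could look like the sketch below. The catch-all pattern ^ is an assumption standing in for {URL-that-gets-this-parameter}; narrow it if only certain URLs receive the junk.

```apache
RewriteEngine On

# Runs first, so no later redirect can touch these requests.
# THE_REQUEST holds the raw request line sent by the client,
# so the string is tested exactly as it arrived.
RewriteCond %{THE_REQUEST} document\.body
# "-" means no substitution; R=404 returns Not Found and stops rewriting.
RewriteRule ^ - [R=404,L]

# ... the existing guestbook, index and www redirects follow here ...
```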
RewriteCond %{QUERY_STRING} !^(id=[0-9]*)?$
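Without the leading "!", that pattern matches a query string that is exactly "id=some-number", or "id=" (no value), or "" (nothing at all). With the "!", the condition is true for every other query string. Is that what your query string is supposed to be?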
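To make the effect of the leading ! concrete, here is a sketch of how the negated condition behaves. The sales/index.htm pattern and the /sales/ target are borrowed from the rule quoted earlier in the thread; treating them as the rule this condition belongs to is an assumption.

```apache
# Condition is TRUE (rule fires) for query strings such as:
#   z=any&t=anytype      foo=bar      id=12&x=1      id=abc
# Condition is FALSE (rule skipped) for:
#   (empty)      id=      id=12
RewriteCond %{QUERY_STRING} !^(id=[0-9]*)?$
# The trailing ? clears the query string, so the redirected request
# has an empty query and does not trigger the rule again.
RewriteRule ^sales/index\.htm$ /sales/? [R=301,L]
```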
But wait. All your pages are really php, aren't they? I seem to remember you parse everything for php even if it's got an htm(l) extension. Might it be easier for a single php script to read the query string and issue a 301 redirect if appropriate? I assume you've already got something that issues a 404 if a parameter value is wrong.