# 3 words
RewriteCond %{QUERY_STRING} keywords=([^+]+)\+([^+]+)\+([^+]+)
RewriteRule ^search.cgi /%1-%2-%3.shtml? [R=301,L]
RewriteCond %{QUERY_STRING} keywords=([^%]+)\%20([^%]+)\%20([^%]+)
RewriteRule ^search.cgi /%1-%2-%3.shtml? [R=301,L]
# 2 words
RewriteCond %{QUERY_STRING} keywords=([^+]+)\+([^+]+)
RewriteRule ^search.cgi /%1-%2.shtml? [R=301,L]
RewriteCond %{QUERY_STRING} keywords=([^%]+)\%20([^%]+)
RewriteRule ^search.cgi /%1-%2.shtml? [R=301,L]
# 1 word
RewriteCond %{QUERY_STRING} keywords=([^+]+)
RewriteRule ^search.cgi /%1.shtml? [R=301,L]
[dmoz.org...]
1. If you type something into my search box with one or more capitals you get an error. So the person's entry needs to be cleaned of capitals.
2. If someone types out a URL with a capital and Google indexes it and the version with no capital then I have a duplicate URL problem.
1. If you type something into my search box with one or more capitals you get an error. So the person's entry needs to be cleaned of capitals.
Your search engine is very limited if it cannot handle upper case letters! A simple stringtolower command (in the search script) and that's fixed.
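If the script itself cannot be touched, the same lowercasing could probably be done in the redirect instead. A minimal sketch, assuming you have access to the server or virtual-host configuration (RewriteMap cannot be declared in .htaccess); the map name lc is an arbitrary choice, and only the one-word rule is shown:

```apache
# Server or virtual-host config only: RewriteMap is not permitted in .htaccess.
# "lc" is an arbitrary name for mod_rewrite's built-in lowercase function.
RewriteMap lc int:tolower

# .htaccess: the one-word redirect from above, with the keyword lowercased
# before it becomes part of the .shtml URL. The dot in search.cgi is escaped
# so it matches a literal dot.
RewriteCond %{QUERY_STRING} keywords=([^+]+)
RewriteRule ^search\.cgi /${lc:%1}.shtml? [R=301,L]
```

The two- and three-word rules would need the same ${lc:...} wrapper around each backreference.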
2. If someone types out a URL with a capital and Google indexes it and the version with no capital then I have a duplicate URL problem.
Google cannot index URLs that are formed from user searches via a form. Just because you may be using a GET method on your form (bad) and you see the parameters in the address field of the browser doesn't mean Google is indexing them.
I used to have the rule rewriting to /gift-baskets/ to clean the URL, and all was fine.
Changing it to /gift-baskets.shtml now causes a problem that comes from the rewrite code, not the search engine. The engine in question is a separate application pulling from other places. The query is taken raw from the form and rewritten to clean the URL. Then another .htaccess file in the www root converts it back into a cgi request for processing, so that the URL remains clean. Please look at the original .htaccess above.
2. I am not using a GET method, bad. I meant that if someone actually put the URL with a capital on one of their web pages and Google then spidered it, that would be problematic.
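One way the duplicate-URL case might be handled is to 301 any mixed-case .shtml request to its lowercase form, so only one version can ever be indexed. This is only a sketch: it assumes a lowercase map named lc (an arbitrary name) has been declared with RewriteMap in the server or virtual-host configuration, and that the rule lives in the .htaccess in the document root.

```apache
# Assumes "RewriteMap lc int:tolower" is declared in the server or
# virtual-host config; "lc" is an arbitrary map name.

# .htaccess in the document root: if the requested path contains any
# capital letter, redirect permanently to the all-lowercase .shtml URL.
RewriteCond %{REQUEST_URI} [A-Z]
RewriteRule ^(.+)\.shtml$ /${lc:$1}.shtml [R=301,L]
```

Once the redirect has fired, the lowercase URL no longer matches the condition, so it should not loop.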