For example, I have this URL being indexed once for each page:
www.example.com/cat-m-77.html?page=1&sort=2a
www.example.com/cat-m-77.html?page=2&sort=2a
I also have query strings beginning with ?listing, ?filter_id, and so on.
I want a rule generic enough to catch all of them; in the example above, it would redirect to:
www.example.com/cat-m-77.html
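One generic approach — a sketch only, assuming no query string is ever useful to spiders on these .html pages — is to redirect any .html request that carries any query string back to the bare URL. The trailing `?` in the substitution is what discards the query string:

```apache
# Sketch (untested): if any query string is present on a .html URL,
# 301-redirect to the same URL with the query string stripped.
# Combine with a spider-detection RewriteCond (as in the rules below)
# if human visitors still need the parameters to work.
RewriteCond %{QUERY_STRING} .
RewriteRule ^(.+\.html)$ /$1? [R=301,L]
```

This is coarser than parameter-by-parameter removal, but it only needs one rule.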
Make sense? I currently have this section in my .htaccess file, which I was hoping to expand to cover the above:
#
# Skip the next two rewriterules if NOT a spider
RewriteCond %{HTTP_USER_AGENT} !(msnbot|slurp|googlebot) [NC]
RewriteRule .* - [S=2]
#
# case: leading and trailing parameters
RewriteCond %{QUERY_STRING} ^(.+)&osCsid=[0-9a-z]+&(.+)$ [NC]
RewriteRule (.*) $1?%1&%2 [R=301,L]
#
# case: leading-only, trailing-only or no additional parameters
RewriteCond %{QUERY_STRING} ^(.+)&osCsid=[0-9a-z]+$|^osCsid=[0-9a-z]+&?(.*)$ [NC]
</RewriteCond>
RewriteRule (.*) $1?%1 [R=301,L]
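To extend this beyond osCsid, one option — a sketch, using the parameter names from the example URLs earlier in the thread — is to swap the osCsid-specific regex for a non-capturing alternation, so %1 and %2 keep their original numbering:

```apache
# case: leading and trailing parameters
# (?:...) is non-capturing, so %1 and %2 still refer to the kept parts
RewriteCond %{QUERY_STRING} ^(.+)&(?:page|sort|listing|filter_id)=[^&]*&(.+)$ [NC]
RewriteRule (.*) $1?%1&%2 [R=301,L]
#
# case: leading-only, trailing-only or no additional parameters
RewriteCond %{QUERY_STRING} ^(.+)&(?:page|sort|listing|filter_id)=[^&]*$|^(?:page|sort|listing|filter_id)=[^&]*&?(.*)$ [NC]
RewriteRule (.*) $1?%1 [R=301,L]
```

Note that each pass strips one parameter; because R=301 issues a new request, a URL carrying several of these parameters will be cleaned up over a short chain of redirects rather than in a single hop.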
Thank you for any input...
User-agent: Googlebot
Disallow: *sort=
Disallow: *osCsid=
and so on.
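Put together, a fuller version covering the parameters mentioned above might look like this (a sketch; wildcard patterns in Disallow are a Googlebot extension, not part of the original robots.txt standard, and Google treats a leading `*` as `/*`):

```
User-agent: Googlebot
Disallow: /*page=
Disallow: /*sort=
Disallow: /*listing
Disallow: /*filter_id
Disallow: /*osCsid=
```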
Once Google had dropped the vast majority of the "duff" URLs from its index, it was then easy to see the final few that needed to be handled by altering the site scripting.