I use osCommerce for my store. I have a contribution that transforms the dynamic URLs into static pages. This change takes place mainly in the .htaccess file, which I have included below this message. However, these pages can be sorted by price, etc., and the listings span multiple pages, which produces URLs like thispage.html?page=2 or something similar.
I want all spiders to be redirected back to the base URL (for example, thispage.html) whenever they request any URL like "thispage.html?whatever".
I think I have the code for this correct in the .htaccess, but I am not sure whether it will conflict with the code above it. I don't want to get caught in a loop where Google will not obey my original rewrites to the static pages.
Could someone check my .htaccess for me?
Thank you so much in advance!
DirectoryIndex index.php
AddType text/html asp
ErrorDocument 404 /
ErrorDocument 403 /v-web/errdocs/403.html
ErrorDocument 401 /v-web/errdocs/401.html
ErrorDocument 500 /v-web/errdocs/500.html
ErrorDocument 400 /v-web/errdocs/400.html
Options +FollowSymLinks
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_HOST} !^www\.example\.com [NC]
RewriteRule ^(.*) http://www.example.com/$1 [L,R=permanent]
#change dynamic pages to static html
RewriteRule ^(.*)-p-(.*).html$ product_info.php?products_id=$2&%{QUERY_STRING}
RewriteRule ^(.*)-c-(.*).html$ index.php?cPath=$2&%{QUERY_STRING}
RewriteRule ^(.*)-m-(.*).html$ index.php?manufacturers_id=$2&%{QUERY_STRING}
RewriteRule ^(.*)-pi-(.*).html$ popup_image.php?pID=$2&%{QUERY_STRING}
RewriteRule ^(.*)-t-(.*).html$ articles.php?tPath=$2&%{QUERY_STRING}
RewriteRule ^(.*)-a-(.*).html$ article_info.php?articles_id=$2&%{QUERY_STRING}
RewriteRule ^(.*)-pr-(.*).html$ product_reviews.php?products_id=$2&%{QUERY_STRING}
RewriteRule ^(.*)-pri-(.*).html$ product_reviews_info.php?products_id=$2&%{QUERY_STRING}
RewriteRule ^(.*)-i-(.*).html$ information.php?info_id=$2&%{QUERY_STRING}
# redirect search engine spider requests which include a query string to same url with blank query string
RewriteCond %{HTTP_USER_AGENT} ^FAST(-(Real)?WebCrawler/¦\ FirstPage\ retriever) [OR]
RewriteCond %{HTTP_USER_AGENT} ^Gigabot/ [OR]
RewriteCond %{HTTP_USER_AGENT} ^Googlebot(-Image)?/[0-9]\.[0-9]{1,2} [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mediapartners-Google/[0-9]\.[0-9]{1,2} [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/.*(Ask\ Jeeves¦Slurp/¦ZealBot¦Zyborg/) [OR]
RewriteCond %{HTTP_USER_AGENT} ^msnbot/ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Overture-WebCrawler/ [OR]
RewriteCond %{HTTP_USER_AGENT} ^Robozilla/ [OR]
RewriteCond %{HTTP_USER_AGENT} ^(Scooter/¦Scrubby/) [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teoma
RewriteCond %{query_string} .
RewriteRule (.*) http://www.example.com/$1? [R=301,L]
[edited by: jdMorgan at 11:50 pm (utc) on Oct. 18, 2005]
[edit reason] Example.com [/edit]
# change dynamic pages to static html
RewriteRule ^([^-]+)-p-([^.]+)\.html$ product_info.php?products_id=$2&%{QUERY_STRING} [L]
RewriteRule ^([^-]+)-c-([^.]+)\.html$ index.php?cPath=$2&%{QUERY_STRING} [L]
RewriteRule ^([^-]+)-m-([^.]+)\.html$ index.php?manufacturers_id=$2&%{QUERY_STRING} [L]
RewriteRule ^([^-]+)-pi-([^.]+)\.html$ popup_image.php?pID=$2&%{QUERY_STRING} [L]
RewriteRule ^([^-]+)-t-([^.]+)\.html$ articles.php?tPath=$2&%{QUERY_STRING} [L]
RewriteRule ^([^-]+)-a-([^.]+)\.html$ article_info.php?articles_id=$2&%{QUERY_STRING} [L]
RewriteRule ^([^-]+)-pr-([^.]+)\.html$ product_reviews.php?products_id=$2&%{QUERY_STRING} [L]
RewriteRule ^([^-]+)-pri-([^.]+)\.html$ product_reviews_info.php?products_id=$2&%{QUERY_STRING} [L]
RewriteRule ^([^-]+)-i-([^.]+)\.html$ information.php?info_id=$2&%{QUERY_STRING} [L]
#
# redirect search engine spider requests which include a query string to same url with blank query string
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /(product_info¦index¦<...>¦product_reviews_info¦information)\.php\?[^\ ]+\ HTTP/
RewriteCond %{HTTP_USER_AGENT} ^FAST(-(Real)?WebCrawler/¦\ FirstPage\ retriever) [OR]
RewriteCond %{HTTP_USER_AGENT} ^Gigabot/ [OR]
RewriteCond %{HTTP_USER_AGENT} ^Googlebot(-Image)?/[0-9]\.[0-9]{1,2} [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mediapartners-Google/[0-9]\.[0-9]{1,2} [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/.*(Ask\ Jeeves¦Slurp/¦ZealBot¦Zyborg/) [OR]
RewriteCond %{HTTP_USER_AGENT} ^msnbot/ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Overture-WebCrawler/ [OR]
RewriteCond %{HTTP_USER_AGENT} ^Robozilla/ [OR]
RewriteCond %{HTTP_USER_AGENT} ^(Scooter/¦Scrubby/) [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teoma
RewriteRule ^([^.]+\.php)$ http://www.example.com/$1? [R=301,L]
In the second ruleset, a method must be used that will discriminate between requested URLs and rewritten URLs, in order to prevent the second ruleset from redirecting URLs that the first ruleset has already rewritten. The key to this is the variable %{THE_REQUEST}, which always contains the original client (spider or browser) request line and is not changed by internal rewrites. The pattern there should contain the entire list of your PHP pages -- I left some out and replaced them with "<...>" to prevent the line from wrapping on the screen.
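To see why testing the original request line keeps the two rulesets apart, here is a rough Python sketch of the same check. It is only an illustration, not what Apache runs internally. The script list is an assumption taken from the rewrite targets in the rules above, and the function names and sample URLs are made up for the example.

```python
import re

# Assumption: these are the scripts that appear as rewrite targets in the
# ruleset above; substitute the full list from your own site.
PHP_PAGES = (
    "product_info", "index", "popup_image", "articles", "article_info",
    "product_reviews", "product_reviews_info", "information",
)

# Python equivalent of the THE_REQUEST condition. In .htaccess the spaces
# must be backslash-escaped; in a Python pattern they can be written plainly.
DIRECT_PHP_REQUEST = re.compile(
    r"^[A-Z]{3,9} /(" + "|".join(PHP_PAGES) + r")\.php\?[^ ]+ HTTP/"
)

# One of the static-URL patterns from the first ruleset.
STATIC_PRODUCT = re.compile(r"^([^-]+)-p-([^.]+)\.html$")


def internal_rewrite(path):
    """Map a static product URL to its script, as the first ruleset does."""
    m = STATIC_PRODUCT.match(path)
    return "product_info.php?products_id=" + m.group(2) if m else None


def is_direct_php_request(request_line):
    """True only when the client itself asked for a .php URL with a query."""
    return DIRECT_PHP_REQUEST.match(request_line) is not None


# A spider asks for the static URL. The path is rewritten internally,
# but the request line (THE_REQUEST) still shows the .html URL.
static_request = "GET /widget-p-12.html?page=2 HTTP/1.1"
rewritten = internal_rewrite("widget-p-12.html")

# A spider asks for the dynamic URL directly.
dynamic_request = "GET /product_info.php?products_id=12&page=2 HTTP/1.1"

print(rewritten)
print(is_direct_php_request(static_request))
print(is_direct_php_request(dynamic_request))
```

Only the direct request for the .php URL satisfies the condition, so a URL that was rewritten from a static .html address is never redirected a second time.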
Replace all instances of the broken pipe "¦" character above with a solid pipe character before use. Posting on this forum modifies those characters.
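If you would rather not fix the characters by hand, a small script can do the substitution. This is a minimal sketch that assumes the forum's replacement character is the Unicode broken bar (U+00A6); the function name is made up for the example.

```python
BROKEN_BAR = "\u00a6"  # the broken-pipe character substituted by the forum


def restore_pipes(text):
    """Replace every broken-bar character with a solid pipe."""
    return text.replace(BROKEN_BAR, "|")


# Sample line copied from the ruleset above, with the broken bar written
# as an escape so the script does not depend on the file's encoding.
sample = "RewriteCond %{HTTP_USER_AGENT} ^(Scooter/\u00a6Scrubby/) [OR]"
fixed = restore_pipes(sample)
print(fixed)
```

Run the whole copied ruleset through the function before pasting it into your .htaccess, then check that no broken bars remain.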
Jim