
Forum Moderators: goodroi


block dynamic urls with product options using robots.txt?

How do I?

     
6:06 pm on Feb 23, 2009 (gmt 0)

10+ Year Member



Hey All;

I have a client whose CMS adds option codes to the URL when a user selects a product option.

Example:
- this is the product page:
mysite.com/V2/productdetails.php?id=1142 (where id is the product id)
and when a user selects a product option(s), the page reloads with the new url, like so:
mysite.com/V2/productdetails.php?id=1142&options=991,994,996,854

The problem is that Google is counting/indexing each option combination as a new page, and so it sees duplicate title tags and description meta tags.

How do I configure my robots.txt file to allow the main product details page, but disallow the versions with options selected? Here's the catch: there are over 1500 product details pages, so "?id=####" ranges from ?id=0001 to ?id=1500, and adding 1500 lines to my robots.txt file is out of the question...

One solution we have is to change the robots meta tag to "noindex" when an option is selected, but I'd like to do it with the robots.txt file as well...

Thanks in advance!

3:53 pm on Feb 24, 2009 (gmt 0)

WebmasterWorld Administrator goodroi is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Google allows wildcards (aka pattern matching) in robots.txt. Wildcards are NOT officially part of the robots.txt protocol, but they are supported by the big three search engines.

Looking at your URLs, it seems that &options= appears in every URL you want blocked, and only in those. If that is the case, you can use a wildcard in your robots.txt to tell Google not to crawl any URL that contains &options=

User-agent: *
Disallow: /*&options=

[google.com...]
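
If you want to sanity-check the rule before you upload it, here is a rough Python sketch of how a wildcard-aware crawler might match it against your two example URLs. It is only an approximation: the helper names (rule_to_regex, is_blocked) are made up for this illustration, it assumes * means "any run of characters" and a trailing $ means "end of URL", and it ignores Allow lines and rule precedence entirely.

```python
import re


def rule_to_regex(rule):
    """Turn a robots.txt path pattern into a compiled regex.

    Assumption: '*' matches any run of characters and a trailing '$'
    anchors the pattern to the end of the URL.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape each literal piece, then join the pieces with ".*" for each '*'.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(path_and_query, disallow_rules):
    """Return True if any Disallow pattern matches from the start of the path."""
    return any(rule_to_regex(rule).match(path_and_query) for rule in disallow_rules)


# The single rule suggested above.
RULES = ["/*&options="]

plain = "/V2/productdetails.php?id=1142"
with_options = "/V2/productdetails.php?id=1142&options=991,994,996,854"

print(is_blocked(plain, RULES))         # False: the main product page stays crawlable
print(is_blocked(with_options, RULES))  # True: the option variant is blocked
```

Note that the pattern only catches URLs where options is not the first parameter. If your CMS ever produces ?options=... directly after the question mark, that URL would slip through this rule.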

 
