I have a store where I can get to the same product via two methods - by design or by product type - and I end up with the same content on two different pages. The only difference is in the URL - the directory name. I am concerned that Google will penalize me for duplicate content, so I thought about disallowing one of the two directories in my robots.txt file. I'm not sure how to do this without some help.
My store is located in a sub-domain:
http:// subdomain.domain.com
The sub-domain is physically located off my domain as a folder. The sub-domain points to this folder.
I only have one CGI bin off the domain but it is linked to the sub-domain.
The URLs are as follows:
http:// subdomain.domain.com/cgi-bin/store/shop.cgi/product_type/product1
http:// subdomain.domain.com/cgi-bin/store/shop.cgi/product_design/product1
To ban one directory, do I put the following in my robots.txt file?
User-agent: *
Disallow: /cgi-bin/store/shop.cgi/product_type/
Will this disallow the whole CGI directory?
Also, when I set up the HTML pages in the sub-domain sub-directory I placed a robots.txt file in there as well. I assume this is the one I would add this to.
Thanks for any help.
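For anyone wanting to test a rule like the one above before uploading it, here is a minimal sketch using Python's standard `urllib.robotparser`. The host name is the placeholder from the post, and `robots_url` and `is_allowed` are hypothetical helper names made up for this illustration. It assumes the crawler follows the usual prefix-matching convention and looks for robots.txt at the root of each host name, which Python's parser approximates; real crawlers may differ in details.

```python
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser

# The rule exactly as posted above.
RULES = """\
User-agent: *
Disallow: /cgi-bin/store/shop.cgi/product_type/
"""

# Placeholder host name taken from the post, not a real site.
BASE = "http://subdomain.domain.com"


def robots_url(page_url):
    """Return the robots.txt location a crawler would request for this page's host."""
    return urljoin(page_url, "/robots.txt")


def is_allowed(path, rules=RULES, agent="Googlebot"):
    """Return True if `agent` may fetch BASE + path under the given rules."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, BASE + path)


if __name__ == "__main__":
    paths = [
        "/cgi-bin/store/shop.cgi/product_type/product1",
        "/cgi-bin/store/shop.cgi/product_design/product1",
        "/cgi-bin/store/shop.cgi",
    ]
    print("robots.txt expected at:", robots_url(BASE + paths[0]))
    for path in paths:
        status = "allowed" if is_allowed(path) else "blocked"
        print(status, path)
```

Under these assumptions, only URLs whose path starts with `/cgi-bin/store/shop.cgi/product_type/` come back as blocked. The `product_design` URLs and the rest of the CGI directory stay crawlable, because a `Disallow` line matches by path prefix rather than by whole directory tree above it.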
There is no such thing as a penalty for duplicate content.
If there were, why are so many people aggregating RSS feeds? What would be the purpose of using RSS feeds if it was going to damage your SERP rankings?
What about blog entries? A blog entry made today and posted in 4 categories in your blog is now duplicated several times: on the index, on the permalink, in the monthly archive, in each of the categories you selected, and in the feed.
Does amazon.com get penalized for duplicate content when the same book comes up under author, publisher, and title? No.