|Sort by Price, Size, etc.|
what to do about duplicate content?
I suppose I should have done this long ago, but it doesn't seem to be an issue for the larger ecom sites that use "sort by" pages which all have the same titles and descriptions, so I wasn't concerned about it. But then again, that's them and not me. I figure it's time to fix it, but which method should I use?
It's fairly simple to change the titles or meta tags with php. The question is, which one do I change? Is it better to have:
widgets-blue.html-price-ASC <title>Blue Widgets by Price</title>
widgets-blue.html-size-ASC <title>Blue Widgets by Size</title>
or, just add a robots="noindex" in the meta tags for anything other than widgets-blue.html?
The first method will obviously give me three times as many pages, but what good is that if they cause a duplicate content problem? But then again, it's not really duplicate content, or is it? I find it strange this hasn't been covered in here before.
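To make the two options concrete, here's a rough sketch of how both could look in php. The parameter name, the URL scheme, and the titles are all assumptions for illustration, not the poster's actual code:

```php
<?php
// Hypothetical sketch of the two options discussed above.
// Assumes the sort order arrives as a parameter called 'sort'
// ('price', 'size', or absent for the default page); a real site's
// parameter names and URL scheme will differ.

function head_tags_for(?string $sort): string {
    // Option 1: give each sort variation its own distinct title.
    $titles = ['price' => 'Blue Widgets by Price',
               'size'  => 'Blue Widgets by Size'];
    $title = $titles[$sort] ?? 'Buy Blue Widgets';

    // Option 2: noindex every variation other than the default page
    // (keeping 'follow' so links on the sorted pages are still crawled).
    $robots = ($sort === null)
        ? ''
        : "<meta name=\"robots\" content=\"noindex,follow\">\n";

    return "<title>$title</title>\n" . $robots;
}

echo head_tags_for($_GET['sort'] ?? null);
```

Note the two options aren't mutually exclusive: you could give each variation a distinct title *and* noindex the sorted ones, which is roughly what the sketch above does.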
What do you see that makes you feel you have a problem that needs fixing?
Sorry Tedster, maybe I'm not explaining this well. If you have a dynamic page where the user can sort products by price, size and title, you basically end up with four different URLs which are all slight variations of the same page and all share the same title and description, i.e. duplicate content:
www.blue-widgets.html <title>Buy Blue Widgets</title>
www.blue-widgets-price.html <title>Buy Blue Widgets</title>
www.blue-widgets-size.html <title>Buy Blue Widgets</title>
www.blue-widgets-style.html <title>Buy Blue Widgets</title>
The question is, is it better to change the titles and descriptions for each, or to just put in a noindex tag for anything other than blue-widgets.html?
I'd put in the noindex. Only changing the titles doesn't fix the duplicate content issue.
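For anyone following along, the tag being discussed is a one-liner in the <head> of each sorted page (the "follow" value is optional; it tells the crawler to still follow links on the page even though the page itself stays out of the index):

```html
<meta name="robots" content="noindex,follow">
```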
If you have a page with 100 items and 10 words each you have 1,000 words. Now if you sort it a different way then you have 1,000 of the same words, but in a completely different order. I would think G would be hard pressed to view that as duplicate content.
I think I was the one who wasn't clear enough.
For a site that is just now instituting sort pages, I agree that restricting which urls get indexed can be wise, especially if the site doesn't have really strong PR. However, your site already has the sort pages in place, correct? I've seen lots of people try to 'fix' something because of something they read, and they end up hurting their business.
So I was wondering if you see some signs that your sort pages are in some way problematic. In other words, Google may already be handling your urls quite nicely for you. If so, you might cause a problem when you're just trying to make something better.
On the other hand, if there are symptoms of trouble -- then yes, I agree that a simple title change is probably not enough to fix things and you will want to exclude the sort pages from being indexed to see if that helps.
[edited by: tedster at 11:24 pm (utc) on Dec. 20, 2006]
|If you have a page with 100 items and 10 words each you have 1,000 words. Now if you sort it a different way then you have 1,000 of the same words, but in a completely different order. I would think G would be hard pressed to view that as duplicate content. |
Maybe if all 1,000 words were in a completely different order. But you wouldn't have that - you'd have 100 groups of 10 words in a different order. It's not that difficult to match 10 words.
|Maybe if all 1000 words in a completely different order. But you wouldn't have that- you'd have 100 groups of 10 words in a different order. It's not that difficult to match 10 words. |
If G is going to start penalizing two pages for having 10 words in the same order then there aren't going to be very many pages in its index ;)
I do see what you're saying though.
[edited by: ALbino at 11:29 pm (utc) on Dec. 20, 2006]