Duplicate Content due to dynamic URLs... penalized?

         

maha

11:37 pm on Jul 13, 2015 (gmt 0)

10+ Year Member



If my ecommerce site has these 3 pages due to the dynamic URLs, they're obviously duplicate pages/content. However, only "http://mysite.com/product1508/product_info.html" is in the Google index. The other 2 dynamic URLs are not indexed.

http://example.com/product1508/product_info.html

http://example.com/product1508/product_info.html?osCsid=4e9c98d834c6174bc9374f7fd8c86b03

http://example.com/product1508/product_info.html?osCsid=89fb10dc2a5ab205b47dc2c3cae4ecaf
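
To make the duplication concrete, here is a minimal Python sketch that groups URLs by what is left after the session parameter is removed. It assumes `osCsid` is the only parameter that does not change the page content; the function names are made up for illustration and are not part of osCommerce.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: osCsid is a session ID only and never changes what the page shows.
SESSION_PARAMS = {"osCsid"}


def strip_session_params(url):
    """Return the URL with session-only query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in SESSION_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


def group_duplicates(urls):
    """Map each stripped URL to the list of variants that collapse onto it."""
    groups = defaultdict(list)
    for url in urls:
        groups[strip_session_params(url)].append(url)
    return dict(groups)


urls = [
    "http://example.com/product1508/product_info.html",
    "http://example.com/product1508/product_info.html?osCsid=4e9c98d834c6174bc9374f7fd8c86b03",
    "http://example.com/product1508/product_info.html?osCsid=89fb10dc2a5ab205b47dc2c3cae4ecaf",
]

groups = group_duplicates(urls)
for stripped, variants in groups.items():
    print(stripped, "has", len(variants), "URL variant(s)")
```

All three addresses end up in a single group, which is the sense in which they are duplicates of one page.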


In fact, none of my dynamic URLs are indexed by Google at all, only the static pages.

My questions:

1) Am I penalized for having 3+ identical pages due to the dynamic URLs, even though they're not indexed?
According to Google, it knows how to handle these dynamic URLs: [support.google.com...]

2) Can a site have a duplicate content penalty if the content/pages are not indexed?

Thanks guys..

[edited by: Robert_Charlton at 5:48 am (utc) on Jul 14, 2015]
[edit reason] Changed example domain to example.com to disable autolinking [/edit]

aakk9999

12:50 pm on Jul 14, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



1) No, you have not been penalised. Google has picked up one of these URLs, recognised that the other two are duplicates, and filtered them out. Google is getting better and better at identifying such duplicate URLs.

2) There is no duplicate content penalty. But you could have ranking disadvantages, especially if you internally link to different versions of a URL (e.g. splitting page link juice across more than one URL, or Google needing to crawl more URLs from your site and therefore using crawl budget on less important pages, and so on).

It has also happened that a sudden huge influx of duplicate URLs tanks a site. It could be because it alters the relative importance of pages within the site, because Google loses trust in the site, or for some other reason.

Do you use rel=canonical on your pages? If you do, do all three URLs display the same canonical (the one without the query string parameter)? If you use canonical and it is set up properly, then in your case this is all that you need.
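
One way to check this is to pull the canonical out of each variant's HTML and compare. The sketch below is only an illustration: it works on HTML strings you have already fetched, the sample markup is invented, and `CanonicalFinder` and `find_canonical` are hypothetical names.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> tag in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonical(html):
    """Return the first canonical URL declared in the HTML, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals[0] if finder.canonicals else None


# Invented sample markup: every variant declares the parameter-free URL.
PAGE = (
    "<html><head>"
    '<link rel="canonical" href="http://example.com/product1508/product_info.html">'
    "</head><body></body></html>"
)

pages = {
    "http://example.com/product1508/product_info.html": PAGE,
    "http://example.com/product1508/product_info.html?osCsid=4e9c98d834c6174bc9374f7fd8c86b03": PAGE,
    "http://example.com/product1508/product_info.html?osCsid=89fb10dc2a5ab205b47dc2c3cae4ecaf": PAGE,
}

found = {url: find_canonical(html) for url, html in pages.items()}
consistent = len(set(found.values())) == 1 and None not in found.values()
print("All variants agree on one canonical:", consistent)
```

If any variant reports a different canonical, or none at all, that is the page template to look at first.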

netmeg

12:51 pm on Jul 14, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Am I penalized for having 3+ identical pages due to the dynamic URLs, even though they're not indexed?


Not in my experience. I assume you are blocking the Session ID URLs from being crawled in robots.txt (or have told Google in Search Console what the URL parameters are). Are you sure those are the *only* URL parameters you are dealing with? Any sorting, pagination or other URL parameters as well? (That looks like osCommerce.)
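
A quick way to answer the "only parameters" question is to tally the parameter names across a list of URLs from a crawl or server log. This is a rough sketch; the `index.php` URLs with `sort` and `page` are hypothetical examples, not taken from the site in question.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit


def count_parameters(urls):
    """Count how many URLs use each query parameter name."""
    counts = Counter()
    for url in urls:
        query = urlsplit(url).query
        # A set, so a name repeated within one URL is counted once.
        names = {name for name, _ in parse_qsl(query, keep_blank_values=True)}
        counts.update(names)
    return counts


crawled = [
    "http://example.com/product1508/product_info.html",
    "http://example.com/product1508/product_info.html?osCsid=4e9c98d834c6174bc9374f7fd8c86b03",
    # Hypothetical listing URLs with sorting and pagination parameters:
    "http://example.com/index.php?sort=2a&page=3",
    "http://example.com/index.php?page=2&osCsid=89fb10dc2a5ab205b47dc2c3cae4ecaf",
]

counts = count_parameters(crawled)
for name, total in counts.most_common():
    print(name, total)
```

Any name in the output other than the session ID is a parameter that needs its own decision.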

maha

1:19 pm on Jul 14, 2015 (gmt 0)

10+ Year Member



This is osCommerce.

Session IDs are not blocked in robots.txt.

"rel=canonical" is not so simple with osCommerce.

netmeg

4:45 pm on Jul 14, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You have to use an add-on (in Magento, there are extensions) but it can be done. I'd definitely look into that.

maha

5:09 pm on Jul 14, 2015 (gmt 0)

10+ Year Member



It doesn't seem like it's a big issue from you guys' replies. It looks like Google knows how to handle these duplicate URLs. I'd probably just leave it alone; I may do more harm by fooling with the script.. ;-)

Just to make sure I understand it correctly: none of my dynamic "?" pages are in the Google index, and I don't have any internal links to these dynamic URLs, so I should be okay? No penalties I need to worry about?

Johan007

5:16 pm on Jul 14, 2015 (gmt 0)

10+ Year Member Top Contributors Of The Month



Personally I would not take the chance if you are already hit by Panda.

maha

5:25 pm on Jul 14, 2015 (gmt 0)

10+ Year Member



I have not been hit by Panda; somebody just pointed out the extra URLs.

netmeg

12:33 pm on Jul 15, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Personally I'm of the opinion (particularly with ecommerce sites) that it's just as important what you keep out of Google as what you allow in. Plus I don't trust Google to get it right. So as a matter of course, I block crawl of URL parameters in my robots.txt; I use the URL Parameter section in Search Console (GWT) to tell Google what to do in case they do encounter a parameter, and I NOINDEX any thin or no-content pages. For example, all shopping cart and account pages are NOINDEXed, because unless you're logged in, those pages are blank. No reason for them to be in Google. Things like tag pages are blocked. And so on.
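
As an illustration of the robots.txt part, the sketch below tests URLs against a wildcard Disallow pattern of the kind Google documents (`*` matches any run of characters, a trailing `$` anchors the end). The rule `/*osCsid=` is an assumed example rather than a copy of any real robots.txt, and the matcher is deliberately simplified: it ignores Allow lines, user-agent groups and rule precedence.

```python
import re
from urllib.parse import urlsplit

# Assumed example rule, equivalent to this robots.txt fragment:
#   User-agent: *
#   Disallow: /*osCsid=
DISALLOW_PATTERNS = ["/*osCsid="]


def rule_to_regex(pattern):
    """Translate a robots.txt path pattern with * and a trailing $ into a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(url, patterns=DISALLOW_PATTERNS):
    """Return True if the URL's path plus query matches any Disallow pattern."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # re.match anchors at the start, like a robots.txt path prefix.
    return any(rule_to_regex(p).match(target) for p in patterns)


clean = "http://example.com/product1508/product_info.html"
session = clean + "?osCsid=4e9c98d834c6174bc9374f7fd8c86b03"

print(clean, "blocked:", is_disallowed(clean))
print(session, "blocked:", is_disallowed(session))
```

With this rule the clean product URL stays crawlable while any URL carrying a session ID is blocked.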