Forum Moderators: Robert Charlton & goodroi


Avoiding Duplicate Content when there are Currency Choices

fishfinger

3:28 pm on Nov 22, 2006 (gmt 0)

10+ Year Member



I've also posted this in the php section.

I'm optimising a shopping cart with currency options. It's been programmed so that if someone switches currency to dollars, the page is reloaded with ?currency=USD on the end, and with ?currency=GBP when they switch back.

This will affect every page in the site.

As the links that do this use images, I can put a window.open JavaScript call inside the img tag so that it is non-spiderable.

But is there a better way to do this?

whoisgregg

3:53 pm on Nov 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member




It would be better to store their currency preference in either the session or a cookie.


tedster

4:06 pm on Nov 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I would also use robots.txt to disallow all but one version of the page. With both Google and Yahoo now allowing wildcards in the robots.txt this should be easy to do.
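For example, a wildcard rule along these lines would do it (a sketch: both Googlebot and Yahoo's Slurp honour the * wildcard in Disallow patterns, though it is not part of the original robots.txt standard, so other crawlers may ignore it; this assumes the currency parameter always appears immediately after the ?):

```
User-agent: Googlebot
Disallow: /*?currency=

User-agent: Slurp
Disallow: /*?currency=
```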

helpnow

6:10 pm on Nov 22, 2006 (gmt 0)

10+ Year Member



Yeah, you need to disallow one variation.

In fact, disallow both variations, so that you never have this problem in the future should you forget about it and decide to add another currency...

So, simply disallow any URL containing the ?currency= parameter.

And keep going with this idea: make sure there are no _other_ parameters in the URL that may also lead to duplicate content. Get all parameters out of the URL except for the ones that point to the product or category. Get everything else out, either by hiding them in a cookie, or by rewriting the URLs on the fly in httpd.conf so they are omitted from the URL when it is a search engine robot browsing your site...

I speak from experience - this is exactly the sort of thing that killed us for months. Thank god you realize this now, before you implement it site-wide... ; )

g1smd

6:40 pm on Nov 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Yes. This sort of duplicate content is a big problem. Get it fixed before it becomes an issue.

I know of a site that shows no prices to bots, and no prices when you first arrive at the site.

You only get to see prices after you select which ISO 4217 currency you actually want to see them in.

A side benefit of this is that the Google and Yahoo caches never show out-of-date pricing: they never show any pricing at all.

whoisgregg

6:43 pm on Nov 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Since this is now in the Google Search forum, it's appropriate to go a little more in-depth on the reasoning behind taking this out of the URL. :)

Let's imagine that you have all the different currencies (we'll call them generically "user preferences") in the URL string. So, one page will have these duplicate versions:

/productXYZ.php?currency=USD
/productXYZ.php?currency=GBP
/productXYZ.php?currency=JPY
...

We expect (correctly) that only one of these versions will rank well because the others will be flagged as duplicate (since the only difference in these pages is currency). However, another significant problem is that incoming links to these different pages will not count toward the link popularity of the single page that ends up appearing.

So, let's say you have 100 links to USD, 50 links to GBP, 20 to JPY, etc... You could have had the link power of 170+ links all pointing to just the one true product page!

If you exclude the alternate versions with robots.txt you are actually losing the benefit of links pointing to those alternate versions.

The only time you should have different currency versions specified in the URL is when you are also providing unique content for that market (like content translated into that country's language, or unique shipping details).

If you want to give people the ability to link to a particular currency version, you can give them the currency=USD query string... BUT have the page use that request to set the currency cookie/session parameter then do a 301 redirect to the page without the query string.

helpnow

8:20 pm on Nov 22, 2006 (gmt 0)

10+ Year Member



I concur - forget robots.txt as a solution - it doesn't really 'fix' the problem, because it introduces other issues.

Just use htaccess and / or http.conf where applicable, and be done with it! ; )
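A minimal .htaccess sketch of that idea, assuming Apache with mod_rewrite enabled (www.example.com style setup). Note that a server-level redirect like this discards the currency choice entirely - if you need to store the preference in a session or cookie first, the redirect has to happen in application code instead:

```
# Sketch: 301 any request carrying a currency parameter to the clean URL.
RewriteEngine On
RewriteCond %{QUERY_STRING} (^|&)currency=[A-Z]{3}(&|$)
# The trailing "?" discards the entire query string (including any
# other parameters) - adjust if you need to preserve some of them.
RewriteRule ^(.*)$ /$1? [R=301,L]
```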

(We avoid the use of cookies because many (read: lots and lots of) people have cookies turned off, and you will lose them if you depend on cookies for anything important.)

tedster

9:14 pm on Nov 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



That's a better solution to the challenge. The robots.txt approach I proposed does have some holes in it.

fishfinger

9:40 am on Nov 23, 2006 (gmt 0)

10+ Year Member



Thanks for all the replies, guys. I don't know if there will be many links coming directly into the product pages themselves, but as the switch-currency control is on every page of the site, this could well be an issue on the home page and other top-level pages.

I'm not a programmer, but I think it will be less work to write a generic rule to block currency-parameter URLs for lower-level pages and product pages (of which there are a few thousand), and then use 301s for the top-level ones (there are only a few of these).

Can anyone point me to a good tutorial on blocking access based on the end of a URL?

helpnow

2:10 pm on Nov 23, 2006 (gmt 0)

10+ Year Member



In addition, I think it is important to hide the switch-currency link on every page from robots, so they just don't see it and don't try to resolve the URL. Same as with session IDs: hide them so they aren't seen, otherwise robots will try to resolve them and basically crash your server trying to crawl a gazillion "different" URLs.

g1smd

9:05 pm on Nov 23, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Yes, try to hide the links from bots in some way so they are not followed; but in case they are followed anyway, make sure the target page explicitly says noindex too.

whoisgregg

10:53 pm on Nov 23, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If you have an include file that is part of every page of your site already, programming it for every page is a snap. Pseudocode below:

session_start(); // make sure the code below runs after session_start()
if ( isset($_GET['currency']) ) {
    // Only redirect when the preference is new or has changed. If we
    // redirected every time, users could never reach the ?currency=
    // URL in order to share it.
    if ( !isset($_SESSION['currency']) || $_SESSION['currency'] != $_GET['currency'] ) {
        $_SESSION['currency'] = $_GET['currency']; // set the session variable
        $requestUri_exploded = explode("?", $_SERVER['REQUEST_URI']); // get rid of the query string
        header("HTTP/1.1 301 Moved Permanently"); // 301
        header("Location: http://www.example.com" . $requestUri_exploded[0]); // new URL, sans query string
        die;
    }
}

To make this code complete, you'd want to check for HTTPS and take into account any other query-string variables that should be appended to the redirect URL. But otherwise, this is basically it.

Patrick Taylor

11:17 pm on Nov 23, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



In cases like this, with parameters in the URLs, I've used (in the document head):

$parameter = isset($_GET['parameter']) ? $_GET['parameter'] : ''; // guard against an undefined index notice

if (strlen($parameter) > 0) {
    $robots = '<meta name="robots" content="noindex" />';
} else {
    $robots = '';
}

echo $robots;


g1smd

11:17 pm on Nov 23, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Good post.

fishfinger

9:31 am on Nov 24, 2006 (gmt 0)

10+ Year Member



Thanks very much for all the replies!