Now, let's try your example and let's be logical about it.
The URL at Google is this...
where 'xxxxxxxx' is the item ID
Using your example, searching link:www.amazon.com/exec/obidos/tg/detail/-/xxxxxxxx?v=glance comes up with what are supposedly backlinks for the URL.
In the results, #1 has Amazon links, but none of them points to 'xxxxxxxx'. How did it become a backlink for 'xxxxxxxx'?
#2-#4 are Amazon internal pages
The rest are Amazon affiliates with this kind of URL structure...
where 'affiliate-20' is a unique affiliate name. In other words, each link pointing to xxxxxxxx is a unique Amazon URL. It could be like...
all leading to the same page. If a spider happened to visit all 3 links above, and if the algorithm has a penalty for duplicate URLs, the above URLs would have caused a penalty for the product page 'xxxxxxxx'. Of course, as humans we recognize them as just affiliate links, with no need to penalize the page. But how can a spider recognize that? Programmatically, software could even flag it for spamming.
To add to the confusion, when somebody clicks on the above URL, the user is redirected to this URL...
where '111-111111-1111111' is either a session ID or a cookie. What have we got so far?
One product page = [multiple unique affiliate URLs] x [no. of users who actually clicked on those URLs or spiders that visited those URLs] = potentially hundreds of thousands, if not millions, of unique URLs.
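To put a rough number on that multiplication, here is a minimal sketch. The counts (500 affiliates, 1000 sessions) and the path shape with the tag and session ID appended as path segments are my own assumptions for illustration, not real Amazon figures.

```python
from itertools import product


def count_url_variants(affiliate_tags, session_ids, item_id="xxxxxxxx"):
    """Count the distinct URL strings that all lead to one product page."""
    # Hypothetical path shape: .../ASIN/<item>/<affiliate tag>/<session id>
    urls = {
        f"/exec/obidos/ASIN/{item_id}/{tag}/{sid}"
        for tag, sid in product(affiliate_tags, session_ids)
    }
    return len(urls)


# Made-up volumes, only to show how fast the combinations grow.
affiliate_tags = [f"affiliate{i}-20" for i in range(500)]
session_ids = [f"111-{i:07d}-1111111" for i in range(1000)]

total = count_url_variants(affiliate_tags, session_ids)
print(total)  # 500 * 1000 = 500000
```

Even with these modest made-up numbers, a single product page already has half a million distinct URLs.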
How did Google end up with just 1 URL then for product 'xxxxxxxx'?
Anybody can argue that Google simply removed the affiliate ID and the session/cookie ID. If that is the case, it implies Google has a special filter for Amazon, because any programmer knows that doing so requires hard coding. So now we have 101 factors in the algorithm ;)
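To show what I mean by hard coding, here is a rough sketch of what such a filter might look like. This is not how Google does it; the patterns and the path layout are assumptions I made up from the examples in this post, and they would only ever work for this one site.

```python
import re

# Site-specific guesses (assumptions, for illustration only):
# a session ID looks like 111-111111-1111111, an affiliate tag ends in -20.
SESSION_RE = re.compile(r"^\d{3}-\d{6,7}-\d{7}$")
AFFILIATE_RE = re.compile(r"^[A-Za-z0-9]+-20$")


def strip_ids(path):
    """Drop path segments that look like a session ID or an affiliate tag."""
    kept = [
        segment
        for segment in path.split("/")
        if not SESSION_RE.match(segment) and not AFFILIATE_RE.match(segment)
    ]
    return "/".join(kept)


cleaned = strip_ids("/exec/obidos/ASIN/xxxxxxxx/affiliate-20/111-111111-1111111")
print(cleaned)  # /exec/obidos/ASIN/xxxxxxxx
```

Notice that both patterns are baked in for one site's URL layout. Point the same function at another dynamic site and it does nothing useful, which is exactly why this would count as a special filter.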
But even that does not make any sense at all. Let's look again at the Google-Amazon URL.
compare it to an affiliate URL without all the IDs
.../tg/detail/-/ compared to
How did Google arrive at .../tg/detail/-/ when none of the affiliates have that URL structure? Makes you wonder, right?
Even if Google managed to reach the .../tg/detail/-/ part of Amazon, it still has to clean the Amazon URL of the reference ID, session ID and other parameters. Google would do that for Amazon? Wow! How about the rest of the small dynamic sites?
Not only that, Google then bends over backwards to find affiliate links that have .../ASIN/xxxxxxxx in the URL and credits them to the .../tg/detail/-/xxxxxxxx URL. Wow and double Wow. A lot of us here with dynamic URLs would be jumping with joy if Google would do that for us.
Imagine the manpower and the crunching power that Google employs just to clean Amazon URLs and find the backlinks for them.
Also, isn't the accepted fact here at WW that when we link to a URL, that 'exact' URL is the one listed in Google's index? Well, the Amazon scenario just broke that belief.
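Just to spell out what that crediting step would involve, here is a hypothetical sketch. The regex, the function names and the sample source pages are all my own inventions for illustration; I am not claiming this is Google's method.

```python
import re
from collections import defaultdict

# Assumption: the item ID is the segment right after /ASIN/ or /tg/detail/-/
ITEM_RE = re.compile(r"/(?:ASIN|tg/detail/-)/([^/?]+)")


def canonical_key(url):
    """Map either URL shape to one .../tg/detail/-/<item> key, or None."""
    match = ITEM_RE.search(url)
    return f"/exec/obidos/tg/detail/-/{match.group(1)}" if match else None


def credit_backlinks(links):
    """links: iterable of (source_page, target_url) pairs."""
    credited = defaultdict(set)
    for source, target in links:
        key = canonical_key(target)
        if key is not None:
            credited[key].add(source)
    return credited


links = [
    ("siteA", "www.amazon.com/exec/obidos/ASIN/xxxxxxxx/affiliate-20"),
    ("siteB", "www.amazon.com/exec/obidos/tg/detail/-/xxxxxxxx?v=glance"),
    ("siteC", "www.example.com/some/other/page"),
]
credited = credit_backlinks(links)
print(sorted(credited["/exec/obidos/tg/detail/-/xxxxxxxx"]))  # ['siteA', 'siteB']
```

Again, the mapping from .../ASIN/ to .../tg/detail/-/ has to be written by somebody who already knows that both shapes mean the same product on this particular site.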
So, how did Google do it? You tell me ;)