phranque - 1:54 am on Jun 18, 2011 (gmt 0) [edited by: phranque at 2:27 am (utc) on Jun 18, 2011]
I don't recall ever having had duplicate content problems with non-HTML doctypes.
all it takes is an IIS server configured to be case-insensitive and you now have (2**N - 1) non-canonical urls where N is the number of alphabetic characters in the url path and file name/extension.
it's a very common problem - here's an example from a .gov site:
that's 131,071 non-canonical urls.
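to make the arithmetic concrete, here's a rough python sketch of the (2**N - 1) count. the path "/about/contacts.aspx" is a made-up stand-in with 17 letters (not the .gov url), and the function names are just for illustration. it assumes the server treats every ascii letter in the path as case-insensitive.

```python
from itertools import product


def count_noncanonical(path: str) -> int:
    """Number of case variants of `path` other than the canonical spelling."""
    # Each ASCII letter can be upper or lower case: 2**N spellings in total,
    # and exactly one of those is the canonical url.
    n = sum(1 for ch in path if ch.isascii() and ch.isalpha())
    return 2 ** n - 1


def case_variants(path: str):
    """Yield every upper/lower-case spelling of `path`, canonical one included."""
    choices = [
        (ch.lower(), ch.upper()) if ch.isascii() and ch.isalpha() else (ch,)
        for ch in path
    ]
    for combo in product(*choices):
        yield "".join(combo)


# hypothetical path with 17 alphabetic characters
print(count_noncanonical("/about/contacts.aspx"))

# a tiny path so the full list stays readable
print(list(case_variants("/a1b")))
```

digits, slashes and dots don't add variants, which is why only the alphabetic characters count toward N.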
added: the google blog post that announces this (linked above) gives an example of a pdf document that is a duplicate of an html document where the html doc is the canonical url.