When looking at the "Content Drilldown" section, I see one subdirectory listed two ways:
/imported_widgets/ and /Imported_Widgets/
I am on a Windows server, so any URLs containing either the capitalized or the all-lowercase spelling resolve to the exact same page.
When I first started this website I used the "/Imported_Widgets/" format. Later on (about 7 years ago) I switched to the all-lowercase form. I have to assume (as best as I can determine from Analytics) that the current incoming requests for the mis-capitalized pages are due to links on other websites created before I changed to all lowercase.
My question is simply this. Since I am on a Windows server, both capitalized and uncapitalized page requests get served the exact same page, so I don't have any "page not found" problems. But is there some way that Googlebot evaluates inbound links from other sites like this and, as a result, decides it has found some sort of duplicate content issue (two differently capitalized URLs with the same content)?
I was surprised to see that Google Analytics saw these as different pages, and it got me wondering if the Google SERPs did too.
[edited by: tedster at 11:46 pm (utc) on Aug. 16, 2009]
[edit reason] switch to "Widgets" instead of real products [/edit]
Although Google attempts to work out when to combine such variations, this is a tough job and they often don't get it right. Windows servers are a bear for dealing with this kind of canonical problem because Microsoft has refused to go along with the web standard that capitalization in a URL path actually does matter. If you use a third-party module for IIS, such as ISAPI Rewrite, you have functionality that approaches an Apache server - and that's a lot more satisfactory.
I'd suggest at the very least using the canonical tag on these pages, with all lowercase formatting.
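To make that concrete, here is a rough sketch in Python of building such a tag from whatever URL was requested. The function name and the example.com URL are made up for illustration, and it assumes your site is happy to treat the all-lowercase path as the one true version. Adapt the idea to whatever templating your server actually uses.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link_tag(requested_url: str) -> str:
    """Build a rel="canonical" tag that points at the lowercase form of the URL.

    Hypothetical helper: assumes the lowercase path is the preferred version.
    """
    parts = urlsplit(requested_url)
    # Scheme and host are case-insensitive anyway; the path is what matters here.
    # The query string is left untouched because its values may be case-sensitive.
    canonical = urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path.lower(), parts.query, "")
    )
    return f'<link rel="canonical" href="{escape(canonical, quote=True)}">'


print(canonical_link_tag("http://www.example.com/Imported_Widgets/blue.html"))
```

The tag goes in the head of the page, so both capitalizations of the URL announce the same lowercase address.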
I have always said that you need to enforce your URLs. Every page should be aware of itself and make sure it is only called one way. The new rel="canonical" is supposed to take care of this. Every time people talk about this they mention extra parameters or bad URL rewriting, but I have never heard anybody talk about how it affects capitalization.
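One way a page could "be aware of itself" is to compare the URL it was called with against the form it expects and answer with a 301 redirect when they differ. The sketch below is only an illustration of that check in Python. The function name is invented, it assumes lowercase is the enforced form, and wiring it into IIS or your page framework is left out.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def redirect_target(requested_url: str) -> Optional[str]:
    """Return the URL to 301-redirect to, or None if the request is already in the enforced form.

    Hypothetical helper: assumes the enforced form is the all-lowercase path.
    """
    parts = urlsplit(requested_url)
    lowered_path = parts.path.lower()
    if parts.path == lowered_path:
        return None  # already called the one allowed way, serve the page normally
    # Only the path is changed; query string and fragment are passed through as-is.
    return urlunsplit(
        (parts.scheme, parts.netloc, lowered_path, parts.query, parts.fragment)
    )
```

A page would call this at the top of each request and, if it gets a URL back, send a permanent redirect there instead of rendering.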
The canonical tag is the second-best option. It will fix entries in some search engines' databases, but it will not stop users from continuing to see incorrect URLs in the browser URL bar when they follow 'incorrect' links to the site.