Many of the pages on my site are scripts that parse $PATH_INFO for arguments, which lets us avoid GET-method arguments like ?a=1&b=2. So, for example, the URL of a page about blue shirts might be:
example.com/shirts/mens/blue
In this architecture, 'shirts' is a script, and it uses the 'mens/blue' information to figure out what content to display. So far, so good.
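To make the setup concrete, here is a minimal sketch of how such a script might split PATH_INFO into arguments. It is not the site's actual code: the function names and the 'category' and 'color' argument names are hypothetical, and it assumes the server passes PATH_INFO in a CGI/WSGI-style environment mapping.

```python
def parse_path_info(path_info):
    """Split a PATH_INFO string such as '/mens/blue' into its segments."""
    # Drop empty segments so '', '/', and '/mens/blue/' are all handled.
    return [segment for segment in path_info.split("/") if segment]


def describe_request(environ):
    """Map the segments to hypothetical 'category' and 'color' arguments."""
    segments = parse_path_info(environ.get("PATH_INFO", ""))
    category = segments[0] if len(segments) > 0 else None
    color = segments[1] if len(segments) > 1 else None
    return {"category": category, "color": color}
```

With this sketch, a request for example.com/shirts/mens/blue would yield 'mens' and 'blue', while requests for 'shirts' and 'shirts/' both yield no arguments at all, which is why they end up with the same content.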
The question is whether a request for 'shirts' (no slash) should redirect to 'shirts/'. If 'shirts' were a directory on the filesystem, Apache would do this, but in this case it's a script that returns a 200 OK header and the same (duplicate) content for both requests.
Matt Cutts has said that "Search engines can do things like keeping or removing trailing slashes" (see his blog entry SEO advice: url canonicalization [mattcutts.com]) but it's not clear to me whether Google would consider my two URLs as distinct:
example.com/shirts
example.com/shirts/
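If the two URLs were to be collapsed into one, the script itself could issue the redirect. The sketch below is only an illustration under stated assumptions: it treats the trailing-slash form as the preferred one (the opposite choice would work the same way), it uses a permanent 301 redirect, and the function name is hypothetical.

```python
def canonical_response(script_name, path_info):
    """Return (status, headers) for a request to the script.

    script_name: the part of the URL naming the script, e.g. '/shirts'
    path_info:   whatever follows it, e.g. '', '/', or '/mens/blue'
    """
    if path_info == "":
        # Bare '/shirts': redirect permanently to the slash form
        # (assumed here to be the preferred URL).
        return "301 Moved Permanently", [("Location", script_name + "/")]
    # '/shirts/' and deeper paths are served normally.
    return "200 OK", [("Content-Type", "text/html")]
```

With this in place, only one of the two URLs would ever return a 200 OK with content.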
In another case, we use the same architecture, but one form of the URL uses a 302 to redirect to the other.
Neither of these scripts ranks well. I'm now wondering whether the cause is that Google has applied a duplicate-content penalty.
[edited by: tedster at 10:31 pm (utc) on Sep. 21, 2007]
[edit reason] add link for accreditation [/edit]
One thing does confuse me, just a bit: why would you expect a URL for the script without any parameters to rank well? Does the script still generate a default page that's a kind of high-level summary of the category?
example.com/shirts refers to a page named "shirts"
example.com/shirts/ refers to the index of a directory named "shirts"
If you find either of those descriptions compelling in the context of your site's architecture, then it's a good idea to pick the corresponding format.
The choice you make can also affect how the client (browser) resolves relative links on your pages. For example, a page-relative link on the example.com/shirts page will resolve to the named object in your Web root directory at example.com/<object>, while a page-relative link on the example.com/shirts/ index page will resolve to the named object at example.com/shirts/<object>.
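The difference in link resolution can be checked with Python's standard urllib.parse.urljoin, which follows the same resolution rules a browser uses. This is just a quick sketch; the link name sizes.html and the helper name are made up for illustration.

```python
from urllib.parse import urljoin


def resolve(page_url, relative_link):
    """Resolve a page-relative link against the URL of the page it is on."""
    return urljoin(page_url, relative_link)


# Without the trailing slash, 'shirts' is treated as a page in the root.
print(resolve("http://example.com/shirts", "sizes.html"))
# http://example.com/sizes.html

# With the trailing slash, 'shirts/' is treated as a directory.
print(resolve("http://example.com/shirts/", "sizes.html"))
# http://example.com/shirts/sizes.html
```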
So this is far from a purely cosmetic decision.
Jim