trailing slashes and duplicate content

is it necessary to redirect? or does google treat both as the same page?

         

mcglynn

7:44 pm on Sep 21, 2007 (gmt 0)

10+ Year Member



I believe Apache's default behavior for a directory is to redirect from the bare directory name to the same name with a trailing slash appended, e.g. from:
example.com/directory
to
example.com/directory/
via 301.

Many of the pages on my site are scripts that parse $PATH_INFO for arguments, as this allows us to avoid GET-method args like ?a=1&b=2. So, for example, the URL to a page about blue shirts might be:
example.com/shirts/mens/blue

In this architecture, 'shirts' is a script, and it uses the 'mens/blue' information to figure out what content to display. So far, so good.
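To make the architecture concrete, here is a minimal sketch in Python of how such a script might split PATH_INFO into arguments. The function name parse_path_info is hypothetical, and the original scripts may well be written in another language; this only illustrates the idea.

```python
import os


def parse_path_info(path_info):
    """Split a PATH_INFO string such as '/mens/blue' into its segments."""
    # Empty segments (from a leading, trailing, or doubled slash) are dropped.
    return [segment for segment in path_info.split("/") if segment]


# In a CGI script the web server supplies PATH_INFO in the environment.
segments = parse_path_info(os.environ.get("PATH_INFO", ""))
```

Note that with this kind of parsing an empty PATH_INFO and a PATH_INFO of "/" both yield an empty list, which is how 'shirts' and 'shirts/' can end up serving the same default page.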

The question is whether a request for 'shirts' (no slash) should redirect to 'shirts/'? If 'shirts' were a directory on the filesystem, Apache would do this, but in this case it's a script that returns a 200 OK header and (duplicate) content for both requests.
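One way to avoid serving the same content at both URLs is to have the script itself issue the redirect. The sketch below assumes, purely for illustration, that the slash-less form is treated as canonical; the function name canonical_redirect is hypothetical, and query strings are ignored.

```python
def canonical_redirect(script_name, path_info):
    """Return a (status, location) pair if a redirect is needed, else None.

    Assumption: URLs without a trailing slash are the canonical form.
    """
    if not path_info.endswith("/"):
        # Already canonical (this includes an empty PATH_INFO).
        return None
    # Strip the trailing slash(es) and point permanently at the canonical URL.
    return ("301 Moved Permanently", script_name + path_info.rstrip("/"))


print(canonical_redirect("/shirts", "/"))           # redirect to /shirts
print(canonical_redirect("/shirts", "/mens/blue"))  # None: serve the page
```

If the slash-terminated form were chosen as canonical instead, the test would simply be inverted. Either way, the script answers one form with a 200 and the other with a 301, rather than a 200 for both.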

Matt Cutts has said that "Search engines can do things like keeping or removing trailing slashes" (see his blog entry SEO advice: url canonicalization [mattcutts.com]) but it's not clear to me whether Google would consider my two URLs as distinct:
example.com/shirts
example.com/shirts/

In another case, we use the same architecture, but one form of the URL uses a 302 to redirect to the other.

Neither of these scripts ranks well. I'm now wondering whether the cause is that Google has applied a duplicate-content penalty.

[edited by: tedster at 10:31 pm (utc) on Sep. 21, 2007]
[edit reason] add link for accreditation [/edit]

tedster

11:37 pm on Sep 21, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It's been quite a while since I've seen a serious ranking issue on Google caused by the trailing slash; it's nothing like it used to be. But in the case you described, you might well be suspicious and take some insurance steps.

One thing does confuse me, just a bit: why would you expect a URL for the script without any parameters to rank well? Does the script still generate a default page that's a kind of high-level summary of the category?

mcglynn

11:47 pm on Sep 21, 2007 (gmt 0)

10+ Year Member



Does the script still generate a default page that's a kind of high level summary of the category?

Yes, exactly.

Patrick Taylor

6:12 pm on Sep 22, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



example.com/shirts
example.com/shirts/

They're surely not the same URL. For consistency I would redirect the second to the first. I know it's common for a URL to end with a trailing slash, but to the user the version without the slash makes more sense.

jdMorgan

6:51 pm on Sep 22, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



While considering which to choose, this may help. According to the HTTP specs,

example.com/shirts refers to a page named "shirts"
example.com/shirts/ refers to the index of a directory named "shirts"

If you find either of those descriptions compelling in the context of your site's architecture, then it's a good idea to pick the corresponding format.

The choice you make can also affect how the client (browser) resolves relative links on your pages. For example, a page-relative link on the example.com/shirts page will resolve to the named object in your Web root directory at example.com/<object>, while a page-relative link on the example.com/shirts/ index page will resolve to the named object at example.com/shirts/<object>.
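The difference in relative-link resolution can be checked with Python's standard urllib.parse.urljoin, which resolves a relative reference against a base URL much as a browser does. The link target blue.html is a made-up name used only for illustration.

```python
from urllib.parse import urljoin

link = "blue.html"  # hypothetical page-relative link

# Base URL without a trailing slash: "shirts" is treated as the page itself,
# so the link resolves against the Web root.
without_slash = urljoin("http://example.com/shirts", link)

# Base URL with a trailing slash: "shirts/" is treated as a directory,
# so the link resolves inside it.
with_slash = urljoin("http://example.com/shirts/", link)

print(without_slash)  # http://example.com/blue.html
print(with_slash)     # http://example.com/shirts/blue.html
```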

So this is far from a purely cosmetic decision.

Jim