Use the rel="canonical" link element (it goes in the head like a meta tag, but it is a link element).
If you have 2 pages with identical content, put this on both pages, with the href pointing at the correct page (do not forget to include the http://):
<link rel="canonical" href="http://mywidgets.com/product-name-widget-1" />
In this way you tell the search engine that the correct version is the one without "/" at the end.
google still calls the link rel canonical a "suggestion".
|Adding this link and attribute lets site owners identify sets of identical content and suggest to Google: "Of all these pages with identical content, this page is the most useful. Please prioritize it in search results." |
the most robust solution is to redirect the non-canonical version of the url to the canonical version of the url using a 301 status code.
you simply don't serve a 200 OK and content to requests for non-canonical urls.
this solution doesn't require any guesswork or unnecessary processing by google.
If you want readers to think of it as a page, use the without-slash form. If you want readers to think of it as a directory (frankly this doesn't seem likely), use the with-slash form.
And now go yell at the designers of your CMS for not enforcing a single URL format.
It's just personal preference. Either works, but the key is to allow only one OR the other, never both. And I agree w/ Frank... using 301 redirects to correct the canonicalization issue is a much better solution than using rel="canonical".
If you fix the issue w/ 301 redirects (which have been the preferred way to correct canonical issues a LOT longer than rel="canonical") then users will never again see a non-canonical URL in their browser even when they click on a non-canonical link. They are immediately redirected to the canonical form of the URL and the browser's address bar is updated to reflect the canonical URL. A big advantage of this is that virtually all links created in the future will point to the canonical URL since 99.99% of the links created on the web are created by navigating to the page to be linked, copying the URL from the browser address bar, and pasting it into HTML, code, or a CMS to create a link.
If you fix the issue w/ rel="canonical" then users will likely continue to see non-canonical URLs in their browser. This means they will likely continue to build links to the non-canonical URL. IMO rel="canonical" should be used only as a last resort if you cannot create 301 redirects or if creating redirects to solve the issue is simply too complex (such as is the case on lots of ecomm sites using many query string parameters in the URL and simply changing the order of the parameters - not even the values - creates a new URL.)
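To illustrate the query string case: the usual trick is to pick one fixed parameter order and treat that as the canonical URL, whether you then put it in rel="canonical" or redirect to it. A minimal sketch in Python, assuming sorting by parameter name is acceptable for your site; the function name and the example.com URLs are made up for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_query_url(url: str) -> str:
    """Return url with its query parameters sorted by name, then value."""
    parts = urlsplit(url)
    # keep_blank_values=True so that "?a=&b=1" does not silently lose "a".
    params = parse_qsl(parts.query, keep_blank_values=True)
    query = urlencode(sorted(params))
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))


# Same parameters, different order: both map to one canonical URL.
a = canonical_query_url("http://example.com/widgets?size=9&color=red")
b = canonical_query_url("http://example.com/widgets?color=red&size=9")
```

Here a and b come out identical, so either request can be pointed at the same canonical URL.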
Personally, if your URLs are all extensionless, I like having ALL URLs end in "/" instead of ending with what "appears" to be an extensionless filename.
Otherwise it just looks dumb. The same internal node in the URL sometimes looks like a file (e.g. "level-1" appears to be a file in /level-1) and sometimes looks like a folder ("level-1" appears to be a folder in /level-1/level-2).
example.com/folder/ for folders
example.com/page for pages.
A URL such as example.com/stuff/ with a trailing slash gives a clue that there may be more levels.
Make sure the internal navigation always links to the correct form.
Redirect requests for the incorrect form.
I think standards-wise, files are supposed to be without a slash and folders with one. Practically, it's a mixed bag as to who follows this and who doesn't, and as far as I can tell Google can tell the difference between a page and a folder (or a category, as it usually is), even if both use a slash.
I should hope that g### knows the difference between an URL and a filepath :) But it can't tell whether two different URLs yielding identical content represent one physical file or two.
We say that final / means a directory. But the word "directory" is really shorthand for "the index file of the specified directory". Even an auto-index isn't really the raw, machine-level index of a directory; it's an html file created to look that way.
>>redirect the non-canonical version of the url to the canonical version of the url using a 301 status code.
Can you teach me how to do it?
Does the adjustment apply sitewide, or do I have to set it page by page?
I am a little dumb.
You should add a rule to redirect requests for example.com/folder/index.php (and .htm, .html) to example.com/folder/
When /folder/ exists, a request for example.com/folder should already be automatically redirected to example.com/folder/
You need to add a rule so that when example.com/page/ or example.com/folder/page/ is requested, the request is redirected to example.com/page or example.com/folder/page respectively.
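The three rules above apply sitewide, not page by page. Here is a rough sketch of the logic in Python, only to show which request gets redirected where; it is not Apache configuration. REAL_DIRECTORIES and redirect_target are hypothetical names, and the hard-coded set stands in for whatever the server knows about real directories.

```python
import re

# Hypothetical: paths that are real directories on this example site.
REAL_DIRECTORIES = {"/folder"}

# Matches /index.php, /folder/index.htm, /folder/index.html and so on.
INDEX_FILE = re.compile(r"^(.*/)index\.(?:php|html?)$")


def redirect_target(path: str):
    """Return the path to 301 to, or None if path is already canonical."""
    # Rule 1: /folder/index.php (and .htm, .html) -> /folder/
    match = INDEX_FILE.match(path)
    if match:
        return match.group(1)
    # Rule 2: /folder -> /folder/ for a real directory
    # (on Apache this one normally happens automatically).
    if path in REAL_DIRECTORIES:
        return path + "/"
    # Rule 3: /page/ -> /page, but leave real directories alone.
    stripped = path.rstrip("/")
    if path.endswith("/") and stripped and stripped not in REAL_DIRECTORIES:
        return stripped
    return None
```

A request for /folder/page/ gives /folder/page, while /folder/ and /page give None, meaning serve the content as is.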
|can you teach me how to do it ? |
There is a recent thread on this exact subject in the Apache Forum, which is a much better place for the "how to" part to be explained. You can find the thread here: [webmasterworld.com...] Feel free to hop into the current thread or start another one and I'm sure someone will try to give you a hand working through things even though we don't usually code for people for a number of reasons.
the trailing slash redirect is partly handled by mod_dir:
The trailing-slash redirect turned out to be part of the problem in the thread jd linked to. Over there, the OP's remove-slash redirect included one URL that was a real, physical directory, so mod_dir kicked in and created an infinite loop. But everywhere else (and, I think, in the present thread too) the URLs are created out of whole cloth by the CMS.
Ideally you want a pattern of redirects that doesn't involve a !-d check at every stage.
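For anyone who wants to see why the loop happens, here is a toy model in Python. It is only a simulation under assumptions: mod_dir_rule imitates the add-the-slash behaviour for real directories, /shop is a made-up directory, and the "safe" rule exempts known directories by name instead of checking the filesystem on every request.

```python
# Hypothetical: the one URL on the site that is a real, physical directory.
REAL_DIRECTORIES = {"/shop"}


def mod_dir_rule(path):
    """Imitate the automatic redirect: a real directory gets its slash added."""
    return path + "/" if path in REAL_DIRECTORIES else None


def naive_remove_slash(path):
    """Remove-slash rule with no exception for real directories."""
    return path[:-1] if len(path) > 1 and path.endswith("/") else None


def safe_remove_slash(path):
    """Remove-slash rule that leaves known directories alone."""
    if len(path) > 1 and path.endswith("/") and path[:-1] not in REAL_DIRECTORIES:
        return path[:-1]
    return None


def follow(path, rules, limit=10):
    """Follow redirects. Returns (final_path, hops); final_path is None on a loop."""
    for hops in range(limit):
        # First rule that wants to redirect wins, as with ordered rewrite rules.
        target = next((t for t in (rule(path) for rule in rules) if t is not None), None)
        if target is None:
            return path, hops
        path = target
    return None, limit


looping = follow("/shop/", [naive_remove_slash, mod_dir_rule])
fixed = follow("/shop/", [safe_remove_slash, mod_dir_rule])
page = follow("/page/", [safe_remove_slash, mod_dir_rule])
```

With the naive rule, /shop/ bounces between /shop and /shop/ until the hop limit is reached. With the exemption, /shop/ is served directly and /page/ still gets its single redirect to /page.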