| 5:20 pm on Jun 29, 2012 (gmt 0)|
RewriteRule ^Product/page/(.*)$ /product/page/$1 [R=301,L]
| 5:36 pm on Jun 29, 2012 (gmt 0)|
Add the protocol and domain name to the rule target.
Make sure this rule is before your non-www/www redirect.
If the (.*) bit is always a number, use ([0-9]+) instead. No need to redirect duff requests.
You might need a separate rule before this to handle page 1 properly.
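Putting those points together, the rule might look something like the sketch below. It assumes the rules live in .htaccess (as the original pattern without a leading slash suggests), that the page value is always numeric, and that www.example.com stands in for the real canonical hostname, which is not given in this thread. The separate page 1 rule is not shown because its intended target is not stated.

```apache
# Sketch only: www.example.com is a placeholder hostname (an assumption).
RewriteEngine On

# Redirect the mixed-case path straight to the canonical lowercase URL.
# Naming the protocol and host in the target means a request arriving on
# the non-www host is fixed in one redirect instead of two.
# ([0-9]+) matches numeric page values only; other requests fall through.
RewriteRule ^Product/page/([0-9]+)$ http://www.example.com/product/page/$1 [R=301,L]

# The general non-www to www redirect comes after the rule above.
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

With this ordering, a request for example.com/Product/page/12 is matched by the first rule and sent directly to the final URL.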
| 11:23 pm on Jun 29, 2012 (gmt 0)|
I had a similar issue recently, and I will say this first: Internet protocol determines that website addresses and URLs should not be case sensitive. The best solution here is to show both the upper case (or mixed case) and lower case version of the page. No redirect should be required at all.
| 11:49 pm on Jun 29, 2012 (gmt 0)|
The hostname should not be case sensitive.
Paths and filenames are case sensitive.
Apache gets this right. IIS server software and badly coded CMS software gets it wrong.
| 4:03 am on Jun 30, 2012 (gmt 0)|
|The best solution here is to show both the upper case (or mixed case) and lower case version of the page. No redirect should be required at all. |
That causes canonicalization problems in the search engines as those are two distinctly different paths showing the same content.
The redirect to the preferred path solves that problem.
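Where every canonical path on the site is lowercase, one way to apply that redirect across the board is mod_rewrite's built-in tolower map. This is a rough sketch rather than anything posted in this thread. It assumes access to the server or virtual-host configuration (RewriteMap cannot be declared in .htaccess), that www.example.com is a placeholder for the canonical host, and that no real file on disk has uppercase letters in its name. The map name lc is arbitrary.

```apache
# Server or virtual-host context only; RewriteMap is not valid in .htaccess.
# www.example.com and the map name "lc" are placeholders (assumptions).
RewriteEngine On

# Declare a map backed by the internal tolower function.
RewriteMap lc int:tolower

# Act only when the requested path contains an uppercase letter.
RewriteCond %{REQUEST_URI} [A-Z]

# In this context the matched path includes the leading slash, so the
# lowercased capture is appended directly to the host name.
RewriteRule ^(.*)$ http://www.example.com${lc:$1} [R=301,L]
```

The query string is passed through unchanged, so only the path is normalized.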
| 6:39 pm on Jul 1, 2012 (gmt 0)|
|That causes canonicalization problems in the search engines as those are two distinctly different paths showing the same content. |
I have to say first I have NEVER seen a canonical issue with the SAME PATH using capitalization in urls.
A static URL would show the same content regardless of capitalization, and since rewrites are meant to emulate static URLs, so should they.
|Apache gets this right. IIS server software and badly coded CMS software gets it wrong. |
I totally agree.
| 6:42 pm on Jul 1, 2012 (gmt 0)|
|I have to say first I have NEVER seen a canonical issue with the SAME PATH using capitalization in urls. |
I have. I'm in the middle of dealing with a site that has 80 000 duplicate content URLs, a sizeable proportion of them caused by casing issues.
| 6:50 pm on Jul 1, 2012 (gmt 0)|
g1smd, if that's the case then the search engine is at fault. The Uniform resource locator should not be case sensitive as outlined by Tim Berners Lee. Static URLs are not case sensitive, so why on earth should rewrites be?
| 8:28 pm on Jul 1, 2012 (gmt 0)|
[w3.org...] states that hostnames are not case sensitive, but the rest of the URL is.
|The Uniform resource locator should not be case sensitive as outlined by Tim Berners Lee. |
Perhaps you can point out where that is stated.
| 8:56 pm on Jul 1, 2012 (gmt 0)|
I always use lowercase URLs.
| 9:23 pm on Jul 1, 2012 (gmt 0)|
|A full BNF description of the URL syntax is given in Section 5. |
In general, URLs are written as follows:
<scheme>:<scheme-specific-part>
A URL contains the name of the scheme being used (<scheme>) followed
by a colon and then a string (the <scheme-specific-part>) whose
interpretation depends on the scheme.
Scheme names consist of a sequence of characters. The lower case
letters "a"--"z", digits, and the characters plus ("+"), period
("."), and hyphen ("-") are allowed. For resiliency, programs
interpreting URLs should treat upper case letters as equivalent to
lower case in scheme names (e.g., allow "HTTP" as well as "http").
This to me says anything following the colon should not be case sensitive.
| 9:28 pm on Jul 1, 2012 (gmt 0)|
NO. It's talking about "scheme names", allowing http:// HTTP:// and HttP:// as equivalent.
In HTTP the hostname is also case insensitive.
In HTTP the path and file are case sensitive.
| 9:35 pm on Jul 1, 2012 (gmt 0)|
The great thing about forums is the ability to share and learn. I had always interpreted this as applying to all URLs, but g1smd has made me rethink this. Its an example of a shared opinion opening up a previously fixed opinion.
Thanks g1smd for making me see this from a new perspective.
| 9:38 pm on Jul 1, 2012 (gmt 0)|
Mixed-case paths are an excellent method of detecting lame bots.
| 10:00 pm on Jul 1, 2012 (gmt 0)|
Good habit. But the most important thing is to treat everything as if it's case-sensitive, even if your current server (or your home computer) doesn't care.
All subsequent posts came in after I opened the tab, so there may be overlap.
How 'bout this [w3.org]? (Boldface in the original; note final sentence. Within this selection, text is continuous.)
|When is a URI "the same URI"? |
Two URIs are the same if (and only if) they are the same character for character.
Two URIs which are different may in fact be equivalent, in that they may refer to the same thing, and give the same result in all operations. In some cases any agent looking at two URIs can deduce, from knowledge of the various web standards, that they must be equivalent, in that they must refer to the same thing. For example, HTTP URIs contain domain names, and the Domain Name System is case-insensitive. Therefore, while it is normal practice to use lower case for domain names, any agent which comes across two URIs which differ only in the case of the domain name can conclude that they must refer to the same thing. In another case, a client agent may use out-of-band information about a web site to know that its URI paths are case-invariant, or that URIs ending in "/" and "/index.html" are equivalent. It is bad engineering practice to make new protocols require such processing.
There are a long series of such algorithms. Which ones an agent can apply depends on what information it has to hand, and depend on what knowledge of which protocols has been programmed into it. New schemes may be defined in the future, for which different forms of canonicalization can be done. There is, therefore, no definitive canonicalization algorithm for URIs. Generic URI handling code should handle URIs as case-sensitive character strings.
| 4:09 am on Jul 2, 2012 (gmt 0)|
|Its an example of a shared opinion opening up a previously fixed opinion. |
The canonicalization issue isn't an opinion, it's a fact.
Implemented correctly, this will not be a problem.