For example, if the URL-encoded URL http://example.com/foo/widgets%3A%20blue.html was in a sitemap, Google would request it as http://example.com/foo/widgets:%20blue.html.
To prevent duplicate content penalties, and to consolidate every variant of a page onto a single URL, I had coded a 301 redirect from URI requests containing ':' to the same URI using '%3A'. This threw Google into a redirect loop, because Google's bot would still make its next request using ':'.
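To make the loop concrete, here is a rough Python sketch of the two behaviours described above. The function names are made up for illustration, and the crawler side is only a model of what was observed in this thread (the bot turning %3A back into ':'), not a statement of how Googlebot is actually implemented.

```python
from urllib.parse import unquote


def crawler_normalize(path: str) -> str:
    """Hypothetical model of the crawler: '%3A' is decoded back to ':'."""
    return path.replace("%3A", ":").replace("%3a", ":")


def server_redirect(path: str):
    """The 301 rule from the post: ':' in the path redirects to '%3A'.

    Returns the redirect target, or None when the page is served directly.
    """
    if ":" in path:
        return path.replace(":", "%3A")
    return None


def follow(path: str, max_hops: int = 5):
    """Follow redirects as the modelled crawler would, up to max_hops."""
    hops = 0
    requested = crawler_normalize(path)
    while hops < max_hops:
        target = server_redirect(requested)
        if target is None:
            return hops, requested  # page served, no loop
        hops += 1
        requested = crawler_normalize(target)  # bot re-decodes the target
    return hops, requested  # gave up: redirect loop


# Both spellings decode to the same resource, which is why they are duplicates.
same = unquote("/foo/widgets%3A%20blue.html") == unquote("/foo/widgets:%20blue.html")
looped = follow("/foo/widgets%3A%20blue.html")
clean = follow("/foo/plain.html")
```

With these assumptions, any path containing ':' uses up every allowed hop without ever being served, while a path with no ':' is served on the first request.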
My method for dealing with this issue has been to stop redirecting requests with URIs containing ':' to URIs using '%3A', and instead to use the following in my HTML header:
<link rel="canonical" href="http://example.com/foo/widgets%3A%20blue.html">
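If the pages are generated by a script, the tag can be built from the raw chemical name rather than typed by hand. A minimal Python sketch, assuming the page is served over plain http and that the helper name `canonical_link` is just an illustration (it is not part of the site's actual code):

```python
from html import escape
from urllib.parse import quote


def canonical_link(host: str, raw_path: str) -> str:
    """Build a rel="canonical" tag with ':' and spaces percent-encoded."""
    # safe="/" keeps path separators but encodes ':' as %3A and ' ' as %20.
    encoded = quote(raw_path, safe="/")
    href = f"http://{host}{encoded}"
    # Escape the attribute value in case the path contains '&' or quotes.
    return f'<link rel="canonical" href="{escape(href, quote=True)}">'


tag = canonical_link("example.com", "/foo/widgets: blue.html")
```

Here `tag` comes out as the same element shown above, so both the ':' and '%3A' requests point at one preferred URL without any redirect being involved.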
Now I have to wait a week or two for those errors to clear themselves out of my WMT crawl error log. grrrr....
Like it or not we are *not* free to use 'just any' characters anywhere we want in URIs.
It is actually a chemical database, with each chemical name being the "filename". This led to some really messed up file names, but like I said, this section of my website is almost ten years old, so there is a limit to what I can do to fix things without taking some serious SERP hits.
Remember, way back when I added this section of my site, query strings weren't treated as nicely by search engines as "real" web pages were, so it was important to rewrite queries into the main URI. Even today there is debate over whether one should rewrite queries into the main URI.
As they say, don't fix what ain't broke. The stupid %3A issue was a break that had to be fixed, but that wasn't a RewriteRule problem.