GSC reports "redirect error" in URLs with special characters
menntarra 34
3:12 pm on Jan 3, 2020 (gmt 0)
One of my websites has URLs which contain these characters: * ² ° These URLs are working fine, not redirecting or anything, but Google Search Console lists them as "redirect errors". Why is this happening?
Well, the problems reported are different, but it might be a false positive too, just as in your linked topic.
Has anybody else ever experienced Search Console reporting problems with special characters in URLs?
not2easy
2:29 pm on Jan 4, 2020 (gmt 0)
Their "new" GSC has had false positive reporting errors since it rolled out months ago.
Most of us try to avoid using special characters in URLs simply because they have a higher possibility of being altered by the device viewing them. You need to carefully declare your pages' "charset" or their device may parse those characters to something different than what you are seeing. If your charset is anything other than a universal set, or does not match the page's declaration, then devices or robots can parse those special characters into peculiar alternatives.
menntarra 34
3:33 pm on Jan 4, 2020 (gmt 0)
Thank you for your response. I tried to take care of such problems.
- Page charset is set to UTF-8.
- The URL-encoded value is redirected to the URL-decoded value to avoid duplicates. Example: %2 redirects to *, so domain.com/something%2something is redirected to the decoded page domain.com/something*something
However, GSC marks these pages as having redirect problems, which makes no sense: domain.com/something*something
I tested several other tools to check whether the redirection works okay, and everything seems fine. I also checked all major browsers, and it is fine in those as well. I guess I just have to forget about these, even though it is annoying.
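For anyone comparing the encoded and decoded forms of such URLs, here is a minimal Python sketch of UTF-8 percent-encoding for the three characters in question. The helper name encode_path_segment is made up for illustration, and nothing here is claimed to match how GSC or any particular server treats these URLs. Note that under standard percent-encoding the asterisk becomes %2A, a percent sign followed by two hex digits.

```python
from urllib.parse import quote, unquote


def encode_path_segment(segment: str) -> str:
    """Percent-encode one path segment as UTF-8 (hypothetical helper)."""
    # safe="" means only letters, digits and "_.-~" are left unescaped.
    return quote(segment, safe="", encoding="utf-8")


# Show the escape for each character from the original question.
for ch in ["*", "²", "°"]:
    print(repr(ch), "->", encode_path_segment(ch))

# Round trip: decoding the encoded form should give back the original path.
decoded = "something*something"
encoded = encode_path_segment(decoded)
print(encoded)
print(unquote(encoded) == decoded)
```

The non-ASCII characters each produce two escapes because they occupy two bytes in UTF-8, so a crawler and a server that disagree on the byte encoding would not see the same URL.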
menntarra 34
3:44 pm on Jan 4, 2020 (gmt 0)
Well, it is still not a solution, because Google is not including these error pages in its search. Anyway, I will change my URL structure to disallow characters like these; I see no other option right now.
lucy24
6:04 pm on Jan 4, 2020 (gmt 0)
Uhm, the charset declaration has nothing to do with the URL. (It does affect display of the title, which is why you need to be sure to put your charset declaration before the <title> tag if it contains any non-ASCII characters.) The charset information--whether in-page or in the config file--is read by the browser at time of page load, at which point the URL used to reach the page is already a thing of the past.
not2easy
6:47 pm on Jan 4, 2020 (gmt 0)
My understanding of the charset meta is that the browser uses it to parse the text. The problem here is not about the visual aspect; it is more related to how Google's robots might deal with it. Many years ago I got notes from Google about my charset (waaay back when), which is why I think that their bots use it for their own parsing.
lucy24
7:05 pm on Jan 4, 2020 (gmt 0)
Sure, but that’s parsing of the page content. The original question was about URLs. You have to get past the URL to see (or robotically interpret) whatever is on the page.
Edit: I just had a thought. If a URL with unusual characters meets a canonicalization redirect--whether www or https--the response will include those unusual characters as some form of literal text. You might then run into encoding issues if the server runs on a different encoding than the one you used to create your config/htaccess, or if something gets translated the wrong way en route. In the specific case of mod_rewrite in Apache, an [NE] flag might be warranted.
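To make those two failure modes concrete, here is a rough Python sketch. It only illustrates the effects; it is not Apache's actual escaping routine, and escape_redirect_target is a hypothetical stand-in for whatever re-escapes the redirect target along the way.

```python
from urllib.parse import quote, unquote

# 1) Same character, different byte encodings, different percent-escapes.
utf8_form = quote("²", safe="", encoding="utf-8")      # two bytes in UTF-8
latin1_form = quote("²", safe="", encoding="latin-1")  # one byte in Latin-1
print(utf8_form, latin1_form)

# Decoding the Latin-1 escape as UTF-8 does not give back the original character.
print(unquote(latin1_form, encoding="utf-8") == "²")


# 2) Hypothetical stand-in for a server that escapes a redirect target again.
def escape_redirect_target(target: str) -> str:
    # "%" is not in the safe set, so an existing escape such as %2A is escaped again.
    return quote(target, safe="/:")


already_escaped = "/something%2Asomething"
escaped_twice = escape_redirect_target(already_escaped)
print(escaped_twice)

# One round of decoding now yields the escaped form, not the literal asterisk.
print(unquote(escaped_twice))
```

In the second case a client that decodes once ends up at the escaped URL rather than the literal one, which could then be redirected again. The [NE] (noescape) flag is what tells mod_rewrite not to apply that extra round of escaping to the substitution.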