Capitalized encoded characters causing issues

when site is set to 301 URLs to lowercase

     
5:44 pm on Feb 26, 2013 (gmt 0)

New User

joined:Feb 22, 2013
posts:4
votes: 0


Hi everyone,

Hopefully someone can help me understand what's happening here.

I have a site where all URLs were recently 301 redirected to a non-capitalized version of the URL.

A lot of the internal URLs on the site have encoded characters like %2F, and we realized this was causing all of those URLs to redirect to the %2f version.

We found Google Webmaster Tools complaining about an 'increase in not followed pages'.

When I do a 'fetch as Googlebot' on the capitalized %2F URL, the result is a 301 to %2f, all well and good. But if I then copy and paste the lowercase %2f URL and do another fetch as Googlebot, the grid that shows my requests lists it as %2F (capitalized) and shows another 301 to the lowercase %2f.

There must be something about capitalization and character encoding I don't understand...
10:14 pm on Feb 26, 2013 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13441
votes: 390


example.com/page
and
example.com/Page

are different URLs.

%2F and %2f are the same character. But not all functions can decode both.

Google probably has its own internal function that regularizes [a-f] in encodings to [A-F]. (My logs apparently do the same-- or possibly it's the server itself-- because lower-case letters don't seem to occur.)
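To make the suspected loop concrete, here is a minimal Python sketch. It assumes, as guessed above, that the client uppercases the hex digits of every percent-escape before sending the request (RFC 3986 recommends uppercase as the normalized form; whether Googlebot really does this is not confirmed anywhere in this thread). The function names and the example.org target are made up for illustration.

```python
import re
from urllib.parse import unquote

# A percent-escape: "%" followed by two hex digits in either case.
_PCT = re.compile(r"%[0-9A-Fa-f]{2}")

def uppercase_escapes(url: str) -> str:
    """Assumed client-side normalization: %2f -> %2F, rest untouched."""
    return _PCT.sub(lambda m: m.group(0).upper(), url)

def naive_lowercase(url: str) -> str:
    """What a blanket 'lowercase everything' redirect rule produces."""
    return url.lower()

# Both spellings decode to the same character.
same_char = unquote("%2F") == unquote("%2f") == "/"

requested = "/redirect?url=http%3a%2f%2fexample.org"   # hypothetical URL
as_client_sends = uppercase_escapes(requested)          # escapes become %3A%2F%2F
redirect_target = naive_lowercase(as_client_sends)      # server 301s back to %3a%2f%2f

# The 301 target is the URL we started from, and the client will
# uppercase it again on the next request: an endless 301 chain.
loops = redirect_target == requested and as_client_sends != requested
```

If both assumptions hold, `loops` is true, which would match the "fetch as Googlebot" grid showing %2F followed by yet another 301.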

You shouldn't have encodable characters in the path of your URL anyway. Do you, or are they coming in from query strings? If so, I hope it isn't really %2F, since that is the / slash, a character you wouldn't normally expect in a query string (it is legal there, but it usually means a whole path or URL is being passed along). Either way, the path and the query should be handled separately.
10:31 pm on Feb 26, 2013 (gmt 0)

New User

joined:Feb 22, 2013
posts:4
votes: 0


Hi Lucy24,

Thanks very much for your help.

Welp, in this case it's a query string which includes an off-site URL as it's a redirect target.

like, /redirect?url=http%3a%2f%2f etc.

So... that's a query string, right?
10:46 pm on Feb 26, 2013 (gmt 0)

Senior Member

WebmasterWorld Senior Member swa66 is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Aug 7, 2003
posts:4783
votes: 0


/redirect ... are you sure that whatever you have doing the "redirect" isn't actually working?
Is it validating where it redirects to?
E.g. avoiding redirecting to itself?

A 301 response from a webserver is a redirect.
11:21 pm on Feb 26, 2013 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13441
votes: 390


Overlapping swa because I detoured to look up some stuff.

Yup, that's a query string. And I guess you're stuck with the slashes :)

Since your de-capitalization is concerned only with your own URLs, there should be no need for your off-site redirect page to get involved. It may even create errors if you're changing the capitalization of some other site's page names.
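One way to keep the off-site target out of the de-capitalization is to lowercase the path only and pass the query string through untouched. The sketch below uses Python's standard `urllib.parse`; the function name and sample URL are hypothetical, and the real site's redirect code (probably PHP or Apache rules) would need its own equivalent.

```python
from urllib.parse import urlsplit, urlunsplit

def lowercase_path_only(url: str) -> str:
    """Lowercase the path component; leave the query string as it is."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(path=parts.path.lower()))

# The on-site path is lowercased, but the encoded off-site URL in the
# query keeps both its %XX escapes and the other site's capitalization.
original = "/Redirect?url=http%3A%2F%2FExample.org%2FSome%2FPage"
fixed = lowercase_path_only(original)
```

Note that any percent-escapes in the path itself would still be lowercased by this version, so it only helps when the escapes live in the query string, as they do here.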

But at this point it's no longer an HTML question but some combination of-- probably-- PHP and Apache.

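The quickest temporary fix is to put an exclusion in your redirect code that skips anything in the form %\h\h, meaning a percent sign followed by two hex digits. Write it out as %[0-9A-Fa-f]{2} if your RegEx dialect doesn't treat \h as a hex digit; in PCRE, which PHP and Apache use, \h means horizontal whitespace instead. Exact mechanics will again depend on the exact form of the redirecting function.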
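As a rough illustration of that exclusion, the Python sketch below lowercases everything except percent-escapes, so a URL that differs from its lowercase form only in the case of its %XX hex digits never triggers a redirect. The helper names are hypothetical and this is not the site's actual redirect code.

```python
import re

# Capturing group so that re.split keeps the escapes in the result list.
_ESCAPE = re.compile(r"(%[0-9A-Fa-f]{2})")

def lowercase_except_escapes(url: str) -> str:
    """Lowercase the URL but copy every %XX escape through unchanged."""
    pieces = _ESCAPE.split(url)
    # With one capturing group, odd indices hold the matched escapes.
    return "".join(p if i % 2 else p.lower() for i, p in enumerate(pieces))

def needs_redirect(url: str) -> bool:
    """Issue a 301 only if something other than an escape is capitalized."""
    return lowercase_except_escapes(url) != url

upper_hex = "/redirect?url=http%3A%2F%2Fexample.org"
lower_hex = "/redirect?url=http%3a%2f%2fexample.org"
mixed = "/Redirect?url=http%3A%2F%2Fexample.org"
```

With this check, `upper_hex` and `lower_hex` are both served as-is, while `mixed` is redirected to the `upper_hex` form with its escapes left alone.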
 
