Removing spaces from urls


8:45 pm on Nov 9, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Sept 28, 2001
posts: 779
votes: 0


There are a few sites that seem to link to lots of my content, but they must have some buggy code, because they keep inserting a space in the links.

For example, a url like 'test.com/thisisapage' will be linked to like 'test.com/thisisap age'

As a result, they generate a lot of 404s. Getting in touch with the people who own those sites doesn't seem to be possible.

Is there something I can put in my conf file, or alternatively some PHP code in my 404 file, that removes the blank space and does a redirect?

Any advice would be appreciated!
2:21 am on Nov 13, 2010 (gmt 0)

Preferred Member

10+ Year Member

joined:June 10, 2003
posts: 410
votes: 0

madmatt69 --

Sorry this one took so long to get to. In your example, the URL has an actual space character, but depending on the browser requesting the link, it could get turned into a + or a %20, both of which are used as escapes for a space character.
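
As a quick illustration of the two escapes, here is a minimal sketch using PHP's own encoders. Note that "+" normally stands for a space only in form-encoded query strings, while %20 is the percent-encoding used in paths; the sample string is taken from the example URL above.

```php
<?php
// rawurlencode() percent-encodes a space as %20 (the form used in URL paths).
$pathStyle = rawurlencode('thisisap age');

// urlencode() uses form encoding, where a space becomes +.
$formStyle = urlencode('thisisap age');

echo $pathStyle, "\n"; // thisisap%20age
echo $formStyle, "\n"; // thisisap+age
```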

If you can do this in PHP, the code is simple, e.g.:

$str = " fo o+bar%20fubar ";
$str = preg_replace('/(%20|\s|\+)/', '', $str);
// will print "foobarfubar"
echo $str;

So early in your request chain, you would do this:

// get the requested URL (path plus any query string)
$str = $_SERVER['REQUEST_URI'];
// remove all the characters you don't want
$str = preg_replace('/(%20|\s|\+)/', '', $str);
// if anything changed
if ($str != $_SERVER['REQUEST_URI']) {
    // return a 301 and stop processing this request
    header("HTTP/1.1 301 Moved Permanently");
    header("Location: http://example.com" . $str);
    exit;
}
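
One caveat with the approach above: the request URI also carries the query string, where a "+" may be a legitimate space in a parameter value. The following is only a sketch of one way to limit the cleanup to the path; the helper name strip_path_spaces is hypothetical and not part of the original suggestion.

```php
<?php
// Hypothetical helper: strip spaces (literal, %20 or +) from the path only,
// leaving any query string untouched.
function strip_path_spaces(string $requestUri): string
{
    // Split "/path?query" into at most two pieces at the first "?".
    $parts = explode('?', $requestUri, 2);

    // Same pattern as above, applied to the path piece only.
    $path = preg_replace('/(%20|\s|\+)/', '', $parts[0]);

    // Re-attach the query string unchanged, if there was one.
    return isset($parts[1]) ? $path . '?' . $parts[1] : $path;
}

echo strip_path_spaces('/thisisap%20age?q=a+b'), "\n"; // /thisisapage?q=a+b
```

The result can then be compared against the original request URI exactly as before, redirecting only when the two differ.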

It's far more difficult using RewriteRule in your Apache conf file, because mod_rewrite has no global search-and-replace: a single rule can only remove one space per pass.

1:26 am on Nov 18, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
votes: 0

Yes, perhaps the best you can do using RewriteRules is to use the RewriteRule [N] (next) flag in a three-rule set:

# Detect and remove the first space from the requested URL-path, then save it and re-start
RewriteCond %{ENV:CleanedURLpath} =""
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^%?#\ ]*)\%(25)*20([^\ ?#]*([?#][^\ ]*)?)\ HTTP/
RewriteRule ^. - [E=CleanedURLpath:%1%3,N]
# Detect and remove any subsequent space from the saved partially-corrected URL-path, then re-start
RewriteCond %{ENV:CleanedURLpath} ^([^%?#\ ]*)\%(25)*20([^\ ?#]*([?#].*)?)$
RewriteRule ^. - [E=CleanedURLpath:%1%3,N]
# Once we get here, all spaces have been removed from the
# URL-path. Invoke an external redirect if any were removed.
RewriteCond %{ENV:CleanedURLpath} ^(.+)$
RewriteRule ^. http://www.example.com/%1 [R=301,L]

The allowance for "25" preceding "20" is to handle multiply-encoded spaces, such as %2520 (a %20 whose "%" has itself been encoded as %25).
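
To make the looping behaviour easier to follow, here is a rough PHP sketch of the same idea: remove one encoded space per pass and repeat until none is left. It is an illustration only, not a translation of the rules; the function name remove_encoded_spaces is hypothetical, and unlike the rule set it does not treat the query string separately.

```php
<?php
// Hypothetical helper that mimics the rule set's loop: one removal per pass,
// repeated until no encoded space remains.
function remove_encoded_spaces(string $path): string
{
    do {
        // Matches %20, %2520, %252520, ... ; the limit of 1 removes a single
        // occurrence per pass, and $count reports whether anything was removed.
        $path = preg_replace('/%(25)*20/', '', $path, 1, $count);
    } while ($count > 0);

    return $path;
}

echo remove_encoded_spaces('thisisap%2520age'), "\n"; // thisisapage
echo remove_encoded_spaces('a%20b%252520c'), "\n";    // abc
```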

This is a fairly expensive rule-set. If you can use a more-specific pattern in the RewriteConds and RewriteRules, based on the "types" of URLs that are being mis-linked and the nature of the linking errors, then I would recommend doing so.