Redirect URL with White Spaces at End?


Planet13

8:05 pm on Jan 2, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Hi there, Everyone:

Someone linked to my page with a bad URL: their link has a period and two spaces at the end of the href attribute. So their code looks something like this:


href="http://www.mydomain.com/mypage.html. "


People following that link get a 404 error.

I can't contact the site to ask them to change the link, so I need to do a redirect / rewrite.

The problem isn't the period that appears after .html; it's the white space that comes after that period.

Thanks in advance.

(Yes, I DID try to contact the site owner and couldn't get a response. I don't think they speak English, either.)

Planet13

8:16 pm on Jan 2, 2012 (gmt 0)


OK, well, this worked (but I don't know if it is "right" or not...)

I put the file path with the spaces in quotes, like this:

Redirect 301 "/mypage.html. " http://www.mydomain.com/mypage.html


And it seems to work ok.

Hopefully, I haven't created a BIGGER problem somewhere down the line.

g1smd

8:16 pm on Jan 2, 2012 (gmt 0)


Yes, a simple RewriteRule can fix this.

You might also want to consider redirecting a trailing comma and the like; in fact, anything that an auto-linking function on a forum or blog might accidentally include.

Use RewriteRule with pattern matching for more control. Especially avoid using Redirect or RedirectMatch if your site already has, or ever will have, one or more RewriteRules anywhere else in the configuration.
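
As a rough sketch of what such a rule could look like for the link in the original post: this assumes the rule lives in the .htaccess file in the site root (so the path is matched without its leading slash) and that mod_rewrite is enabled. The list of "stray" trailing characters in the brackets is an illustrative guess, not a tested set.

```apache
RewriteEngine On

# Hypothetical rule: if /mypage.html is followed by stray trailing
# punctuation and/or white space, redirect to the clean URL.
# \s is used for white space so the pattern contains no literal
# spaces and does not need to be quoted.
RewriteRule ^mypage\.html[\s.,;!]+$ http://www.mydomain.com/mypage.html [R=301,L]
```

The pattern is tested against the URL-decoded path, so a request that arrives as mypage.html.%20%20 should be seen as a period followed by two spaces.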

lucy24

9:51 pm on Jan 2, 2012 (gmt 0)


Hee. I know that one. On some forums, the moderators spend half their time fixing links so they don't end in a spurious punctuation mark. Never met a site that felt it was their responsibility to fix it from their end, though ;)

URLs ending in an extension like .html are the easy ones, because you can mop them up globally:

([^.]+\.html).+

redirects to

$1

That is, if there's any stuff whatsoever after the extension (other than a query string, which wouldn't count in mod_rewrite), chop it off.
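
Spelled out as a complete rule, the pattern above might look like the sketch below. The leading ^ anchor, the flags, and the www.mydomain.com target are assumptions added for illustration, and the rule is assumed to sit in the root .htaccess.

```apache
RewriteEngine On

# Capture everything up to and including .html, then require at least
# one extra character after it. Only the captured part ($1) is kept.
RewriteRule ^([^.]+\.html).+$ http://www.mydomain.com/$1 [R=301,L]
```

Two caveats with this sketch: because [^.]+ cannot cross a period, a path with a dot in a directory name will not match at all, and a URL that legitimately carries extra path information after .html would be redirected too.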

Be VERY VERY careful with spaces, because they tend to have syntactic meaning and may need to be escaped.

:: thinking about the less-than-ideal variations ::

If the spurious spaces and/or punctuation marks come at the end of the query string, add something to your query-handling routine to strip them off.

If the original link said something like "check out www.example.com!" would it reach your site or would the DNS say "never heard of 'em"?

How about "try www.example.com/, it's great!" You obviously can't say
([^.]+\.html|/).+
because all kinds of things could legitimately follow a /

If you have a blanket redirect looking for
[^\w/]$
(replacing \w with whatever form makes your server happy)
would you get false positives?