Rewriting malformed URLs in incoming links

.htaccess

         

geekay

6:56 pm on Nov 6, 2015 (gmt 0)


I am trying to rewrite malformed URLs in a few links to my site but all my rule experiments, like the one below, result in a 500 error.
RewriteRule ^(.*)\.html(\.|\)\.)$ $1.html [R=301, L]

The erroneous external links end in "html." or in "html)." and I would like to redirect them to the URL ending in plain "html".
But maybe it is simply impossible to avoid influence from normal end-of-sentence-periods that are not linked.

lucy24

7:41 pm on Nov 6, 2015 (gmt 0)


There's a simpler way.
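RewriteCond %{REQUEST_URI} ^(.+\.html)
RewriteRule \.html. http://www.example.com%1 [R=301,L]
The Condition and %1 (instead of a direct $1) may seem redundant. They save your server the extra work of capturing when the request doesn't include something after the "html", or when the request wasn't for a page in the first place: mod_rewrite tests the rule's pattern first and only evaluates the Condition when that pattern matches. Because %{REQUEST_URI} begins with a slash, %1 already carries it, so the target has no extra slash after the domain.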

If your URLs never ever contain literal periods (that is, a period within the name of a file or directory), change .+ in the condition to [^.]+ for added efficiency. Periods in this situation are perfectly legal (witness apache dot org itself!), but if you already know that they can't occur, a lot of rules can be streamlined.
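With that substitution, the pair of lines might look like the sketch below. It assumes that no file or directory name on the site contains a period, and www.example.com is only a placeholder for your own hostname.

```apache
# Assumption: no file or directory name contains a literal period.
# [^.]+ stops at the first period, so the regex engine has less backtracking to do.
RewriteCond %{REQUEST_URI} ^([^.]+\.html)
# %1 starts with the slash from REQUEST_URI, so none is added after the hostname.
RewriteRule \.html. http://www.example.com%1 [R=301,L]
```

A request for /dir/page.html). would then be redirected to http://www.example.com/dir/page.html, while a plain /dir/page.html never reaches the Condition because the rule's pattern needs at least one character after "html".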

I intentionally wrote . rather than \. after "html" in the rule's pattern because you may as well make an all-encompassing rule that says "if there's any stuff whatsoever after the .html extension".

:: detour to docs ::

If your quoted rule was a direct cut-and-paste, the 500 error is because you've got a space in the middle of your flags list. Spaces have syntactic meaning in mod_rewrite.

In any case, an external redirect should always start with the full protocol-plus-domain.
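For comparison, here is a sketch of the rule from the question with those two points applied. It assumes the rule lives in the .htaccess file at the site root, and www.example.com is again a placeholder hostname.

```apache
RewriteEngine On
# No space inside the flag list, and the target starts with protocol plus domain.
# In .htaccess the pattern is matched against the path without its leading slash,
# so a slash is written before $1 in the target.
RewriteRule ^(.*)\.html(\.|\)\.)$ http://www.example.com/$1.html [R=301,L]
```

Unlike the catch-all version, this one only redirects requests ending in "html." or "html).", and it captures on every request that gets this far.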

"normal end-of-sentence-periods that are not linked"

A common problem with automatically generated links. Your solution is a reasonable one.

geekay

10:29 am on Nov 7, 2015 (gmt 0)


Thank you, lucy24, for both the code and the explanation. I hope your answer will help other webmasters, too, whose sites have been gifted with a highly valuable but, alas, malformed link.