Forum Moderators: phranque
RewriteEngine On
RewriteRule ^([^.]+\.html)\w+$ /$1 [R=301,L]
RewriteCond %{REQUEST_URI} ^([^.]+\.html)
RewriteRule \.html. http://www.example.com%1 [R=301,L]
Note the absence of a closing anchor: all you really need is ".html with more stuff after it", and you don't even need to specify what the more stuff is.
RewriteCond %{REQUEST_URI} ^([^.]+\.htm)
RewriteRule \.htm. http://www.example.com%1 [R=301,L]
That's assuming you don't have mixed htm and html on the same site. (Ugh! What a mess!) If you did, the rule would have to be:
RewriteCond %{REQUEST_URI} ^([^.]+\.html?)[A-Z]
RewriteRule \.html?[A-Z] http://www.example.com%1 [R=301,L]
... And if the bad URLs don't always start with a capitalized word, then I wash my hands of you :)
Replacing \.html with \.htm ...the first version had all files ending with .htm (instead of .html).
I am flooded with bad referrers from hotels-in.xyz
66.249.64.126 - - [01/Mar/2015:20:39:52 -0800] "GET /ebooks/perez/Perez.htmlOnce HTTP/1.1" 404 1412 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
....
66.249.67.42 - - [05/Apr/2015:15:34:11 -0700] "GET /hovercraft/april_blues.htmlMr HTTP/1.1" 404 1412 "-" "Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5376e Safari/8536.25 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
I think that's the same kind of thing OP is describing.
Where do "Once" and "Mr" come from?
Maybe a bug in a version of a popular CMS/plugin?!
Is that the actual domain from which you are getting "incorrectly formed" traffic? Have you confirmed that there is nothing on your site that might have resulted in these malformed links? Otherwise it sounds like quite a fundamental error on their part, which is likely to have resulted in a lot of corrupt outbound links.
In this case I have these ones: hotels-in.xyz and top1hotel.com
Fatal error: Call to undefined function view_index() in /var/www/html/controller/index.php on line 26
RewriteRule ^([^.]+\.html)\w+$ /$1 [R=301,L]
Is there a way to use this code, but with a broader pattern that would also match special characters?
RewriteRule ^([^.]+\.html). /$1 [R=301,L]
Are you saying that this code would not work in this case: .html?abc
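On the .html?abc case: as far as I know, the pattern in a RewriteRule is only ever matched against the URL-path, so anything after the ? never reaches it and has to be tested with a RewriteCond on %{QUERY_STRING} instead. A minimal sketch, not a drop-in fix, under three assumptions of mine: the rules live in the site's root .htaccess, no .html page on the site legitimately takes a query string, and the server is Apache 2.4 or later (the QSD flag does not exist in 2.2, where you would end the target with a bare ? to get the same effect). The hostname is the www.example.com placeholder used earlier in the thread.

```apache
# Sketch only: assumes any query string on a .html URL is junk.
RewriteEngine On

# Act only when a non-empty query string is present.
RewriteCond %{QUERY_STRING} .
# Redirect to the same path with the query string discarded (QSD, Apache 2.4+).
RewriteRule ^([^.]+\.html)$ http://www.example.com/$1 [R=301,L,QSD]
```

Without QSD (or the trailing ? on 2.2) the original query string would be carried over to the redirect target, and the request would match the same rule again in a loop. If some pages do use query strings, the condition would need to be narrowed to the junk values actually seen in the logs.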