I have tried this
#RewriteRule ^/(.*)\+(.*)$ [%{HTTP_HOST}...] [R=301,L]
For example, if the URL contains characters like ~!@#$% etc. or digits 0-9, they should simply be removed.
For example, [sitename.com...]
will become
[sitename.com...]
Also, is there a good tutorial available? I haven't found one.
RewriteCond ^([^\ ]*)\ (?)$ $1$2
RewriteRule (.*) [%{HTTP_HOST}...] [R=301,L]
This is also not working.
Removing path parts is easy if they are always in a fixed position in the URL, or are easily matched to a pattern.
If the extra characters are random in their position, and are random characters themselves, then the problem becomes a lot more difficult... any .htaccess redirect code is likely to be very very inefficient - and you absolutely do not want to have 'chained' redirects - where you have one redirect after each single character fixed. For 10 fixes, you would have 10 chained redirects and the browser or user agent would likely give up after several redirects and NOT access the content at all.
In this case, you might be better to look at using a RewriteMap but be aware that your HOST will likely need to set that up for you.
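As a rough illustration only, here is one shape a RewriteMap setup could take. This sketch assumes a plain-text (txt:) map of known bad paths, which is just one of the available map types; a prg: map backed by an external program would be needed to clean arbitrary characters. The map name fixurl, the file path, the sample entry, and the host www.example.com are all placeholders, not anything from your site.

```apache
# --- Server or virtual-host configuration (RewriteMap is not allowed in .htaccess) ---
# fixurl.map is a hypothetical text file with one "bad-path good-path" pair per line, e.g.
#   my~page!.html   mypage.html
RewriteMap fixurl "txt:/path/to/fixurl.map"

# --- Same virtual host, or .htaccess once the map has been defined ---
RewriteEngine On
# Look the requested path up in the map; an unlisted path yields an empty
# string, so the condition fails and the request is left alone.
RewriteCond ${fixurl:$1} ^(.+)$
# One single 301 straight to the corrected URL (placeholder host name)
RewriteRule ^/?(.*)$ http://www.example.com/%1 [R=301,L]
```

Because the lookup returns the fully corrected path in one step, the client sees exactly one redirect no matter how many characters were wrong.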
I would strongly advise anyone contemplating the use of mod_rewrite to read the Apache mod_rewrite documentation [httpd.apache.org] thoroughly. Mod_rewrite is a powerful module which modifies your server configuration; one single typo can take down your server (if you are lucky), or it can quietly destroy your search engine rankings over time. As such, it cannot be treated casually, and cutting and pasting code you do not fully understand is a sure recipe for disaster.
Additional resources are available in our Apache Forum Charter, and examples can be found in our Apache Forum Library. Links to these resources appear at the top left of every page in this forum.
Jim
I can remove all the spaces from the URL with this:
# Remove spaces, one per pass ([N] restarts the ruleset until none remain)
RewriteRule ^([^\ ]*)\ (.*)$ $1$2 [E=rspace:yes,N]
# Redirect to update URL in search engine listings and browsers
RewriteCond %{ENV:rspace} yes
RewriteRule (.*) [%{HTTP_HOST}...] [R=301,L]
I can do the same for underscores by just replacing rspace with unscors,
but what about the rest of the characters, like 0-9 and ~!@# etc.?
Move any URL with any character you don't like into a server variable.
Replace all instances of the first unwanted character in the server variable (multiple RewriteCond/RewriteRule steps as shown in that other thread, enough to allow for all instances you might expect to replace).
Then repeat with a set of rules for the next unwanted character, updating the same variable as in the first replacement ruleset.
Continue with a ruleset for each unwanted character.
When all unwanted characters have been replaced in the server variable, then and only then do the external redirect.
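The steps above might be sketched roughly as follows. This is untested and makes several assumptions: the rules sit in a top-level .htaccess, "~" and "!" stand in for the unwanted characters, at most two of each are expected per URL, and both the variable name fixpath and the host www.example.com are placeholders.

```apache
RewriteEngine On

# Step 1: if the requested path contains an unwanted character,
# copy the whole path into an environment variable (fixpath is a made-up name)
RewriteRule ^(.*[~!].*)$ - [E=fixpath:$1]

# Step 2: remove "~" from the variable, one instance per Cond/Rule pair.
# Repeat the pair as many times as the number of instances you expect.
RewriteCond %{ENV:fixpath} ^([^~]*)~(.*)$
RewriteRule ^ - [E=fixpath:%1%2]
RewriteCond %{ENV:fixpath} ^([^~]*)~(.*)$
RewriteRule ^ - [E=fixpath:%1%2]

# Step 3: same again for the next unwanted character, "!",
# updating the same variable
RewriteCond %{ENV:fixpath} ^([^!]*)!(.*)$
RewriteRule ^ - [E=fixpath:%1%2]
RewriteCond %{ENV:fixpath} ^([^!]*)!(.*)$
RewriteRule ^ - [E=fixpath:%1%2]

# Step 4: only now issue a single external redirect to the cleaned path.
# The variable is empty for clean URLs, so they never reach this redirect.
RewriteCond %{ENV:fixpath} ^(.+)$
RewriteRule ^ http://www.example.com/%1 [R=301,L]
```

If a URL holds more instances of a character than there are Cond/Rule pairs, the redirect target will still contain some of them and a second redirect will follow, so the number of pairs has to cover the worst case you expect.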
This will be a ton of code, and very, very slow and inefficient. I suggest that you re-evaluate exactly what is causing this problem, fix the root cause, and then "repair" only the critical URLs which have errors, letting the rest return a 404 Not Found error.
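Repairing only the critical URLs could look something like the sketch below. The paths and the host www.example.com are invented purely for illustration; you would list your own known-bad URLs, one rule each.

```apache
RewriteEngine On

# Hypothetical examples: each known-bad path gets one direct 301 to its fixed form
RewriteRule ^my~page\.html$ http://www.example.com/mypage.html [R=301,L]
RewriteRule ^widgets_2/index\.html$ http://www.example.com/widgets/index.html [R=301,L]

# No catch-all cleanup rule follows: any other malformed URL is left alone
# and gets the normal 404 Not Found response if nothing exists at that path.
```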
If you depend on such a far-reaching URL "repair" solution, you are likely to have problems in the future in addition to severe server-performance-related problems. For example, you will not be able to successfully set up a Google or Yahoo Webmaster Tools account, because doing so requires that you place a file on your server with 16 digits in its name. The code we're talking about here would strip those numbers from the URL and make it impossible to use that file to validate your site with Google.
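If you go ahead with character stripping anyway, one way to protect such a file is an exemption rule placed ahead of all the cleanup rules. This is only a sketch: the file name pattern below is an assumption about what a verification file looks like, so adjust it to the exact name you are given.

```apache
RewriteEngine On

# Leave verification files untouched: "-" means no substitution, and [L]
# stops rule processing for this request, so later cleanup rules never see it.
# The name pattern is an assumption; match it to your actual file name.
RewriteRule ^google[0-9a-f]+\.html$ - [L]

# ... character-stripping rules would follow here ...
```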
Jim