Forum Moderators: phranque
I want to replace the underscores in my URLs with hyphens, in both folder names and file names.
e.g.
http://www.example.com/top_news/finance_economy.html
to
http://www.example.com/top-news/finance-economy.html
I do not know how many underscores there are in the folder names and file names; the number varies from one URL to another.
Thanks
nico
[edited by: encyclo at 11:54 pm (utc) on Nov. 13, 2007]
[edit reason] switched to example.com [/edit]
In either case, you need some method of generating a 301 (Moved Permanently) response to tell search engines that the old page now lives at the new URL. There are resources on this forum to lead you to methods of doing this.
Although the recommendation is to use dashes, a recent thread here indicates that Google is now treating underscores as spaces, so the keywords will still get picked up.
The other problem *I* have found involves using dashes in a URL to direct to a dynamic script using mod_rewrite. If you have, say, a mod_rewrite rule directing this
my_widget_objects_-_large
to something **like** this
scriptname?product_id=1234
The problem becomes converting the underscores to hyphens. If the original title has a hyphen
my widget objects - large
the URL would look like this
my-widget-objects---large
and if you assume that every dash in an incoming URL should be converted back to a space, the three dashes become three spaces, which will never match the " - " in the original title.
Haven't cracked this nut yet, but it's another dimension to look at.
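To make the mismatch concrete, here is a minimal Python sketch of the round trip described above. The function names and the lookup table are hypothetical, introduced only for illustration; the point is that the naive reverse conversion loses information, whereas storing the slug next to the title (or product ID) avoids having to reverse it at all.

```python
def title_to_slug(title: str) -> str:
    """Naive slug: every space becomes a hyphen."""
    return title.replace(" ", "-")


def slug_to_title(slug: str) -> str:
    """Naive reverse: every hyphen becomes a space."""
    return slug.replace("-", " ")


title = "my widget objects - large"
slug = title_to_slug(title)        # the original hyphen ends up between two new ones
recovered = slug_to_title(slug)    # three spaces, not " - "

# Hypothetical alternative: build a slug -> title table when pages are
# published, and look incoming slugs up instead of reversing them.
slug_index = {title_to_slug(t): t for t in [title, "small blue widget"]}

print(repr(slug))
print(repr(recovered), recovered == title)
print(repr(slug_index[slug]))
```

With a table like this, the script never needs to guess which hyphens were spaces and which were real hyphens.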
When a browser or client requests one of these new links, your server must then change the URL back to the underscored form in order to look up the correct 'page data'. This can be done in mod_rewrite, but you might consider using your script to do this function as well.
Finally, you should permanently redirect requests for the old underscored URLs to the new hyphenated ones. Again, your script is probably the most effective place to do this. Replacing multiple character occurrences and replacing a variable number of character occurrences are both highly inefficient when using mod_rewrite, because mod_rewrite is not really a scripting language and does not directly support any loop constructs. Also, mod_rewrite has no way to check your database to help make decisions. Because of these factors, modifying your script to generate the redirects would likely be easier and more efficient.
Jim
[edited by: jdMorgan at 3:41 pm (utc) on Nov. 14, 2007]
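As a rough illustration of doing both steps in the script rather than in mod_rewrite, here is a minimal Python sketch. The function name `resolve` and the (status, path) return shape are made up for this example, and it assumes that the names stored on the server never contain a literal hyphen, so that every hyphen in a request can safely be mapped back to an underscore.

```python
from typing import Tuple


def resolve(path: str) -> Tuple[int, str]:
    """Decide what to do with a requested URL-path.

    Returns (301, new_path) when the request still uses underscores,
    otherwise (200, internal_path) with the underscored name to look up.
    Assumption: internal names never contain a real hyphen.
    """
    if "_" in path:
        # Old-style URL: send the client to the hyphenated form.
        return 301, path.replace("_", "-")
    # New-style URL: map back to the underscored name used internally.
    return 200, path.replace("-", "_")


print(resolve("/top_news/finance_economy.html"))
print(resolve("/top-news/finance-economy.html"))
```

Because the redirect target contains no underscores, the follow-up request takes the second branch, so this particular arrangement should not redirect to itself.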
This is easy to do.
[quote]When a browser or client requests one of these new links, your server must then change the URL back to the underscored form in order to look up the correct 'page data'. This can be done in mod_rewrite, but you might consider using your script to do this function as well.[/quote]
I don't see how I can do that in the code, as the pages with hyphens do not exist on my server. Could you give me some code that would let me do that? I need to be able to change underscores to hyphens in folder, sub-folder, and file names.
[quote]Finally, you should permanently redirect requests for the old underscored URLs to the new hyphenated ones. Again, your script is probably the most effective place to do this. Replacing multiple character occurrences and replacing a variable number of character occurrences are both highly inefficient when using mod_rewrite, because mod_rewrite is not really a scripting language and does not directly support any loop constructs. Also, mod_rewrite has no way to check your database to help make decisions. Because of these factors, modifying your script to generate the redirects would likely be easier and more efficient.[/quote]
This is easy too, but then I run into an infinite redirect loop (you're helping me with that in another discussion).
Thanks a lot
[quote]I don't see how I can do that in the code, as the pages with hyphens do not exist on my server.[/quote]
Sorry, I assumed your .asp 'pages' were dynamically-generated -- that is, all handled by one .asp script.
If this is not the case, then you will indeed need to rewrite the URLs. And that is not good, because the process is extremely inefficient and may slow your site down noticeably, necessitating an early move to a VPS or dedicated server (where you can use httpd.conf to do this a bit more efficiently).
[quote]Could you give me some code that would let me do that? I need to be able to change underscores to hyphens in folder, sub-folder, and file names.[/quote]
Our forum policy is that we do not provide a code-writing service. There are not enough 'helpers' here, and far too many requests to make that a viable option for this forum. Our chartered purpose is to help you write your own code by answering specific questions. However, if you search using most of the words in your thread's title, you will likely find several threads where this or a similar problem has been discussed, including code examples written by others.
For more information, see the documents cited in our forum charter [webmasterworld.com].
Jim
Here is the situation: I have a dynamic, database-driven site (PHP, MySQL) where URLs were previously rewritten into this format:
Some_Words_like_this.html
Lots of these links are currently present in Google's index and in other search engines. I decided to move to a new format using hyphens, so the rewritten URLs on the site now look like this:
Some-Words-like-this.html
Some of the old links might also have punctuation in the URL, like this:
Some_Words:_And_More,_Words.html (don't ask, I've just been called in to clean this mess up)
What I've got so far in .htaccess successfully converts the underscores to hyphens for all the old-style incoming links, then does the proper 301 to the new URL format. But I also want to remove any ":" or "," in the old-style incoming links, and I haven't quite got it.
Here is what I have which handles the underscores alone:
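# Ignore image paths
RewriteCond %{REQUEST_URI} !^(.*)\.(jpg|gif)$ [NC]
# Match a path containing an underscore; the greedy first group
# means %1 is everything before the LAST underscore, %2 everything after
RewriteCond %{REQUEST_URI} ^(.*)_(.*)$
# Replace that one underscore with a hyphen and remember the result
RewriteRule ^.*$ %1-%2 [E=space_replacer:%1-%2]
# If a replacement was stored (and it contains no literal space), use it
RewriteCond %{ENV:space_replacer} !^$
RewriteCond %{ENV:space_replacer} !^.*\ .*$
RewriteRule ^.*$ %{ENV:space_replacer} [L]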
The above does the trick nicely for an old incoming link of Some_Words_like_this.html converting it to Some-Words-like-this.html
But for an old-style incoming link like Some_Words:_and_some_more_text.html, it replaces the underscores with hyphens just fine but leaves the ":" intact, so it ends up like this:
Some-Words:-and-some-more-text.html
I feel like I am close, but I can't figure out how to get rid of characters like ":" or ",".
Many thanks in advance to anyone who sees the light that I am missing and helps me keep what little hair I have left :)
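For comparison, the clean-up being asked for is only a few lines when done in a script instead of mod_rewrite. This is a minimal Python sketch, not the poster's code: the function name `clean_slug` is hypothetical, only ":" and "," are stripped because those are the two characters mentioned above, and collapsing repeated hyphens is an added assumption about what the new URL format should look like.

```python
import re


def clean_slug(path: str) -> str:
    """Strip ':' and ',' and convert underscores to hyphens."""
    # Remove the unwanted punctuation outright.
    path = re.sub(r"[:,]", "", path)
    # Convert every remaining underscore in one pass.
    path = path.replace("_", "-")
    # Assumption: runs of hyphens left behind should collapse to one.
    return re.sub(r"-{2,}", "-", path)


print(clean_slug("Some_Words:_And_More,_Words.html"))
print(clean_slug("Some_Words:_and_some_more_text.html"))
```

The script would then issue the 301 to the cleaned URL whenever the cleaned form differs from the requested one.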
Hooter,
Thanks for your code.
2 questions:
1. How would you permanently redirect (301) all your files with underscores to the hyphenated versions in the same .htaccess?
2. How would you make your code work for files located in any of these three locations (the names of the folders can change)?
http://www.example.com/news/top_news
http://www.example.com/news/top_news/top_stories
http://www.example.com/news/top_news/top_stories/my_file.asp
Thanks
nico
[edited by: jdMorgan at 2:16 pm (utc) on Nov. 15, 2007]
[edit reason] example.com [/edit]
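On the folder-depth question: if the conversion is applied to the whole URL-path in a script, the number of folders and the number of underscores in each stop mattering. Here is a minimal Python sketch under that assumption; the function name `hyphenate_url` is hypothetical, and it deliberately leaves the query string alone so that parameter names such as product_id keep their underscores.

```python
from urllib.parse import urlsplit, urlunsplit


def hyphenate_url(url: str) -> str:
    """Replace underscores with hyphens in the path part of a URL only."""
    parts = urlsplit(url)
    # Every folder, sub-folder and file name is covered in one pass.
    new_path = parts.path.replace("_", "-")
    # SplitResult is a namedtuple, so _replace returns an updated copy.
    return urlunsplit(parts._replace(path=new_path))


for old in (
    "http://www.example.com/news/top_news",
    "http://www.example.com/news/top_news/top_stories",
    "http://www.example.com/news/top_news/top_stories/my_file.asp",
):
    print(hyphenate_url(old))
```

The result would be used as the target of the 301 whenever it differs from the requested URL.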