Forum Moderators: phranque


Redirecting URLs with underscores to URLs with dashes

         

carfac

7:47 pm on Sep 28, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



So I was thinking of maybe updating my site from using underscores to dashes in URLs. I guess Google likes that better....

Anyway, I have an OLD site, and a LOT of deep links to the old structure. My CMS built everything in the directory structure with underscores... so all of my hundreds of back links use underscores. So I want to automatically 301 ALL underscores to dashes. (I can rebuild everything easily with the dashes on my end.) I am just overwhelmed by the mod_rewrite syntax.

My existing structure is:

mysite.com/subdirectory/This_Directory_Name/Sub_Directory/Could_Be_Third_Level/Maybe_Even_4/

and I would like to 301 redirect that to:


mysite.com/subdirectory/This-Directory-Name/Sub-Directory/Could-Be-Third-Level/Maybe-Even-4/

There will also be these links:

mysite.com/anotherdirectory/151826-Name_Of_The_Widget_in_Question.html

which just needs to be

mysite.com/anotherdirectory/151826-Name-Of-The-Widget-in-Question.html

So does mod_rewrite do a universal substitution? There will be no more underscores, so they can ALL be dashes.... but there will be many substitutions per line....

THANKS!

DAve

phranque

8:47 pm on Sep 28, 2014 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



if the url is "public-facing" then you are asking about a redirect, not a rewrite, which is internal to your web server.
i would question the value of making this change for the underscores/dashes alone.
however, once you have made the decision to universally change urls, i wouldn't stop at changing the word separators.
you should also drop the .html file extension and fold all alphabetic characters to lower case.
handling the case and word separator redirect would best be done in a cgi or php script.
internally rewrite all requests with uppercase or underscores to this script and respond with a 301 status code and a Location: header referring to the corrected url.
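
As a rough illustration of the script described above, here is a minimal Python sketch (any CGI-capable language would do). The function names `corrected_path` and `redirect_for` are hypothetical, and the three normalisations (dashes, lower case, no .html) are simply the ones listed in this post, not a fixed recipe.

```python
import re


def corrected_path(path: str) -> str:
    """Return the corrected form of a request path (hypothetical helper)."""
    # Replace every underscore with a dash and fold letters to lower case.
    path = path.replace("_", "-").lower()
    # Drop a trailing .html extension, if there is one.
    return re.sub(r"\.html$", "", path)


def redirect_for(path: str):
    """Return (status, headers) for a 301, or None if the path is already correct."""
    target = corrected_path(path)
    if target == path:
        return None  # nothing to fix, so no redirect (avoids a redirect loop)
    # Path-only Location for brevity; prepend scheme and host for an absolute URL.
    return ("301 Moved Permanently", [("Location", target)])
```

For example, `corrected_path("/anotherdirectory/151826-Name_Of_The_Widget_in_Question.html")` gives `/anotherdirectory/151826-name-of-the-widget-in-question`.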

carfac

1:31 am on Sep 29, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks phranque- not sure if I understand half of what you said....

Yes, this is front facing.... if that means the pages are on the Internets, that is where they are...

I thank you for pointing out all the other changes. If one is going to make so many changes, I guess it would be best to do it all at once!

Dropping .html will not be a problem, I may just do that now. It is not used.

I will have to think about lower case, the case is sensitive right now in my CMS....

How important is all of this- case, html and dashes- to google?

I have a long-running site that has rated well in Goog for years... I have been online since 1996. I have thousands of back links from Wikipedia (they even have a cool shortcut for linking to my site). I figger it would be fine to make the change and just have the RewriteRules as 301s in my Apache config file....

So that is my question- I could write a substitution script in perl or php no prob.... but how do I replace "_" with "-" in a rewrite in the Apache config file?

carfac

1:38 am on Sep 29, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



See, I know I can do something like:

RewriteRule ^([^_]*)_([^_]*)$ /$1-$2 [L,R=301]

but the problem is that only changes 2 of them.... and some have 3, 4 or maybe even 15 levels of change.... that's why I keep saying substitution, I need to potentially change a bunch of instances even within a single URL...
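
To see why that pattern falls short, here is a small Python sketch that uses Python's `re` module as a stand-in for mod_rewrite's regex engine (an assumption, though for a pattern this simple the two should behave the same). The helper name `apply_rule_once` is hypothetical. Because both groups are `[^_]*`, the pattern can only match a path containing exactly one underscore, so a path with several is left alone.

```python
import re

# Same pattern as in the RewriteRule above.
ONE_UNDERSCORE = re.compile(r"^([^_]*)_([^_]*)$")


def apply_rule_once(path: str) -> str:
    """Mimic the rule's match and substitution on a path (hypothetical helper)."""
    match = ONE_UNDERSCORE.match(path)
    if match is None:
        return path  # pattern did not match: the rule would not fire
    return f"{match.group(1)}-{match.group(2)}"


print(apply_rule_once("Sub_Directory"))       # Sub-Directory
print(apply_rule_once("Name_Of_The_Widget"))  # Name_Of_The_Widget (unchanged)

# A plain string replacement handles any number of underscores in one pass.
print("Name_Of_The_Widget".replace("_", "-"))  # Name-Of-The-Widget
```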

not2easy

2:14 am on Sep 29, 2014 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



How important is all of this- case, html and dashes- to google?
Not very, not nearly as important as page speed and mobile performance from what I see. It was big news around 8 or 10 years ago that G "understood" dashes, treated them as spaces for reading URLs and preferred pages using dashes to runonpagenames and I know I've read that they "liked" dash (compared to underscores), but it was so minor and such a long time ago, I sure wouldn't start making the changes you mention unless it was desperate, especially if your CMS is expecting something else. There is some more on this topic, but I need to find specifics to do you any good. This is a rusty topic for me, sorry. I'll do some digging to see what may be useful.

smallcompany

2:35 am on Sep 29, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Study this:

[webmasterworld.com...]

If Jim was not able to provide more of the universal approach, I doubt anyone can.

Cheers!

not2easy

3:35 am on Sep 29, 2014 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



I did find the old info, it's from Matt's blog: [googlewebmastercentral.blogspot.com...]
A snippet from that article (early 2008):
Currently, dashes in URLs are consistently treated as separators while underscores are not. Keep in mind our technology is constantly improving, so this distinction between underscores and dashes may decrease over time. Even without punctuation, there's a good chance we'll be able to figure out that bigleopard.html is about a "big leopard" and not a "bigle opard." While using separators is a good practice, it's likely unnecessary to place a high priority on changing your existing URLs just to convert underscores to dashes.

lucy24

8:15 am on Sep 29, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



When you need to replace more than two or three of the same thing-- per URL-- it is not worth trying to do it in config/htaccess unless your name is jdMorgan. So the mod_rewrite part is exceedingly simple. It simply says, conditionlessly and without anchors,

RewriteRule _ /fixup.php [L]


i.e. if the page contains a lowline anywhere in its name, rewrite internally to a page containing your script. Because it's a rewrite, the server still "remembers" the requested URI and you don't even need to capture anything. (Uh... That is right, isn't it? What I know about php would fit inside a fairly small acorn.)

The great advantage of php -- or other language of your choice -- is that it can use Regular Expressions. So you just say, in a single line, "replace all occurrences of _ with -" and similarly "make all letters lower-case" and so on. At the end of this extremely short script, the php-or-whatever will issue a 301 redirect containing the new URL information.
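
A minimal sketch of such a script, written here in Python as a CGI program in place of php. It assumes Apache passes the originally requested URI in the `REQUEST_URI` environment variable, the host in `HTTP_HOST`, and `HTTPS=on` for secure requests; the function name `cgi_redirect` is hypothetical. Only the underscore replacement is shown, and the query string is left untouched.

```python
import os
import re
import sys


def cgi_redirect(request_uri: str, host: str, scheme: str = "http") -> str:
    """Build the CGI response headers for a 301 to the dashed URL."""
    # Split off the query string so that only the path is changed.
    path, sep, query = request_uri.partition("?")
    # The single-line substitution: replace all occurrences of _ with -.
    new_path = re.sub("_", "-", path)
    location = f"{scheme}://{host}{new_path}{sep}{query}"
    return f"Status: 301 Moved Permanently\r\nLocation: {location}\r\n\r\n"


if __name__ == "__main__":
    # Assumed environment variables; the defaults are placeholders for local runs.
    uri = os.environ.get("REQUEST_URI", "/")
    host = os.environ.get("HTTP_HOST", "localhost")
    scheme = "https" if os.environ.get("HTTPS") == "on" else "http"
    sys.stdout.write(cgi_redirect(uri, host, scheme))
```

The absolute URL in the Location header is a deliberate choice: it keeps the response unambiguous as an external redirect.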

Don't be fooled by the [L] flag. Since the internal rewrite ends up issuing an external redirect, position it among your other redirects.

That's assuming you want to do this at all. Honestly it's hardly worth the trouble unless you've decided that lowlines give you the fantods and it's time to be rid of them.

carfac

12:35 pm on Sep 30, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks. I think I will probably just not do this.

slipkid

9:18 pm on Sep 30, 2014 (gmt 0)

10+ Year Member



A couple of years ago I changed all my pages from low lines to dashes. Made absolutely no difference in the SERPs.