
Underscores Are Now Word Separators

per Matt Cutts

         

pageoneresults

3:41 pm on Jul 25, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



One key development that Matt shared with the audience was that underscores in URLs are now (or at least very soon to be) treated as word separators by Google.

Underscores are now word separators, proclaims Google
[news.com.com...]

ericfwebmaster

7:42 pm on Jul 26, 2007 (gmt 0)

10+ Year Member



dashes-look-better

nomis5

7:49 pm on Jul 26, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I've always used underscores, and in Google search results the words separated by underscores have always appeared in bold if they were part of the search query. So I have assumed that Google understood underscores as separators for years and years. Otherwise, why embolden the words separated by underscores? Beware of making any changes based on this post, dancing to the tune of the piper may lead you astray.

pageoneresults

8:05 pm on Jul 26, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Beware of making any changes based on this post, dancing to the tune of the piper may lead you astray.

I don't think anyone is making any changes. The whole purpose of this topic is to let Webmasters know that they "don't need to make the change" after all. But, up until that announcement was made, Google had treated underscores and hyphens differently.

g1smd

8:53 pm on Jul 26, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



>> words separated by underscores have always appeared in bold if they were part of the search query <<

That has been proved on many occasions to simply be a last-minute styling factor applied to the HTML page that is returned, and not in any way a reflection of any internal indexing or ranking factors.

danny

5:48 am on Jul 27, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Underscores sometimes become obscured in hyperlinks.

Yes, I've noticed a few of those (requests with %20s where there should be _s) in my logs - not a major problem, but still a problem.

My use of mixed case is a worse problem here, though the requests for all-lowercase files all seem to come from automated software, not web browsers. (My hosting service doesn't have mod_speling enabled, presumably because of the load.)

And no, I'm not Oprah, thankfully!

annej

4:23 pm on Jul 27, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I wish they would make up their mind. I'd used underscores for years until MC recommended dashes. I didn't change any old URLs but started using the dash on new pages which resulted in mixed _ and -. Looks terrible but I'm not about to change them now. What a pain.

mona

4:23 pm on Jul 27, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



dashes-look-better

that_is_a_matter_of_opinion ;-) Anyone know why Google is making the change?

jomaxx

5:28 pm on Jul 27, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I don't know why they're making the change, or why now, but underscores are obviously used as word breaks so it's always mystified me why Google refused to see them that way.

The only exception I can think of is certain standard programming language variables and functions. And even then, while the underscore does have significance, it's also a word break; e.g. "HTTP_USER_AGENT".

tedster

5:33 pm on Jul 27, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Look at FrontPage extensions on a server - the underscore is not a word break, it's the first character of the folder name. There are many such examples that Google needs to accommodate.

pageoneresults

5:55 pm on Jul 27, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



the underscore is not a word break, it's the first character of the folder name.

It also marks a folder as private in FrontPage. Typically anything beginning with an underscore in FrontPage is a secure folder. For example, /_private/.

g1smd

6:51 pm on Jul 27, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Matt Cutts explained the reasons for the "underscore as character" parsing some time ago.

It is to help people looking for programming code examples, so that HTTP_HOST and the like are treated as one word.
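
To make the difference concrete, here is a minimal sketch of the two parsing choices. It is only an illustration using Python regexes, not Google's actual tokenizer, and both function names are made up for the example.

```python
import re

def tokens_underscore_as_char(text):
    # \w matches letters, digits AND the underscore,
    # so HTTP_HOST survives as a single token.
    return re.findall(r"\w+", text)

def tokens_underscore_as_separator(text):
    # [^\W_] is "a word character that is not an underscore",
    # so underscores split tokens just like hyphens do.
    return re.findall(r"[^\W_]+", text)

sample = "HTTP_HOST blue-widgets blue_widgets"
print(tokens_underscore_as_char(sample))
print(tokens_underscore_as_separator(sample))
```

Under the first rule a page named blue_widgets would only match a search for the whole string "blue_widgets"; under the second it matches "blue" and "widgets" separately, at the cost of breaking up identifiers such as HTTP_HOST.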

jomaxx

7:03 pm on Jul 27, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



And that's not a bad reason, but to do it properly would require Google to handle all kinds of special characters (dollar signs, parentheses, periods, etc.) differently from the meaning that they commonly signify in written language.

In fact, I wish they had an interface that DID do this properly, and that allowed us to search actual HTML source code and not just the text that is rendered by the browser. AFAIK no search engine currently supports this, although I seem to recall that Alta Vista used to, back in the olden days.

DamonHD

7:42 pm on Jul 27, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Can Google's code search be made to search for regexes in .html files?

Tidal2

11:38 am on Jul 30, 2007 (gmt 0)

10+ Year Member



Anyone know why Google is making the change?

It opens up more options for domain names so it's harder (or more expensive) to buy all the options and block the competition.

loudspeaker

12:35 pm on Jul 30, 2007 (gmt 0)

10+ Year Member



Finally! It only took them 10 years to realize that in a natural (written) language the dash has a syntactic meaning while the underscore has none, so the UNDERSCORE NOT DASH should have been the word separator all along.

Better late than never, I guess.

lavazza

11:55 pm on Jul 30, 2007 (gmt 0)

10+ Year Member



Can Google's code search be made to search for regexes in .html files?

Results 1 - 10 of about 11,500 for "[+-]?[0-9]*\\.[0-9]*[eE]?[+-]?[0-9]*". (0.11 seconds)

hXXp://www.google.com/search?hl=en&safe=off
&q=%22%5B%2B-%5D%3F%5B0-9%5D*%5C%5C.%5B0-9%5D*%5BeE%5D%3F%5B%2B-%5D%3F%5B0-9%5D*%22
&btnG=Search
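
For anyone puzzled by the q= parameter above, it is just the quoted regex percent-encoded. A short sketch with Python's standard urllib.parse shows the decoding and the round trip; the variable names are only for illustration.

```python
from urllib.parse import quote, unquote

# The q= value copied from the search URL above.
encoded_q = (
    "%22%5B%2B-%5D%3F%5B0-9%5D*%5C%5C.%5B0-9%5D*"
    "%5BeE%5D%3F%5B%2B-%5D%3F%5B0-9%5D*%22"
)

# Percent-decoding gives back the quoted regex that was typed in.
decoded_q = unquote(encoded_q)
print(decoded_q)

# Encoding it again (leaving '*' unescaped, as in the URL above)
# reproduces the original parameter.
roundtrip_q = quote(decoded_q, safe="*")
print(roundtrip_q == encoded_q)
```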

g1smd

1:23 am on Jul 31, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



>> It opens up more options for domain names <<

Domain names cannot contain an underscore.

Miamacs

12:50 pm on Jul 31, 2007 (gmt 0)

10+ Year Member



Yeah but I saw a subdomain that did... I wonder why.

joelgreen

4:04 pm on Jul 31, 2007 (gmt 0)

10+ Year Member



What is the problem of treating both an underscore and a dash as word separators?

joelgreen

4:19 pm on Jul 31, 2007 (gmt 0)

10+ Year Member



Also, why did Google itself register the following domains with dashes rather than underscores? Did they change their mind?

google-hosted-services.com
google-hostedservices.com
google-service-hosting.com

annej

5:09 pm on Jul 31, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It seems like early on we could register domains with underscores. Am I just imagining that?

g1smd

6:55 pm on Jul 31, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I have never seen underscores in domain names.

joelgreen

8:26 am on Aug 1, 2007 (gmt 0)

10+ Year Member



My fault. Underscores are not allowed in domain names.

Domain Names as supported in the Domain Name System must be less than 63 characters in total length, begin and end with a printable character, and can contain only letters, numbers and the hyphen character (the hyphen '-' must be in the middle somewhere). Underscores are not valid.

But is it good to have only hyphens allowed as word separators in the domain name, and underscores as word separators in the remaining part of the URL? Inconsistency.
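
The rule quoted above can be sketched as a quick check. This is a rough illustration, not a full validator: it assumes the usual DNS reading in which the 63-character limit applies to each dot-separated label and the whole name is capped at 253 characters, and the function name is made up for the example.

```python
import re

# One label: letters, digits and hyphens only, 1 to 63 characters,
# and the hyphen may not be the first or last character.
LABEL_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")

def looks_like_valid_hostname(name):
    # Assumption: overall length limit of 253 characters.
    if not name or len(name) > 253:
        return False
    # Every dot-separated label must match the rule in full.
    return all(LABEL_RE.fullmatch(label) for label in name.split("."))

print(looks_like_valid_hostname("google-hosted-services.com"))
print(looks_like_valid_hostname("google_hosted_services.com"))
print(looks_like_valid_hostname("-google.com"))
```

Only the first of the three names passes: the second contains underscores and the third starts a label with a hyphen.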

AlchemyV

11:43 am on Aug 1, 2007 (gmt 0)

10+ Year Member



I just think about what my users are likely to type in when they search.

AlchemyV

11:47 am on Aug 1, 2007 (gmt 0)

10+ Year Member



Thinking about it, I have always used underscores for page filenames as they read better than dashes imho. If it's spam then I don't think it really matters, as spammers will use anything. But it's great news for business SEOs like myself, and for spammers producing their numerous sites, as they may get more credit.

pageoneresults

12:42 pm on Aug 3, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It appears that all four of the majors now treat, or have been treating, underscores as word separators.

Google
Yahoo!
MSN
Ask

Miamacs

1:40 pm on Aug 3, 2007 (gmt 0)

10+ Year Member



I still don't see them treating underscores as word separators in URLs though.

...

Quadrille

3:03 pm on Aug 3, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It's an interesting development, but as suggested above, one that will cause confusion once the blue link underline hides the underscore and makes it look the same as a space (the dreaded '%20') in a URL.

But, fact is, in SEO terms, it really doesn't matter. As one of a couple of hundred factors, it counts for too little to matter in most cases.

This was neatly demonstrated by tedster's trial - he saw no appreciable benefit for either form, even when the underscore was deprecated. That doesn't mean it was secretly working, it means it really doesn't matter. In most cases, at least.

Besides working as well as anything for SEO purposes, the hyphen is still better for avoiding visitor confusion, which helps visitors type in URLs accurately.

Miamacs

5:08 pm on Aug 3, 2007 (gmt 0)

10+ Year Member



Yeah, I don't think anyone here would want to switch back to using underscores, it's just that I'm cheering for those older non SEO'd URLs that I left untouched for years.

...

The ever entertaining slug race.

tedster

7:45 am on Aug 10, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Matt just blogged a clarification about dashes and underscores - looks like the reporting was not quite accurate:

If you read Stephan Spencer’s write-up, he says that underscores are the same as dashes to Google now, and I didn’t quite say that in the talk. I said that we had someone looking at that now. So I wouldn’t consider it a completely done deal at this point. But note that I also said if you’d already made your site with underscores, it probably wasn’t worth trying to migrate all your urls over to dashes. If you’re starting fresh, I’d still pick dashes.

[mattcutts.com...]

< discussion continues here: [webmasterworld.com...] >
