The main use of underscores today is as a word separator in filenames, email addresses, variables in programming, and everywhere else where, for some reason, a space is either not possible or could at least cause problems.
In contrast, the dash is used to join words, not to separate them.
Why search engines do it the other way round has always been beyond my understanding.
So, for example, there is a difference between these two URLs for a picture:
One would show an "american football player", the other an "american-football player".
However, Google does it the opposite way from what is intended, using the underscore to join words and the dash to separate them. That does not make sense to me.
"looks like the reporting was not quite accurate:"
I read it over on SEL and I believed it out of sheer laziness. I suppose lack of time isn't good for keeping up to date with SEO.
It seems to me that SEs were being logical. They originally followed writing standards and recognized hyphens (dashes), commas, semicolons, etc., as punctuation and not part of a word. But an underscore was not standard punctuation and therefore was considered part of the word.
Google may be rectifying this, but who knows about all the other SEs? It's still best to stick with dashes.
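To illustrate the "punctuation vs. part of the word" point: many tokenizers behave exactly this way by default, because the regex word class \w covers letters, digits and the underscore but not the hyphen. This is only a minimal sketch of that behaviour in Python; the tokenize function is a made-up name, and nothing here is claimed to be what any search engine actually runs.

```python
import re

def tokenize(text):
    """Split text into word tokens using the regex word class.

    \\w matches letters, digits and the underscore, so "_" stays
    inside a token while "-" (like other punctuation) ends it.
    Hypothetical helper for illustration only.
    """
    return re.findall(r"\w+", text.lower())

print(tokenize("red-car"))  # ['red', 'car']
print(tokenize("red_car"))  # ['red_car']
```

With a rule like this, a search for "car" can match red-car but not red_car, which is the behaviour being complained about in this thread.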
There are so many web applications now that generate pages with underscores in the URL that it seems absolutely crazy not to recognise underscores as word separation marks.
Did everyone here try allinanchor: before messing up their sites? Or any kind of search that'd be unique to their URLs?
The report wasn't inaccurate; it said exactly what it meant to say: that Google is looking into the possibility of underscores being a word separator. Hearing it, I felt somewhat sad that I could be moved by a promise like that (all the hobby sites we do with friends, all the sites I really like are using them), and that was the first thing I did: went over to Google and checked allinanchor: ...
And checked again after 4 days.
Then again after two weeks or so.
Nope, still not a word separator.
At least not in URLs it ain't.
Completely out of context, a site I saw (gathering intelligence) with major accessibility and dupe content problems, AND in the -950 zone, redirected its perfectly OK most important page to a URL with an underscore the NEXT DAY after the reports came in. Of course the old URL fell out of the index, and the new one never got in.
SEOs, Webmasters, the message of the day is:
Don't rush things, tread carefully...
News isn't there to tell you what you should do.
News is there to confirm what you've been seeing.
|I read it over on SEL and I believed it out of sheer laziness. I suppose lack of time isn't good for keeping up to date with SEO. |
First of all, this story was from News.com. I covered it at SEL, citing News.com, and then emailed Matt Cutts for more information.
I was not lazy about this at all.
Even if G are getting to grips with '_' as a separator, that does not remove the real objection - '_' in a link is invisible if there's a blue underline in place.
Using '_' may be becoming acceptable; that doesn't mean it's necessarily sensible, unless you have a good 404 page ;)
It is about time that underscores are treated approximately the same as dashes, even if they are still not considered to be equal. This treatment has already shown its return on my site. Traffic has increased twofold.
Jecasc I agree 100%. I never used dashes as separators because some words are hyphenated in the English language... not underscore joined.
But the underscore is NOT used in the English language as EITHER a separator OR a joiner, so that argument is not useful; logic suggests you should be calling for a space to be recognised.
Having seen %20 once too often, I doubt you'll get much support. M$ desktop programs allow that (even encourage it), and it drives me to drink.
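For anyone who hasn't run into it, %20 is simply what a space turns into once a URL is percent-encoded. A quick sketch using quote from Python's standard urllib.parse module; the file name is invented for the example.

```python
from urllib.parse import quote

# Hypothetical file name containing a space
filename = "red car.html"

# quote() percent-encodes characters that are not safe in a URL path;
# the space becomes %20, while "_" and "-" pass through unchanged
encoded = quote(filename)
print(encoded)                 # red%20car.html
print(quote("red_car.html"))   # red_car.html
print(quote("red-car.html"))   # red-car.html
```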
The underscore really has no place in the English language - or on a screen - except as, er, an underscore; ie a method of underlining words left over from the days of the typewriter.
Underlining, on the web, by convention means a link. Again, I doubt you'll find much support in calling for that to change; waaay too late.
"Underlining" and underscore are NOT the same thing. This-is-not-sensible, but_this_is. It should be obvious why.
Google is not doing this, because Google is incompetent and because of technical legacy stuff (and their laziness and/or unwillingness to take care of problems).
Google is *well known* for ignoring Web standards, using "headers" made out of <font> and <b> instead of <hx> on their own pages (please! are we supposed to think they know anything about SEO if they can't even use real headers themselves?!), aggressively caching everything to the point where it becomes unreliable and confusing (AdSense, YouTube, their index, PR, etc.), and just generally not having a clue about anything but tricking people into thinking they "do no evil" (nothing could be further from the actual truth).
Just a "small" thing like their Copyright typo on Google.com (the (C) character is stuck to the year) amazes me. They really have no interest in doing anything other than making more money. They just don't care. This is why they want these stupid dashes.
Google is the new Microsoft.
"I was not lazy about this at all. "
I called myself lazy for believing what I read without bothering to check the SERPs, not you.
But frankly, your post title "It's Not Just Google That Treats Underscores Like Dashes" already assumes Google treats underscores like dashes. In the same article, you also write "Now Google treats underscores the same way as hyphens." Compelling, but those statements are false.
Don't get me wrong, I respect your reporting - which is why I want to be able to trust everything you write.
[edited by: Halfdeck at 4:31 pm (utc) on Aug. 10, 2007]
|This-is-not-sensible, but_this_is. It should be obvious why. |
Do share ...
Yup, they were wrong, but there was no reason why I should not believe a source like News.com.
In any event, I am glad Matt clarified and I added corrections to the articles.
--The underscore really has no place in the English language--
No one's calling for underscores to be used in everyday language, just in internet addresses and other situations where spaces aren't easily available.
Ideally we could use spaces instead, but that just isn't going to happen, so the nearest human-readable substitute is the underscore.
If someone says "red_car" in an address, they clearly intend that as "red car" and not "redcar". If Google is interested in accuracy, they'd treat underscores as spaces. Apparently they're not interested in accuracy, however.
It's not a Google thing, it's everyone on the web, and this was an issue before Google was invented.
Underscores are not, never have been and never will be a sensible way to separate words, especially in URLs.
For what it's worth, spaces are even worse (if that's possible). Hyphens are not ideal, but are a common sense compromise, as the English language alternative would be a space.
For obvious reasons. :) In fact it's all so obvious really, isn't it?
And that's my last word, you'll be pleased to hear ;)
My main issue as a developer/code junkie with the dashes is that a dash in a programming language is almost always a math operator.
So when I see example1-example2 I read it as a variable named example2 being subtracted from example1.
A dash really isn't a delimiter in my opinion. Not when compared to the underscore especially. To me it is funny that a bunch of programmers decided that a math operator was a good text delimiter. Must make for some interesting code in places.
That being said I don't like the underscore as a delimiter in domain names... but for file names... the dash is just wrong. You bring that file name into a programming name space and it will start trying to do math operations on names.
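The "dash is a math operator" point is easy to check in practice. A minimal sketch in Python, where describe is a made-up helper: it asks the language's own parser how it reads a name, and it is only meant for simple inputs like the ones shown (anything that isn't valid Python syntax would raise a SyntaxError).

```python
import ast

def describe(name):
    """Report how Python's parser reads a would-be name.

    Hypothetical helper for illustration only.
    """
    # A string made of letters, digits and underscores is a valid identifier
    if name.isidentifier():
        return "identifier"
    # Otherwise parse it as an expression and look at the top-level node
    node = ast.parse(name, mode="eval").body
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Sub):
        return "subtraction"
    return "other expression"

print(describe("red_car"))            # identifier
print(describe("red-car"))            # subtraction
print(describe("example1-example2"))  # subtraction
```

Other languages differ in the details, but most C-style languages treat the two characters the same way.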
[edited by: Demaestro at 7:30 pm (utc) on Aug. 10, 2007]
Don't separators date back to the cheesy usenet alternative to italics?
It's 2007 and I can't hardly believe so basic an issue (that potentially affects 99.99% of websites) hasn't been sorted out yet (once and for all).
I never use hyphens or underscores for separation unless I have to and that's about never. I assume Google's engine has enough experience by now to parse joinedwords.
(It always seems to put searched words in bold for both the page title and URL, so I take that as a clue it's on the ball and doesn't get confused easily.)
The bolding we see is a simple character match routine, applied as a last step in creating the SERP's display. It's not an indicator that the algorithm recognized and used the bolded letters as a ranking element.
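To show how little "intelligence" that kind of bolding needs, here is a rough sketch of a plain character match routine in Python. It is purely an assumption about the general technique, not Google's code; bold_matches and the example.com URL are invented for the illustration.

```python
import re

def bold_matches(text, query):
    """Wrap every occurrence of each query word in <b> tags.

    Pure substring matching, ignoring case. No notion of word
    boundaries or ranking is involved. Hypothetical helper.
    """
    words = [re.escape(w) for w in query.split()]
    if not words:
        return text
    pattern = re.compile("|".join(words), re.IGNORECASE)
    return pattern.sub(lambda m: "<b>" + m.group(0) + "</b>", text)

print(bold_matches("example.com/redcar.html", "red car"))
# example.com/<b>red</b><b>car</b>.html
```

The joined URL gets both words bolded even though nothing here has "parsed" redcar into two words.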
Underscores like red_car remind me too much of PHP variables. $this->red_car->damaged_fender->repair_cost = 400;
During the early DOS era the underscore rose to prominence as it was commonly used as a word separator in file naming. There was a group of keyboard characters that could not be used. Not sure if my memory is accurate on this but I seem to recall that the hyphen was one that could not be used in DOS file naming?
Given the amount of documentation still in existence that dates back to DOS origins, it would not be a surprise if the search engines treated underscores in file names as they were originally intended.... as word separators.
Hyphens on the other hand link two associated words together (often unnecessarily)... the opposite of being a separator.
When it comes to establishing the theme and relevance of a web page, it's logical to associate widgets_red and widgets-red as both being about red widgets. Both should be included in the results for a matching search.
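What is being asked for here amounts to treating both characters as the same kind of separator. A minimal sketch of that idea in Python; url_words is a hypothetical name, and this only shows how the two file names could be normalised to the same word list, not how any search engine does it.

```python
import re

def url_words(slug):
    """Split a file name or URL slug on runs of hyphens or underscores.

    Hypothetical helper for illustration only.
    """
    return [w for w in re.split(r"[-_]+", slug.lower()) if w]

print(url_words("widgets_red"))  # ['widgets', 'red']
print(url_words("widgets-red"))  # ['widgets', 'red']
```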
jecasc - you are absolutely right! I am also appalled at the lunacy of whoever programs search engines. Perhaps the fact that programmers think the underscore JOINS words (in variable names) pre-determined their choice. These people shouldn't have been let out of the cage let alone be put in a position to decide the fate of all of the world's web sites..
N-e-t-m-e-g i-s r_i_g_h_t t-h-i-s i_s r-i-d-i-c-u-l-o-u-s.
1 w_o_r_d a-n-d m_i_l_l_i_o_n_s o-f c_u_t_l_e_t_s c-h-a-n-g-e p_r_o_c_e_d_u_r_e.
|I assume Google's engine has enough experience by now to parse joinedwords. |
You would be mistaken. Look at this thread: they still haven't figured out underscores between words. They're light years away from unjoining word clusters.
<tangent> which makes me wonder why it's so uncool in the webmaster community to use hyphenated domain names. maybe in 20 years time, all domains will be hyphenated, and we'll look back at our silly lotsofwordspackedtogether.com domains and laugh. </tangent>
I believe the underscore actually rose to popularity because of mailboxes. You could not use hyphens because of DOS non-compliance, but you could use underscores. This was the way that people could name mailboxes with full names such as John_Smith@mailbox.net. I think it spread thru the web this way.
Ok, I'll date myself here. Austtr is correct. The underscore (_) became the de facto standard for naming files in DOS and early Windows.
Hyphens (-) were allowed but it was uncool to name files with them because (in my opinion) they didn't separate/distinguish each of the words prominently enough, so the underscore meme won.
I recall adopting the underscore approach in the early 90's when I moved from the Midwest to Silicon Valley (maybe they don't call it that anymore). I felt like I had learned the *real* naming convention used by the real professionals, and how it beat the mishmash of dashes and scrunchy words.
I think Windows first started allowing files to be named with a space around 95, plus or minus. But even today, I use an underscore for most of my file names. (strong meme).
Does Google recognize the underscore? No.
Should Google recognize underscores? Should God save the polar bear?
[edited by: tedster at 9:15 pm (utc) on Aug. 11, 2007]