I was just wondering what the difference is between a hyphen and an underscore in a URL. Does "_" mean a space, while "-" joins the words into one? I ask because I have been checking the keyword density of some of my competitors, and I found that a URL that uses hyphens is reported as one word.
Ex. www.domain-name.com/main-key-word.html
In this case, the keyword density analyzer shows that there are 2 words in this URL and that 1 of them is the keyword (main-key-word). So I am confused about this and... what would happen if they used "main_key_word" instead? Anyone?
Does the page name *really* matter? I mean, does Google usually give it decent weight in its algo?
And I really feel that search engines should start considering underscores as spaces. Even respected directories like Dmoz and Yahoo are using underscores in their filenames.
Well, I have got around 3,000 pages indexed in Google which have underscores. Do you think I should change them to hyphens?
Also, what could be the reason that Google doesn't consider underscores as separators? There are so many sites out there with good content that use underscores, and if Google started treating underscores as separators in its allinurl calculation, that would really improve relevancy.
Any thoughts?
Page name matters. Don't know how much.
Quite a lot, in my case. My site title used to have the phrase "redwidget consultant blue widget consultant etc" and it ranked #1 for "redwidget consultant". Since I specialise in "blue widgets", I decided to rearrange the title to "blue widget consultant redwidget consultant" and some of the page content.
Instead of holding #1 and #2 spots for "redwidget consultant", I am now #1 for "blue widget consultant". While that should please me, it is more obscure than the other specialty, so my hits have dropped dramatically. People still find me with the other phrase, but I have dropped to #9 and #99 for that phrase, with the business site being #99 and #9 being my personal page.
There could be other factors, but I have immediately set about to optimise for the original title.
- Ash
(My very first posting to webmasterworld - blush :)
I'd like to throw in my two pennies' worth with regard to '-' vs '_'.
I don't think that it's a Google feature. Rather, the underscore is a normal letter by definition, while the hyphen is a token separator as far as text indexing is concerned.
It is just that people without a background in IT tend to see the underscore as a separator because it appears to be a 'funny' character to them.
The difference matters most if you use your keywords as a file name and a search engine like Google picks the URL up and adds it to its linkbase but, for some reason, does not parse the content. In that case the content does not make it into the actual index. That's what Google calls 'partially indexed'. You would only find the URLs of these pages in any SERPs.
If the file name is made up of more than one word, the words should be separated by anything other than underscores. That is the only chance anybody has of finding the page when searching for these words.
Kosta
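For anyone who wants to see the "underscore is a letter" point in practice, here is a minimal sketch. It assumes a tokenizer built on the regular-expression word class `\w`, which is a common convention in text indexing; it is not a description of how Google actually parses URLs, and the `tokenize` helper is made up for illustration.

```python
import re

# \w matches letters, digits and the underscore, so "_" never splits a token,
# while "-", "." and spaces all act as separators.
WORD = re.compile(r"\w+")

def tokenize(text):
    """Split text into lowercase tokens, treating '_' as part of a word."""
    return [token.lower() for token in WORD.findall(text)]

print(tokenize("main-key-word.html"))  # ['main', 'key', 'word', 'html']
print(tokenize("main_key_word.html"))  # ['main_key_word', 'html']
```

Under that assumption the hyphenated file name yields three searchable words, while the underscored one yields a single long token.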
Yes. A search for "two-dimensional" yields the same results as "two dimensional" (each with quotation marks). Therefore, Google seems to make no distinction between a hyphen and a space, or may even replace hyphens with spaces. (Searches for "two_dimensional" and "twodimensional", as well as two dimensional and two-dimensional without quotation marks, all give different results.)
You use %20 in file names? <fx>Shudder</fx>
Maybe it's just me, but I've always thought that spaces in filenames on the web are just asking for trouble. I guess there is no problem as long as you always encode the space as %20 yourself. But if you rely on the browser to replace spaces with %20, in long URL strings for example, things can easily come unstuck, especially with older browsers.
Simon.
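As a small illustration of encoding the space yourself rather than leaving it to the browser, here is a sketch using Python's standard `urllib.parse` module. The file name is a made-up example.

```python
from urllib.parse import quote, unquote

# Hypothetical file name containing spaces.
name = "word1 word2 word3.html"

# quote() percent-encodes the spaces, so the link never depends on
# the browser doing the replacement.
encoded = quote(name)
print(encoded)           # word1%20word2%20word3.html
print(unquote(encoded))  # word1 word2 word3.html
```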
"word1 word2 word3"
"word1_word2_word3"
"word1-word2-word3"
"word1%20word2%20word3"
all had different SERPS.
I used both a travel destination that I am interested in and the phrase "harley davidson motorcycle", with and without dashes and underscores. (I assume this search term is sufficiently generic to post without violating the TOS.)
The official motorcycle site was #1 for the results with spaces and with dashes. Using %20 resulted in zero matches, and HD was not in the top 10 using underscores. However, I did see a lot of ODP and similar sites in the SERPs, likely due to their use of underscores in their taxonomies.
GG can you please change the algo for google to treat "_" the same as a " " because I am too lazy to change all my pages. :)
[edited by: PatrickDeese at 8:34 pm (utc) on April 28, 2003]
A search for "two-dimensional" yields the same results as "two dimensional" (each with quotation marks).
But the same search without the quotation marks does not :o
Why? Because the hyphen is converted to a space in the phrase search but there are thousands of hyphenated words so it must be considered in a word search?
As pointed out, "-" in the URL matches a space in the search string while "_" doesn't, but personally I wouldn't bother going back and changing URLs to match. The 'keyword in URL' effect is small; the time would be better spent adding content.
That has always been my intellectual belief, but I have had the nagging feeling that "every bit helps", so should I not change the old filenames? Glad to hear that ciml thinks it is not worth the time!
To make it short and sweet, what you meant to say was that a search engine would never use an underscore as a separator. Is that what you meant to say?
Yes, you hit the mark Imaster.
Whether or not there is a distinction made between hyphens and spaces depends much more on the specific configuration of the index engine's tokenizer.
There is a choice to treat them the same, in which case blue-widgets would create only two tokens.
If the analyzer translates your search input of blue-widgets into "blue widgets", then the actual blue-widgets will be in the serps, otherwise not, because there will be no blue-widgets in the token list.
If, on the other hand the hyphen would be treated as a regular letter, only blue-widgets would become part of the index and therefore only blue-widgets could be found. (This is just theoretical, no indexer does it this way.)
Most of the more advanced engines allow a mixture of these.
Typically there would be three token entries:
blue-widgets
blue
widgets
Whether blue-widgets is in the SERPs when you actually searched for blue widgets or "blue widgets", and whether tokenization works differently or the same on URLs and full text, depends on the cleverness and philosophy of the implementors (and also very much on the hardware resources at their disposal).
Kosta
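The mixed approach with three token entries could be sketched roughly as follows. This is only an illustration under assumptions: the `index_tokens` function and its regular expression are hypothetical, and real engines will differ in what they count as a word.

```python
import re

# A "word" here is a run of letters or digits, optionally joined by hyphens.
COMPOUND = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")

def index_tokens(text):
    """Return index entries: each hyphenated compound plus its parts."""
    tokens = []
    for word in COMPOUND.findall(text.lower()):
        tokens.append(word)
        if "-" in word:
            # Also index the individual parts of the compound.
            tokens.extend(word.split("-"))
    return tokens

print(index_tokens("blue-widgets"))  # ['blue-widgets', 'blue', 'widgets']
print(index_tokens("blue widgets"))  # ['blue', 'widgets']
```

With entries like these, a page containing blue-widgets can be matched by a query for blue-widgets as well as by a query for blue and widgets separately, at the cost of a larger index.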
>well after vomiting, I cleaned myself up
Hope you feel better soon! ;)
Personally as a result of this thread I have started using hyphens instead of underscores in all new content. I've no intention of going back over old stuff, it's just not worth it, especially if you already get good SERPS on the existing pages. It'll probably take a couple of updates before we will see if there are any significant benefits.
Simon.
I am going to do the same: carry on with the underscores on existing pages, and use hyphens in future developments.
GoogleGuy - If you are listening, please can you get underscores implemented in the algo? There are quite a few good sites out there using underscores as the separator.
Thanks everyone who participated in the thread.
I also would agree that widgetmaker.com should be just as valuable in Google as widget-maker.com... assuming Google is smart enough to parse the words "widget" and "maker" out of the domain "widgetmaker".
In any case, I went back and changed all my urls and filenames. I only left redirects on a few of the most important pages. I'm sure I'll get a few 404 errors this month, although most people enter my site on the homepage.
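For anyone doing the same rename, a short script can build the old-to-new mapping that the redirects need. This is a rough sketch only: the paths are invented examples, the `hyphenate` helper is hypothetical, and how the mapping is turned into actual redirects depends on your server.

```python
def hyphenate(path):
    """Replace underscores with hyphens in the last path segment only."""
    head, sep, name = path.rpartition("/")
    return head + sep + name.replace("_", "-")

# Hypothetical list of existing page paths.
old_paths = ["/widgets/blue_widget_consultant.html", "/index.html"]

# Keep only the pages whose name actually changes.
redirects = {p: hyphenate(p) for p in old_paths if hyphenate(p) != p}
print(redirects)
# {'/widgets/blue_widget_consultant.html': '/widgets/blue-widget-consultant.html'}
```

Directory names are left alone here on purpose, so only the file names change.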
FrontPage (don't laugh) names new pages automatically, using the underscore character in the filename (URL) in place of any spaces in the page title.