One key development that Matt shared with the audience was that underscores in URLs are now (or at least very soon to be) treated as word separators by Google.
Underscores are now word separators, proclaims Google
[news.com.com...]
Having said that, I've had the impression for some time now that Google already understands this on some levels. Same with run-together words (like WebmasterWorld). I'm not sure if it's related to anchor text or word density or what, but it's a feeling I get.
Koan, MSN and Yahoo are totally different from each other and from Google; they all weight the hundreds of factors differently. However, the term "google bombing" (getting lots of inbound links with a particular phrase so that the destination page ranks for it, even when the page itself features none of those terms) may back up my point. That was just my half-hearted joke in that respect. As for MSN & Yahoo looking at on-page factors more, they have to, because at the moment they are unable to create an index deep enough and fresh enough to develop good link maps. That is the reason they are behind in the short term.
[edited by: tedster at 12:28 am (utc) on July 26, 2007]
And now if they can use actual good signals to rank web pages then we might be onto something - oh I don't know something like looking at the content of the page vs links.
They already are (and have been for a long time).
Underscores are not universally recognised as spaces by all search engines - only Google has come out and said this is in effect a new development. Use A-Z, 0-9 and hyphens just like a domain name - it is safe, and also gives you maximum exposure.
[edited by: Swanson at 1:22 am (utc) on July 26, 2007]
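To illustrate the "A-Z, 0-9 and hyphens" advice above, here is a minimal sketch of a filename generator. The function name `make_slug` is hypothetical, and lowercasing the result is my own assumption rather than something stated in the thread.

```python
import re

def make_slug(title):
    """Keep only runs of letters and digits, joined by single hyphens.

    Lowercasing is an assumption made for consistency; the advice above
    only says to stick to A-Z, 0-9 and hyphens.
    """
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)

print(make_slug("One-Eyed Cats"))
print(make_slug("one_eyed_cats"))
```

Spaces, underscores and punctuation all collapse to a single hyphen, so both calls produce the same filename stem.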
I was lucky enough to attend this same conference (WordCamp 2007), and just as an FYI: while Matt did indeed say that underscores would soon be accepted as word separators, he also prefaced this entire segment of his presentation (which was on White Hat SEO) by stating that dashes were the best method for those who wanted to do well in Google's search.
I just thought that should be clarified, as it wasn't at all obvious from reading the article referenced above.
Multiple hyphens and long urls etc. are butt ugly. G should whack every url longer than c. three words, w/ or without hyphens!
Nice that WW doesn't have all the spammy urls like all the ugly blogs. If you want 15 words in your url, why stop there and not go for 100?
p/g
Honestly I'd be happy if file names didn't matter. It's content that matters to viewers .. not what you name your page or your choice of directory structure.
I think filenames do have a role to play in the relevance game. If I write an article with the filename "one-eyed-cats.htm," the odds are pretty good that the article is about one-eyed cats. That doesn't mean it deserves the #1 SERP position for "one-eyed cats," but it ought to be listed somewhere in the results for a search on that keyphrase.
I agree, but I don't think there should be that much difference between "one-eyed-cats.htm" and "one_eyed_cats.htm". Both articles should obviously be about one-eyed cats. I've always used underscores and will continue to use them. I don't see this as a big deal.
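For anyone wondering what "word separator" means in practice, here is a toy sketch. It is not Google's algorithm; `filename_words` and its `separators` parameter are hypothetical names made up for illustration. It only shows how the same filename splits into words depending on which characters count as separators.

```python
import re

def filename_words(filename, separators):
    """Split a filename stem into words on the given separator characters.

    Toy model only: a real search engine's tokenizer is not public.
    """
    stem = filename.rsplit(".", 1)[0]          # drop the extension
    pattern = "[" + re.escape(separators) + "]+"
    return [w for w in re.split(pattern, stem) if w]

# Hyphen is the only separator: the underscored name stays one token.
print(filename_words("one-eyed-cats.htm", "-"))
print(filename_words("one_eyed_cats.htm", "-"))

# Hyphen and underscore are both separators: the two names split alike.
print(filename_words("one_eyed_cats.htm", "-_"))
```

Under the first rule "one_eyed_cats" is a single token, while under the second it yields the same three words as the hyphenated version.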
On another site, we continually build biography pages for important figures in that profession. Armed with my previous experiment, I did not go back and change the legacy urls from first_last.htm, but I did begin to name new bios as first-last.htm. Again, there was no stand-out winner here, and it's now several years down the road. Both types of pages are still performing very well, and sometimes an underscore page ranks above the professional's own dedicated website. There are several hundred of these bio pages on the site right now, so I'd say I'm looking at a significant amount of data.
My old-school way of thinking about the Google algo was like a scorecard. My black-box model was as if Google totalled up points for this and points for that -- and the most points would win. But I don't think the algo works that way any more. Google is getting very "neural" these days, working towards AI and fuzzy logic -- sometimes too fuzzy, perhaps.
A better analogy for keyword-in-url might be how well a site's signals let Google focus their lens, their "relevance lens". In this approach, I see keywords in the url just as a kind of reinforcing factor. They can confirm the sharpness of the focus, but I don't see those keywords as creating an independent plus. Instead, they are one potential reinforcement for what the rest of the algo determines about relevance.
It's even possible that a keyword-in-url that is off-topic for a long tail search might work against some rankings that the page used to get. It might be telling the algo, in essence, "I cannot confirm that relevance score from my angle - back it off just a bit."
Right.
...
I think you forgot an important aspect though.
A few million natural links to normal sites with copy pasted URLs as their anchor text... with /whatever_keywords_described_the_page(.html) for which they might get some additional recognition from the algo.
And thus stop ranking for "whatever_words_described_the_page" in exchange for the words and phrases themselves.
...
Keywords in URLs can account for some(times the majority) of your IBL anchor text so don't dismiss the idea just yet.
Although underscore is not yet a word separator, I just checked.
Also, there's no guarantee that Google would treat underscores any differently than it does now when it encounters them as text, when it encounters them as anchor text, and when it sees that they're in fact a copy-pasted URL.
...
[edited by: Miamacs at 11:26 am (utc) on July 26, 2007]
I agree but I don't think there should be that much difference in "one-eyed-cats.htm" and "one_eyed_cats.htm".
I don't think there's much difference in practical terms. (I've got some underscored_filename_pages that rank #1 for extremely competitive keyphrases, so I'm inclined to believe that other factors--or the combination of other factors--have been more important than the word separator in the filenames.)