Comma as url word separator?

     
10:12 pm on Dec 14, 2010 (gmt 0)

5+ Year Member



Hi guys, I am seeing this more and more...

A URL like this: site/cat/key1,key2

How is "," treated by Google?
12:32 am on Dec 15, 2010 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



I would guess it is treated the same as hyphen or period.

As ever, always avoid spaces and underscores in URLs.
12:52 pm on Dec 15, 2010 (gmt 0)



I would avoid using it, as there will be nothing but potential issues and no benefits
1:56 pm on Dec 15, 2010 (gmt 0)

WebmasterWorld Administrator goodroi is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



welcome to webmasterworld HeadlessChicken!

listen to the headless chicken (something i never thought i would say :))

keep things simple and avoid making changes that don't bring any benefit.
3:49 pm on Dec 15, 2010 (gmt 0)

WebmasterWorld Senior Member lame_wolf is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



> As ever, always avoid spaces and underscores in URLs

Nothing wrong with underscores. I have a number of #1 positions with underscores.
4:46 pm on Dec 15, 2010 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



It's not that underscores are "wrong", but they are a weaker choice. Google did improve their ability to find keywords separated by underscores in recent times, and the power of having a keyword in the file path also seems to be lower these days.

The challenge is that search engines need to recognize the underscore as an actual text character. In fact a search for _ returns over 2 billion results, whereas a search for - returns zero results.
5:24 pm on Dec 15, 2010 (gmt 0)

5+ Year Member




> and the power of having a keyword in the file path also seems to be lower these days

I don't think so... when I see a SERP showing the same website in the first 3-4 results, 90% of the time those pages have the keyword in the URL.

I am also seeing websites again opening a subdomain just for a keyword (key1.website.com, key2.website.com, etc.)
5:55 pm on Dec 15, 2010 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



I'm only talking about the file-path part of the URL, not the domain name or subdomain.
8:56 pm on Dec 15, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I used commas extensively back in 2004 on a few of my websites. I've had no problems with them whether they are /key1,key2/ or /key1,key2.html

Those pages still rank, but they are now six years old and have backlinks.

If you are looking to find out about SEO-worth of commas, there is none. Hyphens themselves do not give you an 'edge', they just reduce your problems. Commas can produce problems so should be avoided.
10:08 pm on Dec 15, 2010 (gmt 0)

5+ Year Member



True, but I need both hyphen and comma. If I only needed one separator I would have used hyphens for sure.
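
For anyone in the same position, here is a minimal Python sketch of how a two-level scheme might be parsed, assuming commas separate keywords and hyphens separate the words inside a keyword. The function name split_keywords and the sample segment are hypothetical, not taken from any real site.

```python
def split_keywords(segment):
    """Split one URL path segment into keywords (comma) and words (hyphen)."""
    # Drop empty pieces so a stray trailing comma does not create a blank keyword.
    keywords = [k for k in segment.split(",") if k]
    return [k.split("-") for k in keywords]

print(split_keywords("red-widget,blue-widget"))
```

With only one separator there would be no way to tell where one keyword ends and the next begins, which is presumably why a second character is wanted here.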
10:12 pm on Dec 15, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Never heard of someone "needing" commas. Can you explain further? This could be a PHP/MYSQL question rather than an SEO one. ;)
10:32 pm on Dec 15, 2010 (gmt 0)

5+ Year Member



Take a look at the file name of a Google Analytics report. What do you see? I see hyphens, underscores, and parentheses in our reports. ymmv
10:33 pm on Dec 15, 2010 (gmt 0)

WebmasterWorld Senior Member jab_creations is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Thus, only alphanumerics, the special characters "$-_.+!*'(),", and reserved characters used for their reserved purposes may be used unencoded within a URL.


http://www.faqs.org/rfcs/rfc1738.html [faqs.org]
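
A quick way to see how a given separator fares is Python's standard urllib.parse.quote. By default it leaves only letters, digits and "_.-~" (plus "/") untouched, which is stricter than the RFC 1738 list quoted above. Passing the RFC's special characters as the safe argument is an assumption made here purely for illustration; encode_segment is a made-up helper name.

```python
from urllib.parse import quote

# The unencoded "special characters" listed in the RFC 1738 quote above.
RFC1738_SPECIALS = "$-_.+!*'(),"

def encode_segment(segment, safe=RFC1738_SPECIALS):
    """Percent-encode one path segment, leaving the `safe` characters alone."""
    return quote(segment, safe=safe)

print(encode_segment("key1,key2"))           # comma is kept as-is
print(encode_segment("key1 key2"))           # space becomes %20
print(encode_segment("key1,key2", safe=""))  # comma becomes %2C
```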

Personally I stick with ampersands as separators and I try to keep my URLs as clean and alphanumeric as possible, with the exception of hyphens.

- John
12:46 am on Dec 16, 2010 (gmt 0)

WebmasterWorld Senior Member lame_wolf is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



> It's not that underscores are "wrong", but they are a weaker choice. Google did improve their ability to find keywords separated by underscores in recent times, and the power of having a keyword in the file path also seems to be lower these days.

> The challenge is that search engines need to recognize the underscore as an actual text character. In fact a search for _ returns over 2 billion results, whereas a search for - returns zero results.


But that is not a reason to say "As ever, always avoid spaces and underscores in URLs". It gives the impression that it is a confirmed no-no. This is the kind of thing that starts FUD.

As for using a comma, I would personally avoid using them. I have no data to confirm if it is a good or a bad thing, but wouldn't use them. I've no need.

And as for a weaker choice... how can I improve on position #1 while using underscores?
5:16 am on Dec 16, 2010 (gmt 0)

5+ Year Member



I prefer to use underscores: they look cleaner and are the only character that is never used as part of any word or sentence, so there is no ambiguity. I don't think that it's a weaker choice anymore; too many big content sites use it.
5:51 am on Dec 16, 2010 (gmt 0)

10+ Year Member



I used underscores way back and sometimes wish I hadn't. They create a challenge when repeating a URL verbally, and they slow you down when typing it.

Commas sound like just another confusing thing to avoid. Not clear visually or verbally again.

I think keywords still have value in the URL path like the name of a directory or filename. Adds more relevance when a user sees it in the results or link as well (as long as it isn't overboard of course).
6:15 am on Dec 16, 2010 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



There are many technical "words" that use underscores, both as a first character and as an internal character. It is this kind of usage that requires Google to index an underscore as an actual character rather than a mere separator. So there is plenty of ambiguity around that particular glyph.
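
The same distinction shows up in ordinary regular expressions, where the word class counts the underscore as a word character. The Python sketch below only illustrates that kind of tokenising; it is not a claim about how Google actually splits URLs, and tokens is a made-up helper name.

```python
import re

def tokens(text):
    """Return runs of word characters (letters, digits and underscore)."""
    return re.findall(r"\w+", text)

print(tokens("mod_actions"))  # ['mod_actions']
print(tokens("mod-actions"))  # ['mod', 'actions']
print(tokens("key1,key2"))    # ['key1', 'key2']
```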
8:27 am on Dec 16, 2010 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



The problem with spaces is that they must be encoded to %20 and that makes%20URLs%20using%20spaces%20unreadable. It also presents coding challenges when using URL rewriting.

One problem with underscores is that the_underscore_disappears_in_underlined_links, which often leads to verbal miscommunication of URLs.
8:59 am on Dec 16, 2010 (gmt 0)

10+ Year Member



I prefer breaking the final keywords into subdirs, even if it means there will only be 1 page in that dir.
9:37 am on Dec 16, 2010 (gmt 0)

WebmasterWorld Senior Member topr8 is a WebmasterWorld Top Contributor of All Time 10+ Year Member



>> In fact a search for _ returns over 2 billion results, whereas a search for - returns zero results.

well a search for , returns zero results too.
10:09 am on Dec 16, 2010 (gmt 0)

WebmasterWorld Senior Member lame_wolf is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



> One problem with underscores is that the_underscore_disappears_in_underlined_links, which often leads to verbal miscommunication of URLs.

True, but... only if the web developer uses underlined links. But from a SERP POV, I haven't found it a problem.

I used underscores years ago - long before I knew about SEO. I wasn't going to alter them because the pages were already established. I had a mix of URLs using either hyphens or underscores.

Now, if I was making that site today, I would use hyphens over underscores just because they look better. I would also use all lowercase, unlike in the past.
10:37 am on Dec 16, 2010 (gmt 0)

5+ Year Member



I sometimes use "+" as a separator. It's pretty clean as it decodes to a space. Just got handed a project that uses commas as separators. It doesn't look pretty, I think, but it's probably the least of the issues with that site.
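
One caveat on "+": it decodes to a space only under form-style (query string) decoding, while plain percent-decoding leaves it alone. A small Python sketch with the standard urllib.parse functions shows the difference; the sample string is made up.

```python
from urllib.parse import unquote, unquote_plus

segment = "red+widget%20sale"

print(unquote(segment))       # red+widget sale  ("+" is left alone)
print(unquote_plus(segment))  # red widget sale  ("+" is treated as a space)
```

Which of the two a given site applies to its URL path depends on how that site's own code decodes it.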
2:36 pm on Dec 16, 2010 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



> But it is not a reason why someone should "As ever, always avoid spaces and underscores in URLs" it gives an impression that it is a confirmed no-no. This is the kind of thing that starts FUD.

That should be seen as a "rule of thumb," not as a "universal law" -- but certainly not as FUD.

When using non-alphanumeric characters in URL-paths, several factors must be considered:

  • Technical requirements: Observe the character-usage restrictions of the HTTP protocol -- See RFC2396 - Uniform Resource Identifiers (URI): Generic Syntax [faqs.org], which specifies which characters can appear in each part of a URL, and which must be URL-encoded if they are to appear in part of a URL. Note that the requirements differ for each part of a URL - for example, the query string character-set is somewhat less restricted than that of the URL-path-part.

  • Search Term Matching and word separators: Observe that some characters, most notably the underscore, are treated differently when search engines are matching user search terms to indexed pages. Some are considered as word separators, while others are not. Google's old policy was that an underscore was treated as a word-letter (as in the PERL "\w" regular-expressions token), and searches for strings containing underscores would return only results for URLs, titles, and page-content actually containing underscores. The underscore was not considered as a word separator. Recently, Google has "relaxed" this policy a little -- but compare searches for "mod actions" [google.com] which return first-page results focused on forum moderator actions such as editing posts, to the results of searches for "mod_actions" [google.com] where the first-page results focus solely on the Apache module of that name. Disclaimer: U.S. google.com search results, not logged in.

  • Link Transcription and verbal citation: Underscores can "hide" under the underlining of HTML links (as in the second of the two links cited directly above) and therefore appear as spaces. While you may have a policy to use only non-underlined links on your own site, consider that other sites linking to your pages may have no such policy. As noted above, this makes link transcription unreliable, as the person writing down or speaking the link may see and read that underscore as a space if the link is underlined.

  • Link Presentation: Characters which are required to be encoded (RFC 2396) look pretty awful in search results, and again are very difficult to read and/or to 'speak' correctly. While I acknowledge the counter-example of Google Analytics links above, do bear in mind that Google is probably NOT making any effort to get those Analytics pages well-ranked in search results... their own or any others.

    These last two items should also be carefully considered in the light of any possibility that your site might attract a media citation -- It would do you very little good to have your site mentioned on the news if the radio announcer or TV graphic-overlay technician mis-read your URL. That would constitute a rather considerable lost opportunity.

    So all-in-all, I have no problem advising anyone who will listen to use only lowercase letters, numbers, and hyphens in URL-keywords -- Call that FUD if you like, but consider the HTTP protocol requirements, search term matching, link transcription, and link presentation, too. In that light, it is simply a very good rule of thumb.
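
As a rough illustration of that rule of thumb, here is a minimal Python sketch that reduces a title to lowercase letters, numbers, and hyphens. The function name slugify and the sample title are hypothetical, and the sketch simply drops anything outside a-z and 0-9 (including accented letters), which may not suit every site.

```python
import re

def slugify(text):
    """Lowercase `text` and join its ASCII alphanumeric runs with single hyphens."""
    # Spaces, underscores, commas and any other punctuation act as separators.
    words = re.findall(r"[a-z0-9]+", text.lower())
    return "-".join(words)

print(slugify("Red Widget_Keyword, 2010"))  # red-widget-keyword-2010
```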

    Jim
    3:15 pm on Dec 16, 2010 (gmt 0)

    WebmasterWorld Senior Member lame_wolf is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



    > but certainly not as FUD.

    I said this is how FUD starts - i.e. someone stating something as fact, when results prove the opposite.

    > It would do you very little good to have your site mentioned on the news if the radio announcer or TV graphic-overlay technician mis-read your URL. That would constitute a rather considerable lost opportunity.

    My site has been mentioned in the news, papers, etc. No problem yet :)

    What I have seen as a problem is when a site has linked to a deep page, and because some of my URLs are www/example.com/Page.html their program converts it to www/example.com/page.html and because I am on an Apache server, it will create a 403 error.

    But I have not had any problems with caching, SERPS or otherwise with underscores. Period.
    8:35 pm on Dec 16, 2010 (gmt 0)

    WebmasterWorld Senior Member demaestro is a WebmasterWorld Top Contributor of All Time 10+ Year Member



    > The challenge is that search engines need to recognize the underscore as an actual text character. In fact a search for _ returns over 2 billion results, whereas a search for - returns zero results.


    I would suspect that has more to do with - "minus sign" being a built in "search operator" indicating that you wish to omit results that follow the char.....

    "return results for this string" - "omit results with this string"

    Similarly searching for + also returns no results. I suspect for the same reason.
    8:58 pm on Dec 16, 2010 (gmt 0)

    WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



    I was talking about extracting relevance cues from the file path of the URL, not from the query term itself.

    So I don't see the limitation to machine processing from the fact that a dash character can be an advanced search operator. There are many technical words that begin with an underscore, but none that begin with a dash. That means if the dash character is the first character in the word, then it's a Boolean search operator and not part of the query term.

    It is probably a bit challenging to know how to deal with an internal dash in the query term itself. Sometimes a dash is used in a compound word (e.g. "well-respected"), and Google seems quite willing to treat that dash as a space. Sometimes a dash is actually part of a single word (e.g. "re-invented") and Google just drops that and concatenates the two parts, with no space inserted.
    12:49 am on Dec 17, 2010 (gmt 0)

    WebmasterWorld Senior Member lame_wolf is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



    > Sometimes a dash is actually part of a single word (e.g. "re-invented") and Google just drops that and concatenates the two parts, with no space inserted.


    You would think so, but I've just tried it with a word that I know is sometimes hyphenated, sometimes not, and sometimes written with a space.

    "Keyword" 1,030,000 results
    "Key-word" 1,620,000 results
    "key word" 1,730,000 results
    3:33 am on Dec 18, 2010 (gmt 0)

    10+ Year Member



    "Key-word" could easily be two different words, so the SE separating them makes sense.
    10:55 am on Dec 18, 2010 (gmt 0)

    WebmasterWorld Senior Member lame_wolf is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



    > "Key-word" could easily be two different words, so the SE separating them makes sense.

    In my case it isn't. I know because I have to target all three of them.

    Not a word that I target, but take "playground" for example. In my lifetime, "playground" has also been spelt as "play-ground" and "play ground"

    playground 35,700,000 results
    play-ground 87,700,000 results
    play ground 18,100,000 results
    1:58 am on Dec 19, 2010 (gmt 0)

    WebmasterWorld Senior Member 5+ Year Member



    I don't know about commas, but I rarely use any separators in urls. I do, however, use keywords jumbled up together and for the most part Google seems to understand this (on my sites anyway).

    For example; www.example.com/redwidgetkeyword.html
     
