
Google recognizing word parts?

filesharing -> file sharing?

     
3:13 pm on Feb 9, 2003 (gmt 0)

10+ Year Member



When visiting the German Google page (www.google.de) and searching for "filesharing", I was quite surprised to get results for both "filesharing" and "file sharing".

Has anybody seen something like this before? I didn't find an example on google.com. Is Google maybe testing a new algorithm?

Or have I missed something?

--
globay

4:06 pm on Feb 9, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If Google also returns results that use the two words "file sharing" in their title although you searched for "filesharing", that isn't an indicator of a new algo.

If you have some (or even just one) inbound link(s) with the anchor text "filesharing", although your title is "file sharing", that's enough to be found if someone searches for "filesharing" ...

6:41 pm on Feb 9, 2003 (gmt 0)

10+ Year Member



Yes, but on the Google result page, "filesharing" appears in bold as well as "file sharing"!

So I guess it's not only the inbound links.

--
globay

7:20 pm on Feb 9, 2003 (gmt 0)

WebmasterWorld Senior Member nffc is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Looks new to me. It may still be the links, but I haven't seen it before.

Take a look at this cached page
[google.de...]

I searched for applemac, it shows that term and apple mac.

11:29 pm on Feb 9, 2003 (gmt 0)

10+ Year Member



I've seen similar examples in my market niche... Google is splitting the words. Big change?
1:19 am on Feb 10, 2003 (gmt 0)

WebmasterWorld Senior Member rfgdxm1 is a WebmasterWorld Top Contributor of All Time 10+ Year Member



It does look like in some cases Google is splitting words now. My guess is that they are only doing this in special cases they have hard coded into the software. Thus, since basically always "filesharing" = "file sharing", they are treated as the same. However, rugrats won't be split into rug and rats.
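
A minimal sketch of that guess, assuming a hand-maintained lookup table. The table contents and the names `SPLIT_TABLE` and `expand_query` are hypothetical, and nothing here is confirmed to be how Google does it:

```python
# Hypothetical hand-maintained table of compounds that may be split.
# Only terms listed here are expanded; everything else is left alone.
SPLIT_TABLE = {
    "filesharing": ["file sharing"],
    "applemac": ["apple mac"],
}


def expand_query(query):
    """Return the query itself plus any hard-coded split variants."""
    key = query.lower()
    return [key] + SPLIT_TABLE.get(key, [])
```

With this table, `expand_query("filesharing")` also searches for "file sharing", while `expand_query("rugrats")` stays a single term because nobody listed it.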
1:24 am on Feb 10, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Could this be specific to Google.de, to allow for the German habit of creating words by stringing together other words? An example from the page would be Momentaufnahmen (snapshots).
1:48 am on Feb 10, 2003 (gmt 0)

WebmasterWorld Senior Member rfgdxm1 is a WebmasterWorld Top Contributor of All Time 10+ Year Member



I had the same thought, HarryM. From what others have told me, this sort of combining words is very common in the German language. Google may have made a special tweak for google.de to take this into account.
1:53 am on Feb 10, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Right... it doesn't seem to happen at google.com with "filesharing," (just get the "Did you mean: file sharing?" prompt) but clearly does at google.de.
7:19 am on Feb 10, 2003 (gmt 0)

WebmasterWorld Senior Member vitaplease is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Nice find. Google also highlights the split "file sharing" in its cache.

Strange though, computersystem also works as split-up, but computerfile does not.

9:47 am on Feb 10, 2003 (gmt 0)

10+ Year Member



I think they have been testing it for quite some time now. I heard about it a few weeks ago, but I couldn't verify it back then. I'll have to take a look at our logs to check when it started.

It appears that there is no general scheme behind it. A search for 'websitedesign' finds 'website design' and 'web site design'. A search for 'searchengineoptimization' finds nothing but that term. But a search for 'searchengine' also finds 'search engine'.

Generally, it appears that only terms which users actually search for are split. If you merge words that don't make sense together, like 'domake' or 'haveget', you get the "did you mean..." message, but the correct spelling does not show up in the SERP.

Since queries are split but not merged, it may have a major impact on SEO in German-speaking countries. Splitting terms certainly has some advantages, and having separate pages for both versions, as I have always done, is now rather unfavorable because the PR and anchor text of inbound links are split between two pages. Looks like a little work to do...

11:24 am on Feb 10, 2003 (gmt 0)

10+ Year Member



Yuck, I noticed a lot of German-speaking guys in here, and this certainly looks like a new challenge, but I think it's great, as some of the domains I have to optimize use keywordandkeyword.at and it would be great if Google managed to separate those!
5:26 pm on Feb 10, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I think google just splits words based on its spell check data. To follow the given examples:

- Filesharing is wrong - should be File Sharing
- Webdesign is wrong - should be Web design

If Google thinks (or knows) that a searched word is misspelled (although there are results), it probably does another search for the correct spelling and then adds these results to the "misspelled set".
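
Roughly what that would look like, as a sketch only. The toy `index` and `corrections` data and the function name `search_with_correction` are invented for illustration and say nothing about Google's real internals:

```python
def search_with_correction(query, index, corrections):
    """Return hits for the query, followed by hits for its corrected spelling."""
    results = list(index.get(query, []))
    corrected = corrections.get(query)
    if corrected is not None and corrected != query:
        for page in index.get(corrected, []):
            # Skip pages already found under the original spelling.
            if page not in results:
                results.append(page)
    return results


# Toy data, invented for illustration only.
index = {
    "filesharing": ["page-a", "page-b"],
    "file sharing": ["page-b", "page-c"],
}
corrections = {"filesharing": "file sharing"}
```

Because the correction table only maps the joined form to the split form, a search for "file sharing" is not widened to "filesharing", so the expansion only works in one direction.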

5:57 pm on Feb 10, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It seems to happen in Google.de only.

In Google.nl applemac stays applemac and filesharing stays filesharing, whereas in Google.de the separated terms are highlighted as well.

(The Dutch combine existing words to form new ones.)

6:00 pm on Feb 10, 2003 (gmt 0)

10+ Year Member



google will ask
"did you mean file sharing"

so maybe the suggestion has something to do with it...

6:43 pm on Feb 10, 2003 (gmt 0)

10+ Year Member



I agree with what rfgdxm1 said: it looks like Google hard coded these special cases where two words mean the same. Especially in the German language where "File-Sharing" is as correct as "Filesharing" this would improve search results.

But is this just the beginning of an advanced change in Google? Are they going to merge the results of similar words and their different ways of spelling, like "Optimization" and "Optimisation", since both ways are correct somewhere and certainly some people are confused and spell it wrong?

Is Google going to recognize the parts of a domain name that are not separated with a hyphen? You can't use a space and there is no spelling rule that tells you to set a hyphen instead (and many don't!). Is there a difference in quality between my-domain.com and mydomain.com? I don't think so. And isn't Google trying to list the results according to their relevance? Well, I think there are going to be more changes sooner or later. At least there is a lot of potential for improvement!

What do you think?
--
globay

6:45 pm on Feb 10, 2003 (gmt 0)

10+ Year Member



By the way: it is interesting that Google suggests "Optimization" when you spell it with s, but it does not suggest anything when you spell it with z ;-)
7:40 pm on Feb 10, 2003 (gmt 0)

10+ Year Member



This is something I've wondered about for a while.

Why can't Google derive at least some "word splits"
within a character string (such as a multiworddomainname.com)
from the link text part of a link that's visible to a viewer
or from text surrounding a link?

Maybe figure the ratio of occurrences of "filesharing" to
"file sharing" in the link text, and from that the probability
that filesharing is actually two words, not one. Seems an
algo along these lines would be fairly accurate, especially
when filesharing isn't a dictionary word and isn't encountered
that often, while "file" and "sharing" are both in
a dictionary and often appear in that order -- and when they
relate to the theme of the page.
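
A rough sketch of the ratio idea, under simplifying assumptions: anchor texts are compared as whole strings, the dictionary is a plain set of words, and the 0.5 threshold is arbitrary. The names `split_probability` and `should_split` are made up for this example.

```python
from collections import Counter


def split_probability(joined, split, anchor_texts):
    """Share of anchor texts using the split form, among those using either form."""
    counts = Counter(text.lower().strip() for text in anchor_texts)
    n_joined = counts[joined]
    n_split = counts[split]
    total = n_joined + n_split
    if total == 0:
        return 0.0
    return n_split / total


def should_split(joined, split, anchor_texts, dictionary, threshold=0.5):
    """Split only if the parts are dictionary words, the joined form is not,
    and the split form is common enough in the anchor texts."""
    parts = split.split()
    if "".join(parts) != joined:
        return False
    if joined in dictionary or not all(part in dictionary for part in parts):
        return False
    return split_probability(joined, split, anchor_texts) >= threshold
```

A real version would presumably also weigh surrounding text and page theme, which this sketch leaves out.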

If, within the link text, the same string of (non-space)
letters appears -- but with spaces between some letters --
that might indicate a high probability that a string of
characters like "filesharing" in a domain name should really be
"file sharing", if that's what is in at least some link text,
surrounding text, or some other on-page factor/occurrence of
"file sharing".

Just some random thoughts I had a while back when reading
the discussion here about, and looking at the links to,
emptywebsite.com as the #1 result for a search on Google
for "empty website". However, I'm too much of a newbie to
feel too confident in any such "insights/suspicions" I have
as I begin learning about SEO, so I didn't post about it on
that thread. Hope the above has some value to this discussion.

Take care,

Louis

9:40 pm on Feb 10, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



By the way: it is interesting that Google suggests "Optimization" when you spell it with s, but it does not suggest anything when you spell it with z

Not really surprising: "optimization" returns 3.4 million hits; "optimisation" returns only 838,000. As far as the logic in the software is concerned, that makes it more likely that the latter is a spelling error, or at least that you'd want to look at results for the much more common term.
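
To make that logic concrete, here is a small sketch using the two hit counts above. The function `suggest_spelling` and the `min_ratio` cutoff of 2.0 are assumptions for illustration; the actual rule Google uses is not known from this thread.

```python
def suggest_spelling(query, variants, hit_counts, min_ratio=2.0):
    """Suggest the most common variant if it clearly outnumbers the query."""
    query_hits = hit_counts.get(query, 0)
    best = max(variants, key=lambda v: hit_counts.get(v, 0))
    best_hits = hit_counts.get(best, 0)
    # Only suggest a different spelling, and only when it is much more common.
    if best != query and best_hits >= min_ratio * max(query_hits, 1):
        return best
    return None


# Hit counts as quoted in the post above.
hit_counts = {"optimization": 3_400_000, "optimisation": 838_000}
```

Searching with the "s" spelling yields a suggestion for the "z" spelling, since 3.4 million is more than twice 838,000, while searching with the "z" spelling yields no suggestion at all.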

9:50 pm on Feb 10, 2003 (gmt 0)

WebmasterWorld Senior Member jimbeetle is a WebmasterWorld Top Contributor of All Time 10+ Year Member



When I looked at this thread yesterday I couldn't see what globay was talking about. That's because I tried it with "file sharing" (no quotes).

It only seems to go one way.

For "file sharing" Google will pick up "file sharing" and "file-sharing." But for "filesharing" it picks up all three variations.

With the German language propensity to bung words together you would think Google would be consistent.

9:08 am on Feb 11, 2003 (gmt 0)

10+ Year Member



I don't even want to check on all the possibilities of DonauDampfSchiffFahrtsGesellschaftsKapitänsAnwärterAusbilder

:)

2:18 pm on Feb 11, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



jpavery:
google will ask
"did you mean file sharing"

so maybe the suggestion has something to do with it...

no, it still suggests the same when splitting up

JayC:

Not really surprising: "optimization" returns 3.4 million hits; "optimisation" returns only 838,000.

sounds logical, but I've seen Google suggest words returning *fewer* results. Perhaps it checks the number of queries?

7:29 pm on Feb 11, 2003 (gmt 0)

10+ Year Member



I have paid special attention to the keyword "Citybike" for a while. It is English, yes; however, it is broadly used for urban bicycles in Germany and Switzerland as well.

On my site I never ( never never ) spell it "City Bike". Now suddenly I rank top for "City Bike" as well, and "Citybike" is highlighted in the search results. Definitely a new feature.

Ex Ce Lent for Everyone I think. Helps the user to find the right product, no matter if he decides to spell it in one or two words.

 
