Natural Language Patterns

Can someone run me by this...

         

brotherhood of LAN

11:00 am on Aug 6, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I remember reading that Google mentioned "natural language patterns" in how it understands web pages. It went along the lines of the bot being able to interpret the flow of natural English, or any other language for that matter.

I'm not sure how this fits in with everything else I've read about Google.

I also wanted to raise something about Google's "dictionary" ;) There was a thread here not too long ago about that (I'm looking for it now). I don't know if it ever mentioned that "made up words" could possibly be interpreted as real words if they are used frequently enough across the net.

I was just interested in how it works and how that will affect Google's interpretation of English when it hits the algo! Also, if anyone feels that whether a word makes it into the dictionary is linked to the number of times it's found on the web (and most likely the PR of the pages it's found on), then feel free to post about it.

I can't recall much talk about Google's interpretation of the English language, or its "average" or "lowest common denominator" when it comes down to crunching its findings into some sort of linear programming code. So maybe it's a topic worth delving into.

ciml

8:57 pm on Aug 7, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> interpret the flow of natural English

There has been talk of search engines preferring text when it looks like it's in a sentence. This could be as simple as "must begin with a capital letter and end in a full stop", or as complex as looking at the grammatical structure to try to guess if it looks natural.
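To make the "simple" end of that range concrete, here is a rough Python sketch of the capital-letter-and-full-stop check, applied line by line to a page's text. The function names and the idea of scoring a page by the fraction of sentence-like lines are my own assumptions for illustration; nothing here is known to be what any search engine does.

```python
import re

def looks_like_sentence(text):
    """Simple check: begins with a capital letter and ends in a full stop."""
    text = text.strip()
    return bool(re.match(r"^[A-Z].*\.$", text, flags=re.DOTALL))

def sentence_ratio(lines):
    """Fraction of non-empty lines that pass the simple check (hypothetical score)."""
    candidates = [ln for ln in lines if ln.strip()]
    if not candidates:
        return 0.0
    passed = sum(1 for ln in candidates if looks_like_sentence(ln))
    return passed / len(candidates)

page = [
    "We sell hand-made widgets.",
    "cheap widgets buy widgets",
    "",
]
print(sentence_ratio(page))
```

A check this crude would be trivial to game, which is presumably why the grammatical approach comes up at all.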

I don't remember talk specifically about Google doing this.

> "made up words" could possibly be interpreted as real words if they are used frequently enough across the net

Look for a brand name that is fairly unusual but used on a few pages on the Web. Misspell it slightly and Google will sometimes suggest the correct spelling. This indicates that Google is looking at page content or search terms to supplement the 'did you mean?' dictionary (or even using them as the dictionary?).
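As a toy illustration of a dictionary built from page content, the Python sketch below counts words across some pages and suggests the closest sufficiently common word for an unseen query. This is only a guess at the general idea, not Google's method: the brand name "zyntrax", the `min_count` threshold and the 0.8 similarity cutoff are all made up, and `difflib` similarity stands in for whatever matching a real engine uses.

```python
from collections import Counter
import difflib

def build_counts(pages):
    """Count how often each lower-cased word appears across all pages."""
    counts = Counter()
    for page in pages:
        counts.update(page.lower().split())
    return counts

def did_you_mean(query, counts, min_count=2, cutoff=0.8):
    """Suggest a close, frequently seen word, or None if the query looks fine."""
    query = query.lower()
    if counts[query] >= min_count:
        return None  # the query itself is common enough to count as a word
    known = [w for w, c in counts.items() if c >= min_count]
    matches = difflib.get_close_matches(query, known, n=3, cutoff=cutoff)
    if not matches:
        return None
    # Among the close matches, prefer the one seen most often.
    return max(matches, key=lambda w: counts[w])

pages = ["Zyntrax sells widgets", "buy from zyntrax today", "zyntrax review"]
counts = build_counts(pages)
print(did_you_mean("zyntrex", counts))
```

Note that the made-up brand name becomes a "word" here purely because it shows up on enough pages.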

shanz

9:48 am on Aug 8, 2002 (gmt 0)

10+ Year Member



This works for our brand name. Many slight misspellings get re-directed to a page of links to our site. As far as I know, this brand name is not in use anywhere on the web other than on our sites.

brotherhood of LAN

12:39 pm on Aug 8, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Thought this was going to gather dust ;)

>>complex as looking at the grammatical structure to try to guess if it looks natural.

I was aiming towards this side of the topic. Are we talking about a few simple filters here, i.e.

1. Make sure X word is not repeated
2. Make sure sentence containing X has capital letter
3. Use of stop words in sentences etc
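Purely as a sketch of what three filters like these might look like, here is some Python that checks one sentence against each of them. The stop-word list, the `max_repeats` threshold and the function name are assumptions made up for the example; whether any engine applies filters like this is exactly the open question.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it", "for", "on", "with"}

def check_sentence(sentence, max_repeats=2):
    """Run the three simple filters on one sentence and report each result."""
    words = re.findall(r"[a-z']+", sentence.lower())
    content_words = [w for w in words if w not in STOP_WORDS]
    counts = Counter(content_words)
    return {
        # 1. No content word repeated more than max_repeats times.
        "no_heavy_repetition": all(c <= max_repeats for c in counts.values()),
        # 2. Sentence starts with a capital letter.
        "starts_with_capital": sentence.strip()[:1].isupper(),
        # 3. At least one stop word is present, as in ordinary prose.
        "uses_stop_words": any(w in STOP_WORDS for w in words),
    }

print(check_sentence("The shop sells a range of widgets for the home."))
print(check_sentence("widgets cheap widgets buy widgets best widgets"))
```

The natural sentence passes all three checks and the keyword-stuffed one fails all three, but anything in between would need a proper threshold or something closer to real grammar.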

Just wondering how Google would possibly do this. Is it inferring some sort of "norm" from the pages it finds on the web, or would they pre-define the language?

I ask, mentioning the dictionary above, because yes, misspellings of more popular words get the "did you mean" SERP while a bad misspelling totally re-directs. If the dictionary works on the premise that a new word becomes a word once websites TELL the Gbot it is being used (i.e. across a range of sites), then how would this come into play (in your opinion) in the way that Google interprets the English language?

Did Google make the bot understand English or are we feeding it as it goes along? :)

vitaplease

12:54 pm on Aug 8, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The well-meant "did you mean" SERP can be quite annoying.

If you do a search for mycompanyA (a non-dictionary word), Google can suggest "did you mean" mycompanyB (another non-dictionary word), where mycompanyB is spelled just slightly differently and simply occurs more frequently in the Google database. Google could consider treating existing site (URL) names as having a natural existence. At least that would make mycompanyA feel less obscure to the searcher.
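A rough Python sketch of that suggestion: before offering a more frequent near-match, check whether the query is itself a known site name and, if so, stay quiet. The term counts, the `site_names` set and the 0.8 cutoff are invented for the example, and `difflib` is only a stand-in for however similarity is really measured.

```python
import difflib

def suggest_unless_site_name(query, term_counts, site_names, cutoff=0.8):
    """Suggest a more frequent near-match, unless the query is a known site name."""
    q = query.lower()
    if q in site_names:
        return None  # an existing site name is treated as a legitimate word
    candidates = difflib.get_close_matches(q, list(term_counts), n=3, cutoff=cutoff)
    # Keep only different terms that occur more often than the query itself.
    better = [w for w in candidates if w != q and term_counts[w] > term_counts.get(q, 0)]
    if not better:
        return None
    return max(better, key=term_counts.get)

term_counts = {"mycompanya": 3, "mycompanyb": 40}  # made-up frequencies

# Without the site-name check, the rarer name gets "corrected".
print(suggest_unless_site_name("mycompanyA", term_counts, site_names=set()))

# With mycompanyA registered as a site name, no suggestion is made.
print(suggest_unless_site_name("mycompanyA", term_counts, site_names={"mycompanya"}))
```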