

Google SEO News and Discussion Forum

    
How to implement LSI technique for my site
jameswatt

5+ Year Member



 
Msg#: 3085334 posted 12:35 pm on Sep 16, 2006 (gmt 0)


I heard that Google is adopting the LSI technique in its search engine algorithm to display more relevant results.

Here are my questions about LSI:

How is LSI more effective than the popular search engine techniques?
How does the SVD statistical method work with the matrix?
Can anyone tell me the steps to implement LSI on my site?
At present LSI has little weight in Google's algorithm, but what is the possibility of that weighting increasing for G, Y and MSN?

Please share your knowledge about LSI implementation.

 

tedster

WebmasterWorld Senior Member tedster us a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 3085334 posted 7:50 pm on Sep 16, 2006 (gmt 0)

< Note: Over the past two years, every time I asked a Google engineer whether they used LSI,
they said they did not. Finally I got a bit sharper -- LSI is a specific method. Just because they
are not using that specific method (there may even be patent questions involved) doesn't mean
that Google is not using various forms of semantic analysis. I'd say they definitely are. They've
purchased entire companies that specialize in semantics, such as Applied Semantics in 2003. >

For the past two years I've worked with semantic factors for URLs that seem like they "should" rank well, but don't. By widening the variety of related vocabulary on the page I often see significant upward movement in the SERPs. The variety of search phrases that bring in traffic also increases -- naturally, I suppose, since there is a greater variety of related text on the page.

I often will survey the top "however many" results to see what words appear frequently on those pages but do not occur on mine. And where it makes sense, I add that language. Many of the pages I change were written by an SEO-oriented copy writer in years past.
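For anyone who wants to try that survey with a script, here is a minimal sketch of the idea. Everything in it is an assumption for illustration: the function names, the tiny stopword list, the sample texts, and the choice to count a word once per page. It only finds the candidate words; deciding where they make sense is still up to the writer.

```python
import re
from collections import Counter

# A deliberately tiny stopword list (assumption); a real survey would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "for", "on", "with", "it"}

def tokenize(text):
    """Lowercase the text and return its alphabetic words, minus stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def missing_vocabulary(my_page, top_pages, min_pages=2):
    """Words used on at least `min_pages` of the top pages but absent from my page,
    ordered by how many pages use them, then alphabetically."""
    mine = set(tokenize(my_page))
    page_counts = Counter()
    for page in top_pages:
        page_counts.update(set(tokenize(page)))  # count each word once per page
    gaps = [(w, n) for w, n in page_counts.items() if n >= min_pages and w not in mine]
    return sorted(gaps, key=lambda item: (-item[1], item[0]))

# Hypothetical sample texts standing in for my page and the top results.
my_page = "Cheap car insurance. Get car insurance quotes for your car."
top_pages = [
    "Compare car insurance premiums, deductibles and coverage for drivers.",
    "Car insurance coverage explained: liability, collision, deductibles.",
    "Drivers can lower premiums by raising deductibles on their coverage.",
]

print(missing_vocabulary(my_page, top_pages))
# [('coverage', 3), ('deductibles', 3), ('drivers', 2), ('premiums', 2)]
```

Counting pages rather than raw occurrences keeps one keyword-stuffed competitor from dominating the list.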

This kind of writing was extremely focused on areas like keyword density and prominence -- factors that were arguably quite important in the 1990s, but not so much today. And that style of SEO writing has the unfortunate result that synonyms and words related to the page's "theme" are often avoided in favor of the target keyword. Yeah -- it's keyword stuffing.

Now I say take the keyword blinders off the copy writers, and let their language be more natural. In most cases, then, you don't need any heavy analysis to get the copy to rank well.

[edited by: tedster at 10:49 am (utc) on Sep. 20, 2006]

jameswatt

5+ Year Member



 
Msg#: 3085334 posted 5:58 am on Sep 19, 2006 (gmt 0)

Can anyone help me understand the matrix calculation for LSI, and how I can implement LSI/LSA to optimize my site?

Crush

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3085334 posted 6:20 am on Sep 19, 2006 (gmt 0)

Does not work. Google is just a computer that takes all the words on your pages and indexes them. Next it determines the value of that page by the amount of links pointing at it and there you have it. In the main, it delivers the page that is most relevant by these factors. LSI was talked about a lot a couple of years ago and we tried some stuff that never really worked. Keep it simple. Right words on the page and lots of incoming links.

[edited by: Crush at 6:20 am (utc) on Sep. 19, 2006]

ronburk

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3085334 posted 6:45 am on Sep 19, 2006 (gmt 0)

can any one make me understand

Well, tedster took a crack at telling you a reasonably practical response to Google's use of semantics; might be worth another read. If you were hoping LSI meant you could make mechanical calculations that would predictably let you affect your Google SERPs, yer just dreamin', I'm afraid.

cavendish

5+ Year Member



 
Msg#: 3085334 posted 6:48 am on Sep 19, 2006 (gmt 0)

Next it determines the value of that page by the amount of links pointing at it and there you have it.

It's not that simple. Google runs the PageRank algo, which comes from Google's cofounder, Larry Page, who worked on it with Sergey Brin while still in college. The only public and known version of this algo is about 10 years old. We know the algo, but we don't have the websites, the links, or the tweaks.
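To make "we know the algo" concrete, here is a minimal sketch of the textbook power-iteration form of the published formula. It is not whatever Google runs today. The link graph and page names are made up, the damping factor of 0.85 is the value usually quoted from the original paper, and spreading the rank of pages with no outgoing links evenly is one common convention (an assumption here). This is also the normalized variant, in which all ranks sum to 1.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    Returns a dict of page -> rank, with ranks summing to 1.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets the "random jump" share first.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank equally among the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page (no outgoing links): spread its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank

# Hypothetical four-page site: every page links back to "home".
links = {
    "home": ["about", "products"],
    "about": ["home"],
    "products": ["home"],
    "blog": ["home"],
}
ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page:10s} {score:.4f}")
```

In this toy graph "home" comes out on top because every other page links to it, and "blog", which nothing links to, is left with only the random-jump share.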

You can read more at [en.wikipedia.org...]

Crush

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3085334 posted 7:38 am on Sep 19, 2006 (gmt 0)

PR does not come into play so much now IMHO.

Halfdeck

5+ Year Member



 
Msg#: 3085334 posted 8:00 am on Sep 19, 2006 (gmt 0)

If by LSI you mean related words on a page (e.g. for main keyword "car", words like "Ford", "wheels", "steering", "gas mileage"), I just ran a rudimentary test that pits an original page with related keywords (taken from Google Adwords Keyword Tool, or whatever it's called) against a spammy meaningless page that repeats the keyword 12+ times. Both pages are linked once from the domain root and the HTML structure is identical. The only difference is the text on each page.

The result is not surprising.

[edited by: Halfdeck at 8:02 am (utc) on Sep. 19, 2006]

jameswatt

5+ Year Member



 
Msg#: 3085334 posted 10:21 am on Sep 20, 2006 (gmt 0)

I came to know the following facts about LSI through some tutorials:

>> We can see the difference by typing the (~) sign before our keywords.
>> It follows the term-document matrix, SVD, and k-dimensional LSI space matrix.
>> It finds more relevant information than other methods.
>> LSI is 30% more effective than the popular word-matching method.

Given all of the above, how can I implement LSI to optimize my site?
Which area do I need to focus on to optimize my site based on LSI... is it content?
Will the following things be done by the search engine, or do we need to calculate them?
-Obtain term-document matrix.
-Compute the SVD.
-Truncate-SVD into reduced-k LSI space.
-K-dimensional semantic structure
-Similarity on reduced-space:
-Term-term
-Term-document
-Document-document
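For reference, the steps listed above can be sketched in a few lines of NumPy. This is only a toy illustration of what an engine-side LSI computation looks like. The five "documents", the raw-count weighting and k = 2 are assumptions chosen to keep it readable, and nothing here is something you would run against your own site to influence rankings.

```python
import numpy as np

# Hypothetical mini-corpus: three "car" documents and two "recipe" documents.
docs = [
    "car engine wheels steering",
    "car ford gas mileage",
    "ford engine gas mileage wheels",
    "meatloaf recipe oven onions",
    "recipe oven cook ground beef onions",
]

# 1. Term-document matrix: one row per term, one column per document (raw counts).
vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}
A = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        A[index[w], j] += 1

# 2. Compute the SVD: A = U @ diag(s) @ Vt, singular values in descending order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# 3. Truncate to a reduced k-dimensional LSI space.
k = 2
term_vecs = U[:, :k] * s[:k]    # one row per term
doc_vecs = Vt[:k, :].T * s[:k]  # one row per document

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# 4. Similarity in the reduced space.
same_topic = cosine(doc_vecs[0], doc_vecs[1])   # document-document, both about cars
cross_topic = cosine(doc_vecs[0], doc_vecs[3])  # document-document, car vs recipe
term_term = cosine(term_vecs[index["steering"]], term_vecs[index["mileage"]])
term_doc = cosine(term_vecs[index["steering"]], doc_vecs[1])

print(f"car doc vs car doc:    {same_topic:.2f}")
print(f"car doc vs recipe doc: {cross_topic:.2f}")
print(f"steering vs mileage:   {term_term:.2f}")
print(f"steering vs doc 1:     {term_doc:.2f}")
```

"steering" and "mileage" never appear in the same document here, yet they come out as close neighbours in the reduced space because both keep company with the other car words. That is the "latent" part, and it all happens on the engine's side of the fence.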

trillianjedi

WebmasterWorld Senior Member trillianjedi us a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 3085334 posted 10:38 am on Sep 20, 2006 (gmt 0)

Hi James,

I think you need to re-read tedster's (excellent) post; it summarises everything that you need to know about LSI and semantic analysis.

Your question:-

I came to know the following facts about LSI through some tutorials:

>> We can see the difference by typing the (~) sign before our keywords.
>> It follows the term-document matrix, SVD, and k-dimensional LSI space matrix.
>> It finds more relevant information than other methods.
>> LSI is 30% more effective than the popular word-matching method.

Given all of the above, how can I implement LSI to optimize my site?

Is actually answered right here:-

Now I say take the keyword blinders off the copy writers, and let their language be more natural.

You're trying to second-guess a computer algo and that's where most people who want to get into SEO (post 2001) make their first mistake. What search engine algos seek to achieve is to rank in order of:-

1. Authority
2. Quality
3. Topicality

That's what you need to think about. Unless you're into automated content production*, and looking at your previous posts I don't think you are, then you just need to produce the real deal - authoritative and topical quality content.

Let the engines worry about how to automate the process of investigating that content by using LSI. That's their business, and they'll keep changing it until they get it right which is just going to leave you chasing your tail.

If you can produce pages that satisfy the above 3 things, assuming that you understand the basics of SEO - crawlability etc, then your pages will rank well and bring in good quantities of traffic.

TJ

* - if you are into automated content production and you're looking to trick Google into thinking spam is quality content, then you'll really need to get a very good grasp of the LSI process by reading as many white papers on it as you can. There are some good links in a thread by BakedJake in the supporters forum somewhere titled, I think, "Things I have been reading lately".

Oliver Henniges

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3085334 posted 8:18 pm on Sep 20, 2006 (gmt 0)

jameswatt, LSI is not a technique you can implement on your site in order to improve your ranking. LSI is a very complex algo-framework that search engines may use, in addition to other factors, to decide whether your content is more relevant than that of your competitors for a given search phrase.

there was an interesting thread on LSI under
[webmasterworld.com...]
but I just found that all links from there towards technical sites are broken.

Reading between the lines, I think you have completely misunderstood what Latent Semantic Indexing actually is, and perhaps tedster might provide an existing introductory edu-source, instead of overwhelming you with his highly sophisticated knowledge of the present state of the art ;)

I just found
[lsa.colorado.edu...]
for a quick start. Also

[www3.interscience.wiley.com...]

The approach is to take advantage of implicit higher-order structure in the association of terms with documents (semantic structure) in order to improve the detection of relevant documents on the basis of terms found in queries. The particular technique used is singular-value decomposition, in which a large term by document matrix is decomposed into a set of ca. 100 orthogonal factors from which the original matrix can be approximated by linear combination. Documents are represented by ca. 100 item vectors of factor weights. Queries are represented as pseudo-document vectors formed from weighted combinations of terms, and documents with supra-threshold cosine values are returned. Initial tests find this completely automatic method for retrieval to be promising. © 1990 John Wiley & Sons, Inc.
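The retrieval step in that abstract (a query folded in as a pseudo-document, then a cosine threshold) can be sketched as follows. Treat it as an illustration under stated assumptions: the corpus is made up, k is 2 rather than the ca. 100 factors the abstract mentions, terms are weighted by raw counts, the threshold of 0.5 is arbitrary, and the helper names `build_lsi` and `search` are hypothetical. Scaling both sides by the singular values before comparing is one common convention.

```python
import numpy as np

# Hypothetical mini-corpus: three "car" documents and two "recipe" documents.
docs = [
    "car engine wheels steering",
    "car ford gas mileage",
    "ford engine gas mileage wheels",
    "meatloaf recipe oven onions",
    "recipe oven cook ground beef onions",
]

def build_lsi(docs, k):
    """Build a raw-count term-document matrix and return its rank-k SVD factors."""
    vocab = sorted({w for d in docs for w in d.split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.split():
            A[index[w], j] += 1
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return index, U[:, :k], s[:k], Vt[:k, :].T

def search(query, index, U_k, s_k, V_k, threshold=0.5):
    """Return indices of documents whose cosine with the folded-in query exceeds threshold."""
    q = np.zeros(len(index))
    for w in query.lower().split():
        if w in index:          # words outside the vocabulary are ignored
            q[index[w]] += 1
    if not q.any():
        return []
    q_hat = (q @ U_k) / s_k     # pseudo-document coordinates: q^T U_k S_k^-1
    q_vec = q_hat * s_k         # scale by the singular values, same as the documents
    doc_vecs = V_k * s_k
    hits = []
    for j, d in enumerate(doc_vecs):
        cos = q_vec @ d / (np.linalg.norm(q_vec) * np.linalg.norm(d))
        if cos > threshold:
            hits.append(j)
    return hits

index, U_k, s_k, V_k = build_lsi(docs, k=2)
print(search("steering", index, U_k, s_k, V_k))  # [0, 1, 2]
print(search("meatloaf", index, U_k, s_k, V_k))  # [3, 4]
```

On this toy corpus a query for "steering" returns all three car documents, although two of them never use the word, which is the kind of behaviour plain word matching cannot produce.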

Hope this helps.

edited: Oops, that looked like an edu-link at first sight. Hope it's within the TOS.

[edited by: Oliver_Henniges at 8:27 pm (utc) on Sep. 20, 2006]

arubicus

10+ Year Member



 
Msg#: 3085334 posted 9:31 pm on Sep 20, 2006 (gmt 0)

1. Authority
2. Quality
3. Topicality

Right there in a nutshell.

Usually when you talk about a subject you incorporate a general set of words, typically nouns and verbs that distinguish what you are talking about. You can tell an authority from a non-authority simply by looking at some of the language patterns. An authority tends to have a richer vocabulary on the subject and a non-authority tends to have a shallow vocabulary. If you create content for users then you tend to use vocabulary that is rich about the subject. This goes not only for one page but for every page in your theme.

A good example is a recipe. Typically a page that contains an actual recipe will have words such as ingredients, cup, ounce, teaspoon, tablespoon, cook, oven, degrees, etc. A page that has a meatloaf recipe will contain similar words plus ground beef, peppers, onions, etc. Now a page that just references a meatloaf recipe may contain the terms meatloaf and recipe but may well lack the other words necessary to show that it is actually a recipe. It may link to a recipe, and that may be included as a factor; sites linking in that contain recipes may also be a ranking factor.

Simply creating an actual meatloaf recipe for your users will pretty much take care of it. Not always, and this isn't the only ranking factor. Sometimes we do choose the target keyword over more related words. Try to catch yourself and make corrections. But your first goal is to create content for your visitors in a way that shows your knowledge of the subject and is of value to them. Choose your words wisely to convey the meaning of the subject you are talking about. Look to satisfy their understanding first. The rest should follow.

jtara

WebmasterWorld Senior Member jtara us a WebmasterWorld Top Contributor of All Time 5+ Year Member



 
Msg#: 3085334 posted 11:28 pm on Sep 20, 2006 (gmt 0)

And avoid weasel words. One of these days Google will figure out how to avoid spammy sites with weasel words.

You know the ones:

"When considering the purchase of <widgets> there are many factors to consider. What one person considers important may be different than another, so one must be careful when choosing a <widget> manufacturer to find one that meets your particular needs and goals."
