Forum Moderators: Robert Charlton & goodroi
Here are my questions about LSI (latent semantic indexing):
How is LSI more effective than the popular search engine techniques?
How does SVD (singular value decomposition) work on the term-document matrix?
Can anyone tell me the steps to implement LSI on my site?
At present LSI carries little weight in Google's algorithm, but how likely is it that Google, Yahoo and MSN will increase that weighting?
Please share your knowledge about LSI implementation.
For the past two years I've worked with semantic factors for urls that seem like they "should" rank well, but they don't. By widening the variety of related vocabulary on the page I often see significant upward movement in the SERPs. And the variety of search phrases that bring in traffic also increases - naturally I suppose, since there is a greater variety of related text on the page.
I often will survey the top "however many" results to see what words appear frequently on those pages but do not occur on mine. And where it makes sense, I add that language. Many of the pages I change were written by an SEO-oriented copy writer in years past.
This kind of writing was extremely focused in areas like keyword density and prominence -- factors that were arguably quite important in the 1990s, but not so much today. And that style of SEO writing has the unfortunate result that synonyms and words that are related to the page's "theme" are often avoided in favor of the target keyword. Yeah -- it's keyword stuffing.
Now I say take the keyword blinders off the copy writers, and let their language be more natural. In most cases, then, you don't need any heavy analysis to get the copy to rank well.
[edited by: tedster at 10:49 am (utc) on Sep. 20, 2006]
[edited by: Crush at 6:20 am (utc) on Sep. 19, 2006]
Can anyone help me understand?
Well, tedster took a crack at telling you a reasonably practical response to Google's use of semantics; might be worth another read. If you were hoping LSI meant you could make mechanical calculations that would predictably let you affect your Google SERPs, yer just dreamin', I'm afraid.
Next it determines the value of that page by the number of links pointing at it, and there you have it.
It's not that simple. Google runs the PageRank algo, which comes from Google's cofounder, Larry Page, who worked on it with Sergey Brin while still in college. The only public and known version of this algo is about 10 years old. We know the algo, but we don't have the websites, the links, or the tweaks.
You can read more at [en.wikipedia.org...]
The result is not surprising.
[edited by: Halfdeck at 8:02 am (utc) on Sep. 19, 2006]
>> We can see the difference by typing the tilde (~) sign before a keyword.
>> It follows the term-document matrix, SVD, and k-dimensional LSI space.
>> It finds more relevant information than other methods.
>> LSI is 30% more effective than the popular word-matching method.
Given all of the above, how can I implement LSI to optimize my site?
Which area of my site should I focus on when optimizing for LSI... is it the content?
Are the following steps done by the search engine, or do we need to calculate them ourselves?
-Obtain term-document matrix.
-Compute the SVD.
-Truncate-SVD into reduced-k LSI space.
-K-dimensional semantic structure
-Similarity on reduced-space:
-Term-term
-Term-document
-Document-document
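For anyone who wants to see what those steps look like in practice, here is a toy sketch with NumPy. It is an illustration under assumptions, not how any search engine actually runs it: the four "documents" are invented, raw word counts are used with no weighting, and k = 2 is picked only because the corpus is tiny. The engine would do this over its own index of many documents, which is why a site owner cannot usefully calculate it for a single site.

```python
import numpy as np

# Invented mini-corpus: two recipe pages and two SEO pages.
docs = [
    "meatloaf recipe ground beef onions oven",
    "meatloaf recipe beef peppers oven degrees",
    "search engine ranking links",
    "search engine links pagerank",
]

# Step 1: term-document matrix (rows = terms, columns = documents, raw counts).
vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}
A = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        A[index[w], j] += 1.0

# Step 2: compute the SVD, A = U * diag(s) * Vt.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Step 3: truncate to the k largest singular values (the reduced-k LSI space).
k = 2
Uk, sk, Vtk = U[:, :k], s[:k], Vt[:k, :]

# Step 4: k-dimensional coordinates for terms and documents.
term_vecs = Uk * sk      # one row per term
doc_vecs = Vtk.T * sk    # one row per document
A_k = Uk @ np.diag(sk) @ Vtk  # rank-k approximation of A

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Step 5: similarities in the reduced space.
term_term = cosine(term_vecs[index["onions"]], term_vecs[index["peppers"]])
doc_doc = cosine(doc_vecs[0], doc_vecs[1])
term_doc = A_k[index["onions"], 1]  # association of "onions" with document 1

print(term_term, doc_doc, term_doc)
```

On this toy corpus "onions" and "peppers" never appear in the same document, yet they come out as close neighbours in the reduced space, and "onions" gets a non-zero association with the second recipe page even though that page never uses the word. That is the whole point of the technique, and it is also why related vocabulary on a page matters more than repeating one keyword.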
I think you need to re-read tedster's (excellent) post; it summarises everything that you need to know about LSI and semantic analysis.
Your question:-
I came to know the following facts about LSI through some tutorials:
>> We can see the difference by typing the tilde (~) sign before a keyword.
>> It follows the term-document matrix, SVD, and k-dimensional LSI space.
>> It finds more relevant information than other methods.
>> LSI is 30% more effective than the popular word-matching method.
Given all of the above, how can I implement LSI to optimize my site?
Is actually answered right here:-
Now I say take the keyword blinders off the copy writers, and let their language be more natural.
You're trying to second-guess a computer algo and that's where most people who want to get into SEO (post 2001) make their first mistake. What search engine algos seek to achieve is to rank in order of:-
1. Authority
2. Quality
3. Topicality
That's what you need to think about. Unless you're into automated content production*, and looking at your previous posts I don't think you are, then you just need to produce the real deal - authoritative and topical quality content.
Let the engines worry about how to automate the process of investigating that content by using LSI. That's their business, and they'll keep changing it until they get it right which is just going to leave you chasing your tail.
If you can produce pages that satisfy the above 3 things, assuming that you understand the basics of SEO - crawlability etc, then your pages will rank well and bring in good quantities of traffic.
TJ
* - if you are into automated content production and you're looking to trick Google into thinking spam is quality content, then you'll really need to get a very good grasp of the LSI process by reading as many white papers on it as you can. There are some good links in a thread by BakedJake in the supporters forum somewhere titled, I think, "Things I have been reading lately".
there was an interesting thread on LSI under
[webmasterworld.com...]
but I just found that all links from there towards technical sites are broken.
I read between your lines that you have completely misunderstood what Latent Semantic Indexing actually is, and perhaps tedster might provide an existing introductory edu-source, instead of overwhelming you with his highly sophisticated knowledge on present state of the art;)
I just found
[lsa.colorado.edu...]
for a quick start. Also
[www3.interscience.wiley.com...]
The approach is to take advantage of implicit higher-order structure in the association of terms with documents (semantic structure) in order to improve the detection of relevant documents on the basis of terms found in queries. The particular technique used is singular-value decomposition, in which a large term by document matrix is decomposed into a set of ca. 100 orthogonal factors from which the original matrix can be approximated by linear combination. Documents are represented by ca. 100 item vectors of factor weights. Queries are represented as pseudo-document vectors formed from weighted combinations of terms, and documents with supra-threshold cosine values are returned. Initial tests find this completely automatic method for retrieval to be promising. © 1990 John Wiley & Sons, Inc.
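The retrieval step in that abstract (a query folded in as a pseudo-document, then documents returned when their cosine is above a threshold) can be sketched roughly as below. Assumptions are flagged: the corpus is invented, k = 2 stands in for the ca. 100 factors, plain term counts stand in for the weighted combination of terms, and `search` and the 0.5 threshold are hypothetical choices for illustration only.

```python
import numpy as np

# Invented mini-corpus (an assumption, not data from the paper).
docs = [
    "meatloaf recipe ground beef onions oven",
    "meatloaf recipe beef peppers oven degrees",
    "search engine ranking links",
    "search engine links pagerank",
]

# Term-by-document matrix of raw counts.
vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}
A = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        A[index[w], j] += 1.0

# Truncated SVD: keep k factors (about 100 in the abstract, 2 for this toy case).
k = 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)
Uk, sk, Vk = U[:, :k], s[:k], Vt[:k, :].T
doc_vecs = Vk * sk  # documents as k-item vectors of factor weights

def search(query, threshold=0.5):
    """Return indices of documents whose cosine with the query exceeds threshold."""
    q = np.zeros(len(vocab))
    for w in query.lower().split():
        if w in index:  # words outside the vocabulary are ignored
            q[index[w]] += 1.0
    q_hat = (q @ Uk) / sk  # pseudo-document: same coordinates as a row of Vk
    q_vec = q_hat * sk     # scaled the same way as doc_vecs
    q_norm = np.linalg.norm(q_vec)
    if q_norm == 0.0:
        return []          # no known query terms
    hits = []
    for j, d in enumerate(doc_vecs):
        cos = float(q_vec @ d / (q_norm * np.linalg.norm(d)))
        if cos > threshold:
            hits.append(j)
    return hits

hits = search("pagerank")
print(hits)
```

Here a query for "pagerank" brings back both search-engine documents, including the one that never contains the word, which is the "implicit higher-order structure" the abstract talks about.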
Hope this helps.
edited: Oops, that looked like an edu-link at first sight. Hope it's within the TOS.
[edited by: Oliver_Henniges at 8:27 pm (utc) on Sep. 20, 2006]
Right there in a nutshell.
Usually when you talk about a subject you incorporate a general set of words, typically nouns and verbs that distinguish what you are talking about. You can tell an authority from a non-authority simply by looking at some of the language patterns. An authority tends to have a richer vocabulary on the subject and a non-authority tends to have a shallow vocabulary. If you create content for users then you tend to use vocabulary that is rich about the subject. This goes not only for one page but for every page in your theme.
A good example is a recipe. Typically a page that contains an actual recipe will have words such as ingredients, cup, ounce, teaspoon, tablespoon, cook, oven, degrees, etc. A page that has a meatloaf recipe will contain similar words plus ground beef, peppers, onions, etc. Now, a page that just references a meatloaf recipe may contain the terms meatloaf and recipe but may very well lack the other words needed to show whether it is actually a recipe or not. It may link to a recipe, and that may be counted as a factor; links in from sites that contain recipes may also be a ranking factor.
Simply creating an actual meatloaf recipe for your users will pretty much take care of it. Not always, and this isn't the only ranking factor. Sometimes we do choose words over more related words. Try to catch yourself and make corrections. But your first goal is to create content for your visitors in a way that shows your knowledge of the subject and is of value to them. Choose your words wisely to convey the meaning of the subject you are talking about. Look to satisfy their understanding first. The rest should follow.
You know the ones:
"When considering the purchase of <widgets> there are many factors to consider. What one person considers important may be different than another, so one must be careful when choosing a <widget> manufacturer to find one that meets your particular needs and goals."