Latent semantic indexing

How can you start to use it?

         

Pikin_It_Up

12:32 pm on Apr 6, 2004 (gmt 0)

10+ Year Member



I have been reading a lot of posts (here and elsewhere) relating to Latent semantic indexing.

I understand that semantics is basically the use of words, and the relationships that occur between them.

The problem I have found so far is working out what relationships occur between words. For instance, if I were to produce a site related to widget improvement technologies, how would I find out what words semantically link to that term? Is there an online dictionary (I really am guessing here) that I can reference my desired term with?

This is a topic that I am really interested in, and I have read a good few articles about it. The problem is that all of the articles I have read assume that the reader already has a good knowledge of Latent semantic indexing.

Can anyone give me an idiot's description of what Latent semantic indexing really is? If not, can someone sticky me an address that covers the subject please.

Cheers

P

engine

4:35 pm on Apr 6, 2004 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Hmmm, if you've read up about it, I'm not sure how best to answer you.

In basic terms, it's a mathematical technique for information filtering. The great thing about it is that it doesn't rely on any particular language.

HTH

Pikin_It_Up

5:02 pm on Apr 6, 2004 (gmt 0)

10+ Year Member



Cheers engine, I found a really good article about it earlier on. I can sticky you the URL if you'd like.

shorebreak

8:32 pm on Apr 6, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



[kwmap.com...] seems to be an interesting keyword discovery tool that does some semantic mapping.

I'm an idiot, though, so I might actually be off-topic.

[edited by: msgraph at 4:50 pm (utc) on April 7, 2004]

maherphil

12:52 am on Apr 9, 2004 (gmt 0)

10+ Year Member



The help section hardly explains how it works. Does anyone know where they get their data?

I used to work for a company that was building giant hierarchies of knowledge and using patented algorithms to traverse them and find the distance between a query and a document.

Really fun stuff, sticky me if you want to talk more about this.

Anyone know other systems?

webnewton

12:24 pm on Apr 9, 2004 (gmt 0)

10+ Year Member



Well, Pikin

LSI, as you know, is a technique that helps in deciding a set of interrelated keywords. There are mathematical algorithms out there which help to define the mutual relationship between a set of keywords.
Say, for example, while searching for "saddam hussain" this software would find lots of webpages where there is a mention of "gulf war". The number of occurrences of these keywords together helps the algorithms decide how closely associated these keywords are. If they are close enough, you're sure to find an article on the gulf war when you search for "saddam hussain".

>>Is there an online dictionary (I really am guessing here) that I can reference my desired term with?

The whole web is the dictionary. The LSI software would search the web and find all interrelated keywords.
Little wonder that when we search for "mars" on Google, NASA's site appears at the top.
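
For anyone who wants to see the co-occurrence counting described in this post in concrete terms, here is a minimal sketch. It is plain co-occurrence counting, not LSI proper and not what any search engine is known to run. The four "pages", the function names, and the choice of a Jaccard score are all assumptions made up for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical mini-corpus; a real system would work from crawled pages.
pages = [
    "saddam hussain gulf war iraq",
    "gulf war history iraq kuwait",
    "saddam hussain gulf war timeline",
    "mars rover nasa mission",
]

def cooccurrence_counts(docs):
    """For each unordered word pair, count the pages that contain both words."""
    pair_counts = Counter()
    for doc in docs:
        words = sorted(set(doc.split()))  # each word counted once per page
        pair_counts.update(combinations(words, 2))
    return pair_counts

def association(word_a, word_b, docs):
    """Jaccard score: pages containing both words / pages containing either."""
    has_a = {i for i, d in enumerate(docs) if word_a in d.split()}
    has_b = {i for i, d in enumerate(docs) if word_b in d.split()}
    union = has_a | has_b
    return len(has_a & has_b) / len(union) if union else 0.0

counts = cooccurrence_counts(pages)
print(counts[("gulf", "war")])               # pages mentioning both words
print(association("saddam", "war", pages))   # closely associated pair
print(association("saddam", "nasa", pages))  # unrelated pair
```

On this toy data "saddam" and "war" score well above "saddam" and "nasa", which is the "how closely associated" decision in miniature.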

Robert Charlton

5:40 am on Apr 11, 2004 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Here's a link to the best article [javelina.cet.middlebury.edu] on LSI I've seen. It was posted in the Google forum, so I think it should be OK here.

>>Is there an online dictionary (I really am guessing here) that I can reference my desired term with?

Check the Google AdWords Sandbox for broad match phrases, which will show you, at least, what Google considers a broad match for a particular phrase.

Well before the idea of Latent Semantic Indexing had registered in my brain, I started the following thread on the use of the Sandbox tool. It never got the response I'd hoped for....

SEO search phrase research with new AdWords Keyword Sandbox
[webmasterworld.com...]

Robert Charlton

5:43 am on Apr 11, 2004 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



PS - I'd stay away from kwmap... They have a bad rep, and what they do isn't LSI. It's more the kind of clustering that Vivisimo does and AltaVista used to do... which is more a sort of lateral search than an analysis of semantic usage.

[webmasterworld.com...]

There was a thread about them in the Supporters Forum that really laid out just how slimy their spamming techniques have been. They're trying to get you to link to them.

Their display is very pretty, by the way, but I'm not sure it means much.

Marcia

6:52 am on Apr 11, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Doesn't mean much as Robert says, and email groups were flooded with posts a while back about people getting unsolicited mail from them requesting links.

Nifty way to amass PageRank from unsuspecting people for personal usage is how I've interpreted that routine. Nothing personal intended, but I always do wonder where sites like that, doing what they're doing, will end up linking to down the road, and how much the price of text links (aka PageRank, aka "advertising") will be once a certain level of PR is reached, enough to be marketable.

Enough of that, though. As Robert says:

>>I'm not sure it means much.

I've kind of always loved the expression "diddley-squat" - kind of says it all, plain and simple like. :)

Jumping into "semantics" is nothing more than increasing richness in our own use of vocabulary whilst increasing the breadth and depth of content within websites.

Good instructors of the "old school" teaching English 101 have been telling their students for years that the best way to learn to write well is to do a lot of reading, both to develop style and to increase the richness of their own vocabulary. Nothing in the world has happened to change that, and the same principles apply online as off.

It's flat out old-fashioned keyword research for starters, nurturing an intuitive grasp of inter-relationships in vocabulary usage, and write, write, write.

glengara

10:11 am on Apr 11, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'd agree with Marcia, particularly in "..increasing the breadth and depth of content.."

IMO, the primary goal in the use of semantics by G is to help determine a page's topic without relying on the presence of keywords.
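
To make "determine a page's topic without relying on the presence of keywords" concrete, here is a minimal sketch of the textbook LSI recipe: build a term-by-page matrix, take a truncated SVD, and compare a query to pages in the reduced space. The toy pages, the helper names, and k = 2 are assumptions for illustration only; nothing in this thread confirms that Google works this way.

```python
import numpy as np

# Hypothetical toy corpus: one short "page" per string.
pages = [
    "car engine wheel",
    "automobile engine wheel",   # never mentions "car"
    "mars planet nasa",
    "mars planet orbit",
]

vocab = sorted({w for p in pages for w in p.split()})
index = {w: i for i, w in enumerate(vocab)}

def term_vector(text):
    """Raw term counts over the vocabulary (unknown words are ignored)."""
    v = np.zeros(len(vocab))
    for w in text.split():
        if w in index:
            v[index[w]] += 1
    return v

# Term-by-page matrix: rows are words, columns are pages.
A = np.column_stack([term_vector(p) for p in pages])

# Truncated SVD: keep only the k strongest latent dimensions ("topics").
k = 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U_k = U[:, :k]

def to_topic_space(text):
    """Project a page or query onto the k latent dimensions."""
    return U_k.T @ term_vector(text)

def similarity(query, page):
    """Cosine similarity between query and page in the latent space."""
    q, d = to_topic_space(query), to_topic_space(page)
    denom = np.linalg.norm(q) * np.linalg.norm(d)
    return float(q @ d / denom) if denom else 0.0

scores = [similarity("car", p) for p in pages]
for page, score in zip(pages, scores):
    print(f"{score:.2f}  {page}")
```

In this toy setup the query "car" matches the "automobile" page about as strongly as the page that actually contains "car", because both pages share "engine" and "wheel", while the two Mars pages score near zero.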