Extracting Keywords from Word documents

     
6:52 am on Aug 25, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:May 28, 2002
posts:199
votes: 0


There are plenty of tools available for keyword analysis of HTML pages.

Anyone know of a good way to analyse & extract keywords from Word documents?

Thanks

6:14 am on Aug 26, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:June 19, 2003
posts:83
votes: 0


If you would like to work online with a non-Microsoft tool, you should first export the file to XML and then proceed from there.
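As a rough sketch of the "proceed from there" step, the Python below pulls the visible text out of a document saved as XML. It assumes the export is Word 2003 style XML, where the text sits in `w:t` elements; the function name `text_from_word_xml` and the sample document are made up for illustration, and the code matches on the local tag name only so it does not depend on the exact namespace.

```python
import xml.etree.ElementTree as ET


def text_from_word_xml(xml_string):
    """Collect the text of every element whose local name is 't'.

    Assumption: the export keeps its text in <w:t> elements, as Word's
    XML formats do. Other export formats would need a different tag.
    """
    root = ET.fromstring(xml_string)
    pieces = []
    for elem in root.iter():
        # Tags look like '{namespace}t'; compare only the local part.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "t" and elem.text:
            pieces.append(elem.text)
    return " ".join(pieces)


# Hypothetical, heavily trimmed sample of an exported document.
sample = (
    '<w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">'
    "<w:body><w:p><w:r><w:t>Keyword research</w:t></w:r></w:p>"
    "<w:p><w:r><w:t>for Word files</w:t></w:r></w:p></w:body>"
    "</w:wordDocument>"
)
print(text_from_word_xml(sample))
```

Joining runs with a space is a simplification: Word sometimes splits a single word across several runs, so a real document may need smarter joining.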

Otherwise, on a Windows platform, you can write a simple program or script that opens the file through the Microsoft Word COM object, extracts the text, and then parses it.
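A minimal sketch of the COM route in Python, assuming Windows, an installed copy of Word, and the third-party pywin32 package (none of which the post specifies). The function names are illustrative. Word's text comes back with paragraphs separated by carriage returns, so a small helper splits on those.

```python
def split_paragraphs(text):
    """Split Word's text on its paragraph marks and drop empty pieces."""
    return [p.strip() for p in text.split("\r") if p.strip()]


def extract_text_via_word(path):
    """Open a document through Word's COM interface and return its text.

    Assumption: runs on Windows with Word and pywin32 installed.
    """
    import win32com.client  # imported lazily so the helper above works anywhere

    word = win32com.client.Dispatch("Word.Application")
    word.Visible = False
    try:
        doc = word.Documents.Open(path)  # an absolute path is safest
        try:
            return doc.Content.Text
        finally:
            doc.Close(False)  # False: do not save changes
    finally:
        word.Quit()
```

Typical use would be something like `split_paragraphs(extract_text_via_word(r"C:\docs\report.doc"))`, where the path is only a placeholder.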

Luck!

Woz

6:35 am on Aug 26, 2003 (gmt 0)

Senior Member

joined:Aug 13, 2000
posts:4823
votes: 0


If you are simply after single-word density, you could copy the text into NoteTab and then use its Statistics function. Other than that, I am not much help, I am afraid.

Onya
Woz
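For anyone who would rather script the single-word density count than paste into an editor, here is a small Python sketch. It is not how NoteTab computes its statistics; it simply counts each word and reports its share of the total. The function name `word_density` and the sample sentence are invented for the example.

```python
import re
from collections import Counter


def word_density(text, top=5):
    """Return (word, count, percent of all words) for the most common words."""
    # Treat runs of letters and apostrophes as words, ignoring case.
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    counts = Counter(words)
    return [
        (word, count, round(100.0 * count / total, 1))
        for word, count in counts.most_common(top)
    ]


sample = "Widgets are great. Blue widgets are the best widgets."
for word, count, percent in word_density(sample, top=2):
    print(word, count, percent)
```

A real keyword report would probably also filter out stop words such as "are" and "the", which this sketch leaves in.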

6:33 pm on Aug 26, 2003 (gmt 0)

Full Member

10+ Year Member

joined:May 16, 2002
posts:223
votes: 0


A crude but effective approach would be to convert Word to HTML and then use one of the HTML tools available.
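If no HTML keyword tool is handy, the saved HTML can also be reduced to plain text with a few lines of Python and then analysed like any other text. This is only a sketch using the standard library's `html.parser`; the class and function names are hypothetical, and it assumes the Word document has already been saved as a web page.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text that is outside script/style blocks.
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Return the visible text of an HTML string as one line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


page = (
    "<html><head><style>p{color:red}</style></head>"
    "<body><p>Blue <b>widgets</b></p></body></html>"
)
print(html_to_text(page))
```

Skipping the style blocks matters here because HTML saved from Word tends to carry a lot of embedded styling that would otherwise pollute the word counts.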