


Extracting Keywords from Word documents

6:52 am on Aug 25, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:May 28, 2002
votes: 0

There are plenty of tools available for keyword analysis of HTML pages.

Does anyone know of a good way to analyse and extract keywords from Word documents?


6:14 am on Aug 26, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:June 19, 2003
votes: 0

If you would like to work online with a non-Microsoft tool, first export the file to XML and then proceed from there.
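As a rough illustration of the "proceed from there" step, here is a minimal Python sketch that pulls the plain text out of an exported XML file using only the standard library. The function name `text_from_xml` and the sample markup are made up for the example; a real Word XML export uses its own element names and namespaces, but collecting every text node works regardless of the tag names.

```python
import xml.etree.ElementTree as ET


def text_from_xml(xml_string):
    """Return every text node in the XML, joined by single spaces."""
    root = ET.fromstring(xml_string)
    # itertext() walks all elements and yields their text and tail content,
    # so the element names used by the export do not matter here.
    pieces = (piece.strip() for piece in root.itertext())
    return " ".join(piece for piece in pieces if piece)


# Hypothetical sample standing in for an exported document.
sample = "<doc><p>Keyword <b>analysis</b> of documents</p><p>Second paragraph</p></doc>"
print(text_from_xml(sample))
```

The resulting string can then be fed to whatever keyword tool or script you prefer.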

Otherwise, on a Windows platform, you can write a small program or script that opens the file through the Microsoft Word COM object, extracts the text, and then parses it.



6:35 am on Aug 26, 2003 (gmt 0)

Senior Member

woz (WebmasterWorld Top Contributor of All Time)

joined:Aug 13, 2000
votes: 0

If you are simply after single-word density, you could copy the text into NoteTab and use its Statistics function. Other than that, I am afraid I am not much help.
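For anyone who would rather script it, single-word density is easy to approximate in a few lines of Python. This is only a sketch, not what NoteTab does internally: the function name `word_density` is invented here, and the rule for what counts as a "word" (runs of letters, digits and apostrophes, case-insensitive) is an assumption you may want to change.

```python
import re
from collections import Counter


def word_density(text, top=10):
    """Return (word, count, percent of all words) for the most frequent words."""
    # Assumption: a word is a run of letters, digits or apostrophes.
    words = re.findall(r"[a-z0-9']+", text.lower())
    total = len(words)
    if total == 0:
        return []
    counts = Counter(words)
    return [(word, count, 100.0 * count / total)
            for word, count in counts.most_common(top)]


for word, count, percent in word_density("the cat and the hat", top=3):
    print(f"{word}: {count} ({percent:.1f}%)")
```

Paste the text copied out of Word into a string or read it from a saved .txt file. No stop-word filtering is done, so common words such as "the" will top the list unless you remove them first.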


6:33 pm on Aug 26, 2003 (gmt 0)

Full Member

10+ Year Member

joined:May 16, 2002
votes: 0

A crude but effective approach would be to convert the Word document to HTML and then use one of the available HTML tools.
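If no HTML tool is to hand, the saved HTML can also be reduced to plain text with Python's standard `html.parser` module before any keyword counting. The sketch below is one possible way to do it; the class and function names (`TextExtractor`, `html_to_text`) are made up for the example, and it assumes you only want visible text, so anything inside `<style>` or `<script>` is skipped (Word's HTML export tends to carry a lot of embedded style rules).

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring the contents of <style> and <script>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0  # greater than 0 while inside style/script

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Return the visible text of an HTML string, joined by single spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


page = ("<html><head><style>p{color:red}</style></head>"
        "<body><p>Word <b>documents</b> online</p></body></html>")
print(html_to_text(page))
```

The output is plain text that can be counted or passed to any other analysis step.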