Keyword Building - extracting duplicates?

Question about how to get rid of duplicate words in a large text file.

         

Jozsef_Poor

1:27 am on Jun 27, 2003 (gmt 0)

10+ Year Member



Hello all,

I recently used a meta extractor and ended up with all the keywords used on John Lewis's website. The problem is that I now have all the keywords, but each one is repeated at least a hundred times after grabbing nearly 500,000 links. Is there a tool that would keep only a single copy of each keyword or phrase within a huge text file? I have been doing Ctrl-H with the Replace All option, but this seems naff!
There must be a better way. Can you please help me out here?

Cheers,
Joe

Robino

1:32 am on Jun 27, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



500,000 seriously?

Jozsef_Poor

1:59 am on Jun 27, 2003 (gmt 0)

10+ Year Member



I know :o(
It's a drag, and there are 600 more merchants to process.
J

Robino

2:28 am on Jun 27, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



That smells! I would break them down into smaller tables and start chipping away there.

Jozsef_Poor

6:50 am on Jun 27, 2003 (gmt 0)

10+ Year Member



Smells? Oh yeah! You bet.
I'm working on an Intel PIV, but these tasks are bringing the whole thing to its knees. I cannot believe that no one has had this problem before, i.e. filtering out multiple occurrences of words/phrases. There must be a way of doing this the 'human way'! C'mon someone, pls come forward with that life saver tool before I go totally potty...
J

Jozsef_Poor

7:09 am on Jun 27, 2003 (gmt 0)

10+ Year Member



By the way, for single words/keywords there's been one rather good (commercial) solution here: [softexe.com...]
It's a nag that one cannot do phrases though... :o(
J
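
For anyone landing on this thread with the same problem, here is a minimal sketch of one way to keep a single copy of each keyword or phrase, in Python. It is not the tool linked above. It assumes the extracted keywords sit in a plain text file, either one entry per line or comma-separated as in a typical meta keywords tag, and that "sofa" and "Sofa" should count as the same entry. The function names and the file names in the usage note are made up for illustration.

```python
def unique_entries(lines, sep=","):
    """Yield each keyword or phrase once, in first-seen order.

    Assumption: entries are separated by `sep` and/or line breaks,
    and matching is case-insensitive.
    """
    seen = set()
    for line in lines:
        for entry in line.split(sep):
            # Collapse runs of whitespace so "garden  furniture"
            # matches "garden furniture"; this also strips newlines.
            cleaned = " ".join(entry.split())
            key = cleaned.lower()
            if cleaned and key not in seen:
                seen.add(key)
                yield cleaned


def dedupe_file(src_path, dst_path, sep=","):
    """Read src_path line by line and write one unique entry per line."""
    with open(src_path, encoding="utf-8", errors="replace") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for entry in unique_entries(src, sep):
            dst.write(entry + "\n")
```

Calling dedupe_file("keywords.txt", "keywords_unique.txt") would write the cleaned list to a new file. The input is read one line at a time and only the unique entries are held in memory, so the size of the file matters less than the number of distinct keywords. Phrases are handled the same way as single words, because each entry is compared as a whole.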