| 11:18 pm on Apr 18, 2006 (gmt 0)|
Matching against the DMOZ index would be a Q&D way to do it, but you should download the index rather than query it. I don't think they like automated queries.
A true and original categorization would earn you a doctorate at a prestigious university; a perfect taxonomy is the holy grail of AI.
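For what it's worth, the quick-and-dirty DMOZ match could look something like the sketch below. It assumes the downloaded dump has already been reduced to a plain mapping of category path to descriptive text; that mapping, the sample paths, and the function names `tokens` and `best_category` are made up for illustration and are not part of any DMOZ tooling. The scoring is plain word overlap, nothing smarter.

```python
import re

def tokens(text):
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def best_category(keyword, categories):
    """Return the category path sharing the most words with the keyword.

    `categories` is a hypothetical dict of category path -> descriptive text,
    prepared beforehand from a downloaded copy of the directory.
    Returns None when no category shares any word with the keyword.
    """
    kw = tokens(keyword)
    best, best_score = None, 0
    for path, text in categories.items():
        score = len(kw & (tokens(path) | tokens(text)))
        if score > best_score:
            best, best_score = path, score
    return best

# Illustrative sample data only
sample = {
    "Top/Recreation/Travel/Lodging": "hotels hostels inns",
    "Top/Shopping/Pets/Dogs": "dog food toys",
}
print(best_category("cheap hotels", sample))
```

Keywords that match nothing come back as None, so they can be set aside for a human to sort by hand.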
| 11:45 pm on Apr 18, 2006 (gmt 0)|
Well, a massively parallel database system with an attached logic computer, a.k.a. a human, needs years to learn a language, let alone categorize all the words in it. Language is largely a model system made up of logical and illogical rules, and it rests heavily on empirical experience.
Unless you have massive database resources and endless rulesets, I think this holy grail will stay a da Vinci Code for a while.
Of course you can build massive models, but they might only deliver a partially pragmatic solution and not an analytical one, which is obviously what is wanted in this case.
Hence DMOZ or any human created tree is, I guess, the way to go. ;)
| 1:02 am on Apr 19, 2006 (gmt 0)|
You are obviously building a scraper, as keywords form the foundation of a typical scraper.
| 2:28 am on Apr 19, 2006 (gmt 0)|
Keywords form the foundation of many legitimate endeavors; I don't think it's obvious that scraping is the goal here. And I don't think it's necessarily a bad thing to try to build a profitable site for AdSense by categorizing keywords. It's bad only if no value is added and the keywords are used to generate traffic illegitimately.
| 3:07 am on Apr 19, 2006 (gmt 0)|
|It's bad only if no value is added... |
How many people who use lists of keywords as the foundations of their Web sites are "adding value"? One in a hundred? One in a thousand?
I'd guess that most keyword-driven, made-for-AdSense sites have about as much "added value" as datafeed affiliate sites do.
| 4:09 am on Apr 19, 2006 (gmt 0)|
|How many people who use lists of keywords as the foundations of their Web sites are "adding value"? One in a hundred? One in a thousand? |
I don't think you know or are qualified to judge. Why are you so self-righteously indignant? Google, for one, extensively uses lists of keywords as the foundation of its business(es), and occasionally it does so without adding value.
I don't question your motives EFV, perhaps we should look at that...
| 5:26 am on Apr 19, 2006 (gmt 0)|
|Why are you so self-righteously indignant? |
I'm not "self-righteously indignant," I'm simply being realistic. And the Google comparison makes no sense at all.
| 6:15 am on Apr 19, 2006 (gmt 0)|
|And the Google comparison makes no sense at all. |
Unless you think about it. Many people, myself included, share Google's mission of organizing all knowledge.
Well, if you can't make that connection I won't belabor the point but it seems entirely sensible to me and has served me well. I use keywords all the time in an effort to organize and discover things. Perhaps you've just never thought of this. You should try it.
| 6:56 pm on Apr 19, 2006 (gmt 0)|
I'm not certain if it will take that many entries, but when I want to categorize a list of keywords, I use Mark Horrell's keyword density analyzer and find out what the most frequent 5-10 words are, then I go back to Excel and filter the list by each of those... maybe not fast enough for 50,000 kws tho.
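For a list too big to filter by hand, the same idea can be scripted. This is only a rough sketch of the approach described above, not the analyzer tool or Excel itself: it counts how many keyword phrases each word appears in, takes the top few words, and then "filters" the list by each one. The function names, the `stopwords` parameter, and the "(other)" bucket are my own assumptions, and it matches whole words where an Excel "contains" filter would match substrings.

```python
from collections import Counter

def top_words(keywords, n=10, stopwords=frozenset()):
    """Return the n words that occur in the most keyword phrases."""
    counts = Counter()
    for phrase in keywords:
        # Count each word once per phrase, skipping stopwords
        counts.update(set(phrase.lower().split()) - stopwords)
    return [word for word, _ in counts.most_common(n)]

def group_by_top_words(keywords, n=10, stopwords=frozenset()):
    """Bucket phrases under each frequent word they contain.

    A phrase may land in several buckets, just as it would show up
    under several filters; phrases matching none go to "(other)".
    """
    heads = top_words(keywords, n, stopwords)
    groups = {word: [] for word in heads}
    groups["(other)"] = []
    for phrase in keywords:
        words = set(phrase.lower().split())
        hits = [word for word in heads if word in words]
        if hits:
            for word in hits:
                groups[word].append(phrase)
        else:
            groups["(other)"].append(phrase)
    return groups

# Illustrative sample list only
sample = ["paris hotels", "paris flights", "paris museums",
          "cheap flights", "dog food"]
for word, phrases in group_by_top_words(sample, n=2).items():
    print(word, phrases)
```

One pass to count and one pass to bucket, so 50,000 keywords should not be a problem. Passing a stopword set such as {"the", "for", "in"} keeps filler words from taking up the top slots.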