We need to cross-reference 2000 categories, covering the full spectrum of business categories that are usually found in the Yellow Pages. That's a lot of manual work.
How would you do it?
We've decided to combine our existing manual taxonomy with bespoke software which analyses content from websites in each area (through search engine APIs, not by spidering the actual sites). The algorithm seems to work quite well, but some anomalies pop up, most of which make sense once you think about them a bit more deeply. We still rank all manual matches above automated ones: the automation is used to order the manual matches, and then to add any missing categories whose score reaches a threshold.
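To make the ranking rule concrete, here is a rough sketch of how the merge might look in Python. It is not our actual code: the function name `rank_matches`, the score scale of 0 to 1, the default threshold of 0.5 and the sample categories are all made up for illustration, and it assumes the automated scores have already been computed elsewhere.

```python
def rank_matches(manual, scores, threshold=0.5):
    """Merge manual and automated category matches for one category.

    manual    -- list of manually assigned related categories
    scores    -- dict mapping category name to automated score
                 (hypothetical 0..1 scale)
    threshold -- minimum score for an automated-only match to be added
                 (hypothetical default)
    """
    # Manual matches always come first; the automated score is only used
    # to order them. A manual match with no score is treated as 0.0.
    manual_ranked = sorted(manual, key=lambda c: scores.get(c, 0.0), reverse=True)

    # Automated-only matches are appended afterwards, but only if they
    # reach the threshold score.
    manual_set = set(manual)
    automated = sorted(
        (c for c, s in scores.items() if c not in manual_set and s >= threshold),
        key=lambda c: scores[c],
        reverse=True,
    )
    return manual_ranked + automated


# Illustrative data only.
manual = ["Plumbers", "Heating Engineers"]
scores = {
    "Heating Engineers": 0.9,
    "Plumbers": 0.7,
    "Bathroom Fitters": 0.8,
    "Florists": 0.1,
}
print(rank_matches(manual, scores))
```

Note that "Bathroom Fitters" scores higher than "Plumbers" here but still ends up below it, because manual matches are never outranked by automated ones.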
[edited by: Webwork at 2:34 am (utc) on Feb. 11, 2007] [edit reason] Charter [webmasterworld.com] [/edit]