jmcc, it's a formidable task, but one that you're familiar with. Why not try a city, like London, to start off with? The UK is a big country, which makes the task bigger; London would be a good 'proof of concept'.
The first issue I see is knowing where to spider, and the wasted effort of crawling sites only to discard the 'non-UK' ones. I guess you would want to start off with a seed list of sites and follow the links a couple of levels deep; that should give you a good proportion of related sites.
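
To make the seed-list idea concrete, here is a rough sketch in Python of a breadth-first crawl that stops a couple of levels out. Everything named here is my own invention for illustration: `crawl`, `looks_uk` and the `get_links` callback are hypothetical, and the ".uk hostname" test is only a crude stand-in for whatever 'UK site' rule you settle on (plenty of UK sites live on .com). Fetching is left to the caller, so the sketch only covers the traversal.

```python
from collections import deque
from urllib.parse import urlparse


def looks_uk(url):
    """Crude placeholder (an assumption): treat a hostname ending in .uk as UK."""
    host = urlparse(url).hostname or ""
    return host.endswith(".uk")


def crawl(seeds, get_links, max_depth=2, keep=looks_uk):
    """Breadth-first walk from the seed URLs, at most max_depth links deep.

    get_links(url) is supplied by the caller and returns the outgoing
    links of a page (for example by fetching and parsing it).
    Returns the visited URLs that pass the keep() test, in visit order.
    """
    seen = set()
    queue = deque()
    for url in seeds:
        if url not in seen:
            seen.add(url)
            queue.append((url, 0))

    kept = []
    while queue:
        url, depth = queue.popleft()
        if keep(url):
            kept.append(url)
        if depth >= max_depth:
            continue  # deep enough: record the page but do not follow its links
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return kept
```

Note that this version still follows links out of pages that fail the test, so it does pay the cost of visiting sites it later discards. Expanding only the pages that pass would cut that waste, at the risk of missing UK sites that are only reachable through a non-UK page.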