There are some very good leads here for me to follow up on (some of which I'm delving into now, and I'm finding just how deep the rabbit hole goes). Thanks to everyone who has posted.
But there are still issues to cover...
It seems to me that any counter-measure can be circumvented by someone with enough resources (and enough hunger for the data), so I'm also going to put more thought into offering webmaster features that reduce the desire to scrape the data in the first place.
Our problems are similar in nature to (though by no means on the scale of) those Google has to deal with all the time - I can imagine people wanting to scrape the data mainly to see whether they appear in it (much as SERPs were/are targeted by rank-monitoring tools).
Does anyone have experience of offering (reduced) data through an API or tool that has had a material effect on scraper volume? My guess is that the smarter people will take the API route, but plenty of (probably easy-to-block) bots will still be let loose (by idiots) on sites where they could have got what they wanted legitimately.
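To make that concrete, here's a rough sketch of the sort of keyed, rate-limited endpoint I have in mind - the framework (Flask), the endpoint path, the key list and the daily quota are all placeholder assumptions for illustration, not anything already built:

    # Minimal sketch: hand out API keys, serve a reduced payload, and cap daily usage
    import time
    from flask import Flask, request, jsonify, abort

    app = Flask(__name__)

    API_KEYS = {"demo-key-123"}   # hypothetical issued keys
    DAILY_LIMIT = 1000            # hypothetical per-key quota
    usage = {}                    # key -> (day, request count)

    @app.route("/api/search")
    def search():
        key = request.args.get("key", "")
        if key not in API_KEYS:
            abort(401)            # no key, no data
        day = time.strftime("%Y-%m-%d")
        last_day, count = usage.get(key, (day, 0))
        if last_day != day:
            count = 0             # new day, reset the counter
        if count >= DAILY_LIMIT:
            abort(429)            # over quota: point them at a bigger plan instead
        usage[key] = (day, count + 1)
        q = request.args.get("q", "")
        # return only the reduced, non-editorial slice of the data
        return jsonify({"query": q, "results": []})  # placeholder payload

    if __name__ == "__main__":
        app.run()

The idea being that a legitimate consumer gets the data cheaply through the front door, and the quota gives them a clear upgrade path rather than a reason to turn to scraping.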
An important point is that the data on the site will not be editorial in nature - it consists of short textual snippets plus information that can be inferred from the way it is presented (e.g. if 20 items match a query and those 20 fall into 2 different categories, 2 rows will be shown that group the items by category - even though the categories themselves might not be shown or disclosed in any way).
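Purely for illustration (the item and category names below are invented), the grouping works something like this:

    # Toy sketch of the grouping: 20 items from 2 categories collapse to 2 rows,
    # and the category label itself is never emitted in the output
    from itertools import groupby

    items = [{"name": "Item %d" % i, "category": "A" if i < 12 else "B"} for i in range(20)]
    items.sort(key=lambda it: it["category"])   # groupby needs sorted input

    rows = []
    for _, group in groupby(items, key=lambda it: it["category"]):
        group = list(group)
        rows.append({"count": len(group), "items": [it["name"] for it in group]})

    print(len(rows))  # 2 rows shown for the 20 matching items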
Am I going to be fighting an endless battle with bots, or is there a way to satisfy those who have the ability (albeit at a higher cost to themselves) to circumvent the counter-measures?