Based on my analysis of the number of sites using jQuery and AngularJS, the current reign of dumb scrapers is about to end.
Some applications that actually display lots of data on your screen look like this to your average bot:
```html
<html lang="en" ng-app="phonecatApp">
<title>Google Phone Gallery</title>
<link rel="stylesheet" href="css/app.css">
<link rel="stylesheet" href="css/bootstrap.css">
<link rel="stylesheet" href="css/animations.css">
<div ng-view class="view-frame"></div>
```
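To see why that's a dead end, here's a quick sketch (using Python's standard-library HTML parser on the markup above) of what a fetch-and-parse scraper actually extracts from a page like this:

```python
from html.parser import HTMLParser

# The markup a dumb bot sees when it fetches an Angular app
# without executing any JavaScript.
RAW = """
<html lang="en" ng-app="phonecatApp">
<title>Google Phone Gallery</title>
<link rel="stylesheet" href="css/app.css">
<link rel="stylesheet" href="css/bootstrap.css">
<link rel="stylesheet" href="css/animations.css">
<div ng-view class="view-frame"></div>
"""

class TextCollector(HTMLParser):
    """Collect every non-whitespace text node in the document."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())

parser = TextCollector()
parser.feed(RAW)
print(parser.chunks)  # -> ['Google Phone Gallery']
```

All the actual data lives behind that empty `ng-view` div, rendered client-side after the JavaScript runs, so the parse yields nothing but the page title.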
What this means is that necessity is going to push scrapers, or already has pushed them, toward tools like PhantomJS that actually execute the page's JavaScript.
About the only clues website owners will have left are:
* Is the bot using the default user agent string?
* Is the bot hosted in a data center IP range?
* Is the bot hosted on Linux? (Linux will be less of a clue as more end users dump Windows.)
* The speed or duration of page requests to the site.
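As a rough sketch of how a site owner might combine these clues, here is a hypothetical scoring function; the user agent hints, the data center CIDR range, and the rate threshold are all illustrative assumptions, not a real blocklist:

```python
import ipaddress

# Illustrative values only: real deployments would maintain curated
# lists of default UA strings and data center IP ranges.
DEFAULT_UA_HINTS = ("PhantomJS", "Python-urllib", "curl", "wget")
DATACENTER_RANGES = [ipaddress.ip_network("203.0.113.0/24")]  # example range

def bot_clues(user_agent, ip, requests_per_minute):
    """Return the list of scraper clues a given request pattern trips."""
    clues = []
    if any(hint.lower() in user_agent.lower() for hint in DEFAULT_UA_HINTS):
        clues.append("default user agent")
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in DATACENTER_RANGES):
        clues.append("data center IP")
    if requests_per_minute > 60:  # arbitrary threshold for a "greedy" scraper
        clues.append("greedy request rate")
    return clues

print(bot_clues("Mozilla/5.0 PhantomJS/1.9", "203.0.113.7", 120))
# -> ['default user agent', 'data center IP', 'greedy request rate']
```

A careful scraper that spoofs a desktop browser's user agent and paces its requests would trip only the IP range check, which is the point of the prediction below.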
I'm predicting that soon the data center IP range will be about the only clue left, unless it's a greedy high-speed scraper. So take advantage of the low-hanging fruit while it lasts: the evolving web is going to force scrapers to evolve to meet the challenges of current website technology.