It's not that easy, as a visit to the spider forums will show. He could use a robots.txt file or noindex/nocache attributes, but those are no guarantee.
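For anyone wondering why robots.txt is no guarantee, here is a rough sketch of the check a well-behaved spider runs before fetching a page. It uses Python's standard `urllib.robotparser`; the robots.txt contents, the `ExampleBot` name, the example.com URLs, and the `is_allowed` helper are all made up for illustration. The point is that the check happens on the crawler's side, so a spider that skips it is not stopped by anything.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a publisher might serve to opt out of crawling
# for one section of a site (illustrative only).
ROBOTS_TXT = """\
User-agent: *
Disallow: /articles/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    This is the voluntary check a compliant crawler performs; nothing
    on the server enforces the answer.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A polite spider asks first and honors the answer.
blocked = is_allowed(ROBOTS_TXT, "ExampleBot", "https://example.com/articles/1")
open_page = is_allowed(ROBOTS_TXT, "ExampleBot", "https://example.com/about")
```

A scraper that never calls anything like `is_allowed` simply requests the page anyway, which is why the file works as a request rather than a lock.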
Isn't part of the issue here the old argument about opt-in vs. opt-out? The search engines have established a de facto system that requires publishers to opt out of their index, but early on there were good arguments that publishers should have to opt in instead.
Cast in those terms, the ramifications hardly change everything, but they could certainly change the landscape in terms of who is best able to monetize published material: the publishers themselves or the parasites.