turbocharged - 12:16 pm on Apr 27, 2013 (gmt 0)
Thanks for the .htaccess tips, TOI. I've included all of them in the .htaccess file, with the exception of the custom 403. Outside of the many Appspot and Blogspot domains, there are far too many full or partial copies of this client's site floating around on WordPress, Tumblr, and other free blogging platforms to do custom redirects for each one.
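For the Appspot proxies specifically, one thing that may be worth testing: App Engine's URLFetch service identifies itself with an "AppEngine-Google" User-Agent string, so proxy fetches can often be refused at the server before they ever see a page. A sketch only, not verified against this client's setup:

```apache
# Hypothetical sketch: App Engine URLFetch sends a User-Agent containing
# "AppEngine-Google", so deny those requests outright with a 403.
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} AppEngine-Google [NC]
RewriteRule .* - [F,L]
```

This won't touch the blog-platform copies (those are pasted, not proxied), but it can starve the live Appspot mirrors.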
By the middle of next week we should have all the DMCA notices sent out to the remaining blogging platforms. At this stage I would like to focus on prevention before I get back into the design work I was originally hired to do.
The client's site is verified through Google+ as a publisher, with a linked G+ icon on the homepage, and is linked/verified in his WMT account. That does not appear to help prove authenticity/ownership, since that markup is copied too and displayed on the Appspot proxies and on the full-page copies across many different free blogging sites. The Appspot pages are still ranking, and the client's site is still stuck in the omitted results.
Does Google read image metadata (EXIF/IPTC)? Would this be a viable way to help an algorithm, which otherwise has no knowledge of who originated the content, determine ownership? Yes, image metadata can be changed, but I doubt most scrapers would devote much if any time to the task. And if they do, I'd like to consume as much of their time in the process as possible. It also may make it easier to substantiate DMCA takedowns when a digitally signed image is discovered on partial page copies.
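On the embedding idea: EXIF/IPTC copyright fields are normally written with a tool like exiftool, but the general technique of baking an ownership string into the image file itself can be sketched with nothing but the standard library by inserting a JPEG COM (comment) segment after the SOI marker. This is a simpler slot than EXIF/IPTC proper, used here purely to illustrate the mechanics; the function names are my own:

```python
import struct

def add_jpeg_comment(jpeg_bytes, comment):
    """Insert a COM (0xFFFE) segment right after the SOI marker.

    Illustrative sketch only: a real workflow would write
    IPTC:CopyrightNotice / EXIF Copyright with a dedicated tool.
    """
    if jpeg_bytes[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG (missing SOI marker)")
    payload = comment.encode("utf-8")
    # The two-byte segment length includes the length bytes themselves.
    segment = b"\xff\xfe" + struct.pack(">H", len(payload) + 2) + payload
    return jpeg_bytes[:2] + segment + jpeg_bytes[2:]

def read_jpeg_comment(jpeg_bytes):
    """Return the first COM segment's text, or None.

    Minimal reader: it walks length-prefixed segments from the front of
    the file, which is enough to find a comment inserted as above.
    """
    i = 2
    while i + 4 <= len(jpeg_bytes) and jpeg_bytes[i] == 0xFF:
        marker = jpeg_bytes[i + 1]
        length = struct.unpack(">H", jpeg_bytes[i + 2:i + 4])[0]
        if marker == 0xFE:
            return jpeg_bytes[i + 4:i + 2 + length].decode("utf-8")
        i += 2 + length
    return None
```

A scraper stripping this out has to reprocess every image, which is exactly the time sink described above; and an intact comment on a copied page is one more artifact to point at in a DMCA notice.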