When it comes to content theft, I find search engines very helpful. I'm a copywriter and I also write under my own byline, so I regularly check the web to see 1) if anyone has ripped off my bylined work, 2) if anyone has ripped off copy I wrote for a client, and 3) if a client's existing web copy, which he wants rewritten and updated, really belongs to him (because I don't deal in stolen goods). I just google unique 7 to 10 word phrases from the text and see what comes up.
Caveat: in some industries there is plenty of information in the public domain that businesses use to beef up their sites for SEO. For example, an HVAC business may use EPA reports on their site. So sometimes you'll get multiple hits on key phrases and it will look like everybody is ripping everybody else off - check to make sure it really is someone's intellectual property and not in the public domain before you cry foul.
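For anyone who wants to semi-automate the phrase check described above, here is a rough Python sketch. It only picks out 7 to 10 word phrases and wraps them in quotes for pasting into a search engine; it does not run any searches. The function name candidate_phrases, the choice to take words from the middle of each sentence, and the limit of five phrases are all my own assumptions, not anything from this thread.

```python
import re

def candidate_phrases(text, min_words=7, max_words=10, limit=5):
    """Return up to `limit` quoted phrases of min_words..max_words words each."""
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    phrases = []
    for sentence in sentences:
        words = re.findall(r"[\w'-]+", sentence)
        if len(words) < min_words:
            continue  # too short to be distinctive
        # Assumption: the middle of a sentence is less likely to be
        # a stock opening or closing, so take the chunk from there.
        size = min(len(words), max_words)
        start = (len(words) - size) // 2
        phrases.append('"' + " ".join(words[start:start + size]) + '"')
        if len(phrases) == limit:
            break
    return phrases
```

You would still want to eyeball the phrases it picks: anything lifted from a public-domain report will produce hits that are not theft, per the caveat above.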
Sorry, don't know how you'd check for theft of images.
Unless the thief is dumb enough to hotlink the image, the only way to find stolen images is to spider the entire web. I recall hearing about an image spider that checked for images with a hidden watermark, but AFAIK that's a paid service. Usually, it makes most sense to let the search engines do what they do best (spider as much of the web as they can) and use them to spot ripoffs.
I doubt if running your own spider would be practical unless you expected the theft to occur on a specific set of sites.
You could use the major search engines to generate a list of sites based on keywords/topics, then spider them for images. That's a lot of effort, though.
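As a rough illustration of the "spider them for images" step, here is a minimal Python sketch that pulls image URLs out of a page's HTML once you have fetched it. Fetching the pages and building the site list are left out. The names ImageCollector, image_urls and hotlinked are made up for this example, and my_host is a placeholder for your own image server.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class ImageCollector(HTMLParser):
    """Collects the absolute URL of every <img src=...> on a page."""
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                # Resolve relative paths against the page's own URL.
                self.images.append(urljoin(self.base_url, src))

def image_urls(html, base_url):
    collector = ImageCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.images

def hotlinked(urls, my_host):
    """Image URLs that still point at your own server (the easy case)."""
    return [u for u in urls if urlparse(u).hostname == my_host]
```

This only gets you a list of image URLs per page. Deciding whether a downloaded image is actually a copy of yours is the hard part and is not covered here.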
Visibly watermarking your images might be one way to discourage theft if you feel they are highly likely to be stolen.
Watermarking is something I'm already doing for my clients.
If searching for key phrases is the best way to do things then isn't there a significant gap in how many web sites you check compared to how many are actually out there?
Also, how would you search for images if the filename, alt tag, and file size have all been modified?
Just some things to consider. Is there a service that can do this leg work for you?
Certainly spidering a limited number of sites based on keyword lists would leave most of the web unchecked. Spidering the entire web takes a lot of horsepower, though. And if a thief changes all the characteristics that you mention,
You might be better off not putting the images on the web if theft is such an issue. Image vendors like Getty address the problem by showing visitors only small, heavily watermarked versions of their images. (I've seen website template vendors use this same approach, since showing a full-size, high quality example would allow visitors to copy it.) I think you are focused on the wrong approach by assuming you'll be able to find the thieves after they copy them. If you search Google et al for "image protection" you'll find plenty of possible approaches, none of which is really foolproof.
I'm not looking for a "fix". I've done all the heavy watermarking and image manipulation I can do without putting up red X's.
I understand that the images, if available, will be copied. I understand "image protection". I'm concerned about finding the criminals.
I'm curious as to whether or not there is even a foolproof way to find thieves, not a foolproof way to prevent theft (image or copy).
Anybody know the maximum data string length (max chars) accepted by, for example, Google? I would have thought that it ought to be possible to open your image in hex and then cut and paste a representative piece of your hex code, surrounded by "x", into a "hex-speaking engine" somewhere ... if it finds a match for a string of, say, 200 chars, that is likely to be a ripoff of yours.
Come to think of it I might have an idea there ...instantly copyrights idea ...
Gonna have to do some serious programming tonight, or timewasting, depending on how you look at these things ..: )
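The hex-string idea can at least be tried locally. Here is a minimal Python sketch, assuming you already have the suspect file downloaded; it does not search the web. The function names hex_fingerprint and contains_fingerprint are hypothetical, and taking the chunk from the middle of the file is my own guess at what "representative" might mean. The 200-character length is the figure suggested above.

```python
def hex_fingerprint(data, length=200, offset=None):
    """Return `length` hex characters from the file's bytes.

    By default the chunk is taken from the middle of the file
    (an assumption; headers at the start are often shared between files).
    """
    n_bytes = length // 2  # two hex characters per byte
    if offset is None:
        offset = max(0, (len(data) - n_bytes) // 2)
    return data[offset:offset + n_bytes].hex()

def contains_fingerprint(candidate, fingerprint):
    """True if the candidate file's bytes contain the fingerprint chunk."""
    return bytes.fromhex(fingerprint) in candidate

# Usage with hypothetical file names:
# with open("mine.jpg", "rb") as f:
#     fp = hex_fingerprint(f.read())
# with open("suspect.jpg", "rb") as f:
#     print(contains_fingerprint(f.read(), fp))
```

One limitation: this is an exact byte match, so it only catches straight copies of the file. If the image has been resized, recompressed, or saved in another format, the bytes change and the match fails.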