Here's a story you may find interesting. I discovered in my logs some referrals from eGoto dot com, and when I backtracked to the page, I found they had completely copied my page and put it on their site, with a small disclaimer at the bottom saying it was a cached version.
I was not pleased, since as far as I'm concerned this is a violation of my copyright. I emailed them (with some difficulty) and got a reply saying they had removed the one instance of the copied page, but also warning that their spider would grab my page again anyway the next time it crawls newly updated dmoz info.
The guy suggested I could prevent this by blocking "EgotoBot/4.8" in my robots.txt, or with a meta noindex on each page.
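For anyone curious, the block he's describing would presumably look like this in robots.txt (the "EgotoBot" token is what he quoted; whether their spider actually matches on that exact string is their claim, not something I've verified):

```
User-agent: EgotoBot
Disallow: /
```

And the per-page alternative is a meta tag in each page's head:

```html
<meta name="robots" content="noindex">
```

Note the standard "robots" meta applies to all well-behaved crawlers, not just theirs, so the robots.txt route would be the more targeted of the two.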
Is it my responsibility to prevent them from violating my copyright? I don't think so, but is there another point of view? It's somewhat analogous to Google's cache, I suppose, but that's probably stretching the definition of "caching" way too far.