I understand your point, tedster, about s3 forum's robots.txt, but should Google even be indexing this appspot URL? If you search for site:example.appspot.com (where "example" stands for the specific URL that is showing up here), you will find many sites being cloned AND INDEXED by Google.
It even indexes Twitter pages, including the Twitter page of "the superbowl proposal SEO guru".
Try this URL: example.appspot.com/twitter.com/TWITTERHANDLE (replace "example" and "TWITTERHANDLE" with the real values).
Google must surely be able to identify content cloned from Twitter when even the original URL is kept as-is inside the clone's URL.
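
To show how little work that would take, here is a rough Python sketch. It assumes the clone always puts the original hostname as the first path segment, which is what the URL pattern above suggests but which I have not checked for every page. The function name original_url is made up for this example, and it ignores query strings.

```python
from urllib.parse import urlsplit


def original_url(mirror_url, scheme="https"):
    """Guess the source URL from a clone URL shaped like
    <clone-host>/<original-host>/<original-path> (assumed layout)."""
    # urlsplit only recognises the host if the string has a "//" prefix.
    if "//" not in mirror_url:
        mirror_url = "//" + mirror_url
    path = urlsplit(mirror_url).path.lstrip("/")
    host, _, rest = path.partition("/")
    # Only treat the first path segment as a hostname if it looks like one.
    if "." not in host:
        return None
    return f"{scheme}://{host}/{rest}"


print(original_url("example.appspot.com/twitter.com/TWITTERHANDLE"))
```

If the first path segment does not look like a hostname, the function returns None, so ordinary pages on the appspot site itself would not be flagged.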