Well, of course they can see it. The question was whether they will index it.
it's INDEXED! :|
[postimg.org...]
But would it be ranked? Maybe it is indexed but shows up only when you query for this exact text in conjunction with a site: search.
I have seen numerous cases where content is indexed, in the sense that Google shows it when you search for it alongside the site: command (or sometimes only after you click on "repeat this search with omitted results included"), yet it is nowhere to be seen if you try to rank for it.
So this text is effectively demoted. The main question is: does the existence of this repeated text harm the page or the site?
If it is just being ignored and only shown when searching for the text in quotes using the site: command, then all should be fine. But if it weighs against the page, and having lots of these pages weighs against the site as a whole, then one would want a way to keep it from being shown to Google.
As for cloaking... wouldn't blocking ANYTHING in robots.txt be in fact cloaking?
Ideally, OP should make sure that there is enough of their own unique content on the page to outweigh the content copied from "major websites".
<edit>Fixed quote</edit> [edited by: aakk9999 at 11:28 pm (utc) on Mar 18, 2015]