Here's some more to the thought...
Your work is under copyright protection the moment it is created and fixed in a tangible form that it is perceptible either directly or with the aid of a machine or device.
When is my work protected? [copyright.gov]
For more info on the subject I found this chapter informative:
Copyright Law Chapter 2 [copyright.gov]
If I discover Page A contains 'this content' and 'this content' is not already in my system, I can reasonably conclude Page A is the copyright holder (originator) of the content in the absence of direct, contradictory information.
If I later discover Page B contains 'this content' I can comply with the DMCA and copyright law by treating the secondary discovery as 'apparently infringing' and not including it in my results (and definitely not promoting it within my system), because I am proactively attempting to not profit from or promote what is reasonably (to me) apparently infringing work.
If I later discover Page C also contains 'this content' I can treat it in the same manner as Page B for the same reasons.
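To make the first-discovery rule above concrete, here is a minimal sketch in Python. It only illustrates the bookkeeping and is not a statement of what the DMCA requires. Everything named here (`fingerprint`, `ContentRecord`, `Index`, `discover`) is hypothetical, and exact matching on a hash of normalized text is an assumption standing in for whatever duplicate detection a real system would use.

```python
from dataclasses import dataclass, field
from datetime import datetime
import hashlib


def fingerprint(content: str) -> str:
    """Hypothetical content key: SHA-256 of lowercased, whitespace-normalized text."""
    normalized = " ".join(content.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


@dataclass
class ContentRecord:
    presumed_originator: str  # URL where the content was first discovered
    first_seen: datetime      # discovery date of that first page
    apparently_infringing: list = field(default_factory=list)  # later duplicates


class Index:
    """Assumed in-memory index keyed by content fingerprint."""

    def __init__(self):
        self.records = {}

    def discover(self, url: str, content: str, seen_at: datetime) -> bool:
        """Return True if the page is included in results, False if it is withheld."""
        key = fingerprint(content)
        record = self.records.get(key)
        if record is None:
            # First discovery: presume this page is the originator.
            self.records[key] = ContentRecord(url, seen_at)
            return True
        if url == record.presumed_originator:
            # Re-crawl of the presumed originator: still included.
            return True
        # Any later page with the same content is treated as apparently infringing.
        if url not in record.apparently_infringing:
            record.apparently_infringing.append(url)
        return False
```

With this sketch, discovering Page A, then Page B, then Page C with the same content includes Page A and withholds B and C, which is the behavior described above.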
What I do not see is how I could discover 'this content' on Page A originally, then replace Page A with Page B without any direct, contradictory information regarding copyright ownership being supplied to me, and still claim I am not trying to profit from what should, to the best of my knowledge, be considered apparently infringing content.
If Page A contains 'this content' and the content is freely available, then I have done no one any harm or wrong by treating other pages containing 'this content' as apparently infringing. Even though the content is freely available and there is no infringement, I have provided my visitors with the resource and the information they were seeking, and they really do not need it in triplicate.
If Page A contains 'this content' and the content is copyrighted by the owner of Page A, then I have correctly treated it as the Copyright Holder's work and treated the other 'duplicates' (apparently infringing pages) correctly by proactively removing access to them to the best of my ability.
If Page A contains 'this content' and the owner of Page B is the true copyright holder of 'this content', then I have done what I should according to the DMCA to the best of my knowledge in the absence of contradictory information. Once direct, contradictory information is provided and I am made aware that Page B's owner is the true copyright holder, I can replace Page A with Page B. I will then have done exactly what the DMCA says I should when content is 'apparently infringing', and Page B's owner has no recourse, because I did as I should and was an unwilling participant in any type of infringement.
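The replacement step in that last case can be sketched the same way. This is only an illustration under stated assumptions: `ContentRecord`, `apply_ownership_claim`, and the `has_direct_evidence` flag are hypothetical names, and how such evidence would actually be received and verified is outside the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class ContentRecord:
    presumed_originator: str                    # page currently shown in results
    withheld: set = field(default_factory=set)  # pages treated as apparently infringing


def apply_ownership_claim(record: ContentRecord, claimant_url: str,
                          has_direct_evidence: bool) -> bool:
    """Swap the presumed originator only on direct, contradictory information.

    Returns True if the swap happened. A page's perceived importance is
    deliberately not an input to this function.
    """
    if not has_direct_evidence:
        # Without direct information, the first-discovered page stays.
        return False
    if claimant_url == record.presumed_originator:
        # Nothing to change: the claimant is already the presumed originator.
        return False
    previous = record.presumed_originator
    record.withheld.discard(claimant_url)  # the claimant is no longer withheld
    record.withheld.add(previous)          # the earlier page is now withheld
    record.presumed_originator = claimant_url
    return True
```

For example, with Page A as the presumed originator and Pages B and C withheld, a claim from Page B without evidence changes nothing, while the same claim with evidence puts Page B in the results and withholds Pages A and C.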
What I cannot figure out is how I could remove Page A, where 'this content' was initially discovered, and replace it with the later-discovered Page B, which also contains 'this content', without direct, contradictory information stating that Page B's owner is the true owner of the copyright to 'this content', simply because in my sole opinion Page B is 'more important' to me or my visitors.
How can the algorithmically (heuristically) determined importance of a page (site) possibly determine the origination of the content on the page when the discovery date of 'this content' on each page directly contradicts the perceived importance of the pages?
To say Page B should replace Page A because Page B is 'more trusted', 'more popular', or 'more expected' to be seen by me or my visitors does not negate or change the fact that 'this content' was originally discovered on Page A, not Page B. The owner of Page B has not provided any direct, contradictory information to outweigh the discovery date of the content on each page. According to copyright law, the original creation of a work designates the copyright holder, not the 'algorithmically (heuristically) perceived importance of the site or page' to me or my visitors. The DMCA basically says I must proactively remove or disable access to apparently infringing work to the best of my ability to qualify for protection.
So, how could anyone possibly remove the page originally discovered containing 'this content' (Page A), replace it with Page B, and not be promoting and profiting from what is, to the best of their knowledge, apparently infringing content?
And why would anyone remove the page originally discovered containing 'this content' (Page A) and replace it with Page B if there was nothing for them to gain (profit from) by promoting what should reasonably be determined to be, to the best of their knowledge, apparently infringing content?
I can't think of a good answer to either of those two questions, except there is a profit of some type from the promotion of what could (should IMO) reasonably and rationally be determined to be apparently infringing content...