How does your duplicate content on those pages affect your site's G ranking?
We don't know, we just hope G doesn't ever get it wrong.
I've used G alerts since they began, but I've never seen an archive.org url cited in an alert... until today.
It's not an exact match of the phrase, just parts of it. Nevertheless, the point is that it is an [archive.org...] url, and so must be incorporated in the Google index at some level.
Deliberate or a slip up?
Another very good reason to get your sites pulled from archive.org as we have done.
20+ sites, but it wasn't too laborious a process.
Completed in 3-4 days. Gone from archive.org, wayback, and the dreaded alexa :)
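For anyone wanting to check their own setup: the post above doesn't say how the sites were pulled, so what follows is only an assumption. One exclusion route the Internet Archive documented in that era was a robots.txt block on the ia_archiver crawler (the Alexa bot that fed the Wayback Machine). A minimal Python sketch, with a made-up robots.txt and example.com as placeholders, that tests whether a given robots.txt shuts that crawler out:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only; not taken from any
# poster's site. It blocks ia_archiver and leaves other crawlers alone.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow:
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt forbids user_agent from fetching url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network fetch is needed.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://www.example.com/page.html"  # placeholder URL
    for agent in ("ia_archiver", "Googlebot"):
        state = "blocked" if is_blocked(ROBOTS_TXT, agent, page) else "allowed"
        print(f"{agent}: {state}")
```

This only checks what the file says. Whether a crawler honours it, and whether existing archived copies are dropped, is up to the operator.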
The URLs can still get URL-only listings if people link to them, but I've not seen anything else.
Was it a URL-only listing you had in Google alerts? (Note that sometimes such listings have a title of the link text pointing to them.)
The only other possibility is if archive.org are accidentally exposing their listings.
www.archive.org/stream/<rest of url>
The G serp shows a normal listing: title, url and a snippet.
The target page in www.archive.org is cached in G, and the cache shows the G Alert terms.
The G cache url is formatted like this:
[22.214.171.124...]<rest of url>
It looks to me like trouble at Google, and for everyone still unfortunate enough to be in archive.org.