That said, I have other reprinted articles that also appear on my own site, and I haven't had any problems with them being removed for duplicate content, unless some of the reprints were removed for that reason (but quite a few are indexed).
Could you explain to me what is considered as "near duplicate content"? One of my client's editors stupidly copied 4-5 pages of content from an American website, and it seems the site has been filtered out in this update. Is this fatal? Or is there something I can tell them to do to get back into Google? Thanks.
While working on a domain, I put up an affiliate program's full page ad, with a very slight amount of customization.
When I do a cache view on that page, I get ANOTHER site's cache of the page, not mine. My page is greybarred now.
Apparently it isn't just 100% duplication on your own sites, but near duplicates of anything else out there.
Anyone else see this?
Alex
I wouldn't chalk that up to duplicate content filters just yet. During update periods, it is quite common for Google to show a different page for the cache. It is also quite common for the toolbar to not function properly during an update.
Could you explain to me what is considered as "near duplicate content"?
Near duplicate content is duplicate content that has been altered somewhat in order to hide the fact that it is duplicate.
True duplicate content is a naturally occurring phenomenon: multiple domains set up, mirror sites for bandwidth issues, etc.
With these types of situations, Google's goal is not to penalize people, it is simply to make sure that two exact copies of the same page don't show up next to each other in a SERP. Typically, in the past, this has been handled by "the highest PR wins."
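To make the "highest PR wins" idea concrete, here is a rough sketch of how such a filter could work on a list of results. This is only an illustration, not Google's actual code: the `content_hash`, `pagerank`, and `url` fields and the `filter_duplicates` function are hypothetical names made up for the example.

```python
def filter_duplicates(results):
    """Keep one page per identical-content group: the one with the highest PR.

    `results` is a list of dicts with hypothetical keys:
    'url', 'content_hash' (same value for exact copies), and 'pagerank'.
    """
    best = {}
    for page in results:
        key = page["content_hash"]
        # First page seen for this content, or a higher-PR copy of it.
        if key not in best or page["pagerank"] > best[key]["pagerank"]:
            best[key] = page
    # Return the survivors ordered by PR, highest first.
    return sorted(best.values(), key=lambda p: p["pagerank"], reverse=True)


results = [
    {"url": "http://example.com/a", "content_hash": "abc", "pagerank": 5},
    {"url": "http://example.org/mirror-a", "content_hash": "abc", "pagerank": 3},
    {"url": "http://example.net/b", "content_hash": "xyz", "pagerank": 4},
]
for page in filter_duplicates(results):
    print(page["url"], page["pagerank"])
```

Note that nothing is penalized here: the lower-PR copy simply doesn't appear next to the higher-PR one.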
Penalties (as in an entire site getting a PR0) are a different story. These usually involve sites that have intentionally duplicated content across multiple domains to try to gain an advantage in the SERPs. These pages are usually slightly altered in order to convince Google they are different.
[edited by: WebGuerrilla at 8:27 pm (utc) on May 13, 2003]
It is also impossible for Google to compare every page directly to every other page. They must have an algorithm set up to compare pages on similar sites.
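Nobody outside Google knows what that algorithm looks like, but a common textbook way to score how close two "slightly altered" pages are is word shingling plus Jaccard similarity. The sketch below is an assumption for illustration only, not Google's method; the function names and the shingle size `k=3` are arbitrary choices. It only compares one pair of pages, so a real system would still need some way to narrow down which pairs to check.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Too short for a full window: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity(text_a, text_b, k=3):
    """Score from 0.0 (no shared shingles) to 1.0 (identical shingle sets)."""
    return jaccard(shingles(text_a, k), shingles(text_b, k))


original = "the quick brown fox jumps over the lazy dog"
reworded = "the quick brown fox leaps over the lazy dog"
print(similarity(original, original))
print(similarity(original, reworded))
```

Changing a single word knocks out every shingle that contains it, so light rewording lowers the score but rarely takes it to zero, which is roughly what "near duplicate" means in this thread.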
What I don't know is if the PR from those sites is passed on before they get filtered.
That's terrible... I had a competitor steal 197 pages of content from my site and change the words around a bit, and in this last update they shot up in the rankings. I called them about 10 times, and every time I call and ask to speak with the domain name owner I get "oh he's not here right now, sorry".
About the guy rewording your content... Umm, I'm sure no one here has ever done that before <grin>
Also, if I had stolen 197 pages from someone, I'd be certain to say the domain owner wasn't there either, so chances are you were talking to him/her, or the person answering the phone was instructed to screen calls.
It's not a lot of fun to do, but putting it in writing is a good way to get people's attention.
Alex
The big issue here is that the penalty is not a PR0, but a gray bar (removed from the Google index).