Forum Moderators: Robert Charlton & goodroi
I'm gonna go back to an earlier post here:
Elixir: Duplicate content has become a major headache for us. I am referring to stolen content. I spend hours each week sending out cease-and-desists. When there is a major issue, such as another SEO company stealing entire sections of the site, our rankings plummet. We resolve the issue and our rankings come back. I find the SEOs who steal the content and reproduce it on their site, while talking about their ethical approach, highly offensive. Has anybody ever sued over duplicate content? I wonder if there is a legal case to sue somebody and make it a high-profile case, to try to deter companies from stealing content as a shortcut. I am not talking about scrapers either, although that happens too; I am talking about entire sections of content stolen, with the name of the company deliberately changed. The greatest frustration of all is that when these unscrupulous thieves steal our content, our rankings plummet.
The DMCA procedure takes too long. Has anybody had any experience of reporting the site to the plagiarist's ISP?
At least you're coming back in. This filter is really bad on infringement. Maybe sometime soon someone at G will wake up to issues where there is a registered copyright, use a realistic dating system for the content, and/or loosen the filter on sites that complain and file DMCA complaints.
Google does not do a proper job of identifying the original content, nor of protecting sites from "infringement ranking attacks".
I am afraid you will only become disheartened by any and all approaches to dealing with infringed content, especially if the infringements keep happening.
I also believe this may be a collaborative effort in some cases to purposely take out sites, since Google now makes it a bit easier.
I have found contacting ISPs to be the biggest waste of time. The only thing that works is filing the DMCA and getting the content removed. But you lose time doing these, as they are time-consuming and sometimes incomplete.
[edited by: tedster at 11:16 pm (utc) on Dec. 1, 2006]
What site is using extreme duplicate content and NOT being penalized? Do a search for <keywords removed>.
I know the people in this thread are just trying to help, but isn't the guy writing SEO books for dummies a more reliable source?
So, is Google broke or not being totally honest?
Before anyone goes and spends the time to start changing their site, why not do what the SEO who wrote the book (SEO for Dummies) does? He doesn't seem to think there is any duplicate content penalty.
<Sorry, no specific searches.
See Forum Charter [webmasterworld.com]>
[edited by: tedster at 5:59 pm (utc) on Dec. 11, 2006]
But in practice, there can be strong negative repercussions to a site's rankings from duplicate URLs for the same content. Strictly speaking it is not a Google-imposed penalty, but it can still feel pretty awful.
For instance (and this is the more minor situation), when backlinks point to different URLs for the same content, PR and backlink influence is "split into different piles" instead of focused on one URL.
But what can be even worse is when a site uses some technically unsound configuration that allows an essentially infinite number of URLs for some bit of content. That situation can come from a "custom error page" that does not return a 404 response code, for instance, or from using the URL to track user behavior in some way.
No, you do not get a "true" penalty in this situation, but you sure may find that googlebot doesn't get around to spidering all your real content. Or that you have so many URLs with weak PR that your site: search results are nearly solid Supplemental Results.
So I'm with you in one way -- it's not a true penalty. There's no black mark in the Google book against your site. But judging by the end result for your traffic, it might as well be a penalty, especially in the most exaggerated "duplicate URL" conditions.
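To make the "split into different piles" problem concrete, here is a minimal Python sketch that collapses a few common duplicate-URL variants into one canonical form. The normalization rules and the example.com URLs are illustrative assumptions, not Google's actual canonicalization logic.

```python
# Sketch: many distinct URLs can point at the same page. Collapsing
# them to one canonical form shows how backlink credit gets split
# across variants. Rules below are assumptions for illustration only.
from urllib.parse import urlsplit, urlunsplit

def canonicalize(url: str) -> str:
    """Collapse common duplicate-URL variants to one form."""
    scheme, netloc, path, query, _ = urlsplit(url.lower())
    if netloc.startswith("www."):            # fold www / non-www
        netloc = netloc[4:]
    if path.endswith("/index.html"):         # fold directory index page
        path = path[: -len("index.html")]
    if path == "":
        path = "/"
    return urlunsplit((scheme, netloc, path, "", ""))  # drop tracking query

variants = [
    "http://www.example.com/widgets/index.html",
    "http://example.com/widgets/",
    "http://Example.com/widgets/?sessionid=123",
]
print({canonicalize(u) for u in variants})
# → {'http://example.com/widgets/'}
```

Three different inbound-link targets, one piece of content; unless the duplicates 301 to the canonical form, whatever link credit exists is divided among them.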
I think one area of "duplicate content" which has not been emphasised enough in this thread is the amount of "similar content" that can appear throughout a site. It can cause pages to be inadequately indexed: Google knows about them, but they may not be considered good enough to be served.
In our journey to get rid of supps, 5 of our sites have been returning good basic results for over 10 weeks. However, these pages are too heavy on content and will need to be reduced. They are really basic pages, but at least they work! The first 1000 pages in the site: tool are all nice and neat.
But our other "fancy", highly functional site page templates, which use a lot less content, are not. Content was compromised for functionality in that strategy, which is proving to be a disadvantage. Here the site: tool suggests all the pages have been noticed, but the listing is littered with supps, and the pages we thought should show are not being served, or are served only sporadically.
If anyone's fixed the meta titles, descriptions, architecture, etc. and is still seeing this, I'd suggest they look into their content and the theming/architecture of the pages.
If the pages are perceived by G to be too similar, G is still able to throw them out as supps or not show them at all.
It seems G is more easily confused by these pages, and less forgiving of them, than I thought.
[edited by: Whitey at 3:04 am (utc) on Dec. 19, 2006]
Several things can happen from the same basic problem.
Then there are effects due to any possible errors in the logic used to present the SERPs, such as ordering stability.
For a fun trip through things programmatic, look up stable vs. unstable sorts.
With that I'm going to finish my ice cream and watch some tube.
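As a side note on that sort tangent: Python's built-in sorted() happens to be a stable sort, so a short sketch can show what stability buys you. The page names and scores below are invented for illustration; they are not real ranking data.

```python
# Stable sort demo: items that compare equal keep their original
# relative order. An unstable sort gives no such guarantee, which is
# one way equally-scored results could shuffle between runs.
results = [("pageA", 5), ("pageB", 7), ("pageC", 5), ("pageD", 7)]

# Sort by score, highest first. Ties (B/D at 7, A/C at 5) keep the
# order they had in the input because Python's sort is stable.
by_score = sorted(results, key=lambda r: r[1], reverse=True)
print(by_score)
# → [('pageB', 7), ('pageD', 7), ('pageA', 5), ('pageC', 5)]
```

With an unstable sort, either tie could come out in either order on any given run, which is exactly the kind of ordering wobble the post above is gesturing at.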
For a fun trip through things programmatic, look up stable vs. unstable sorts.
Bear - it was no fun 'cos I have little clue about programming. But I can see where you're coming from. Can you open this up a little? I'm keen to see what you think about where the similarities between the problems are. It might be revealing.
Strictly speaking, we're moving towards a subtle differentiation away from "duplicate content" to "similar content", broadly known as "supplemental content", but with the common G response of a "filter" on undesired content outside the logical path of the algorithm, i.e. wrong theme and similar content.
The fact that this appears to be treated the same by the filter is interesting, and I think you're on the cusp of better explaining how to deal with it. I think it will also reflect on aspects of the "on page" duplicate content problem [IMO].
btw - we had some strange results which I think came from poor theming on our part [ [webmasterworld.com...] - I think I'm answering my own question ] - which I hope we can fix quickly. In this case, the pages, which were irrelevant, were served [I'm not proud of it either].
I'm hoping that you'll be back on the boards soon when you've finished your ice cream!
Adam Lasnik on Duplicate Content [webmasterworld.com]
Also try some of the new topics raised here at Google Webmaster Central:
Dealing with Duplicate Content [googlewebmastercentral.blogspot.com]
- Block appropriately.
- Use 301s.
- Be consistent.
- Minimize boilerplate repetition. (Wow - tens of thousands of sites are in this trap.)
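As one hedged illustration of the "Block appropriately" point above: a robots.txt Disallow rule can keep crawlers out of a duplicate (say, printer-friendly) copy of a page. The paths and rules here are invented for the example; Python's standard robotparser just evaluates the rules locally and performs no network access when fed lines directly.

```python
# "Block appropriately": a Disallow rule for a directory that holds
# duplicate printer-friendly copies of pages. Paths are hypothetical.
from urllib.robotparser import RobotFileParser

rules = """
User-agent: *
Disallow: /print/
""".strip().splitlines()

rp = RobotFileParser()
rp.parse(rules)          # evaluate the rules locally; no HTTP fetch

print(rp.can_fetch("*", "http://example.com/widgets/"))        # True
print(rp.can_fetch("*", "http://example.com/print/widgets/"))  # False
```

The original page stays crawlable while the duplicate copy is blocked, which is one way to keep both versions from competing in the index.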
Thanks, Tedster, for drawing our attention to this, and to Adam's inputs as well. A must-read thread for all.
i.e. when you have too many of the same keywords/phrases repeated.
This may give an indication of the tolerances of Google's algo with regard to overall dupe content when linking to other sites. It's here in this thread:
Keyword Density [webmasterworld.com]
There's also a mention here of repetitious [duplicate] anchor text.
I've been seeing a lot of changes in the past few months that look like "heavy keyword use in anchor text" is a target.
Again, I think it's a variation on the same theme.
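For what it's worth, the rough idea of "keyword density" from the posts above can be sketched as the share of a page's words consumed by one phrase. This is only an illustration; it is not Google's actual measure, and the example text is invented.

```python
# Rough keyword-density sketch: what fraction of the page's words
# belong to occurrences of one target phrase. Purely illustrative.
import re

def keyword_density(text: str, phrase: str) -> float:
    """Fraction of total words consumed by occurrences of `phrase`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    phrase_words = phrase.lower().split()
    n = len(phrase_words)
    # Count phrase matches at every word position.
    hits = sum(
        words[i : i + n] == phrase_words
        for i in range(len(words) - n + 1)
    )
    return hits * n / len(words) if words else 0.0

text = "cheap widgets here: cheap widgets, cheap widgets and more cheap widgets"
density = keyword_density(text, "cheap widgets")
print(f"{density:.0%}")
# → 73%
```

A page like the example, where the same two-word phrase accounts for most of the words on it, is the sort of repetition the posts above suggest the filter may treat much like on-page duplicate content.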