Forum Moderators: Robert Charlton & goodroi
One of my biggest competitors cloned my index page onto two of his domains that previously had original content and have a lot of incoming links and PR4 (CLONE1 and CLONE2). My site (ORIGINAL) has had more or less unchanged content since the moment it was cloned. The ORIGINAL site has fewer inbound links but is also PR4.
Now his Dupe site CLONE1 has taken my previous SERP position for main keywords, and my index page cannot be found for half of the top keywords it ranked for.
Now here is where it gets CONFUSING!
- for some keywords both My Original domain and CLONE1 show up.
- if you check the Cached on Google SERPS page you see "Google's cache of CLONE1.com" BUT
+ if you enter CLONE1 and in Google Toolbar select [ Cached Snapshot of Page ] My site ORIGINAL is shown and the text reads "Google's cache of ORIGINAL" and I can see the CLONE1 url at the top and it is all on the same Google DC / IP ...
- It looks like Google does recognize that the original content comes from my site, but it has still replaced many SERP results with CLONE1
- if I search for the domain name CLONE1 dotCom or CLONE2 dotCom in Google my site comes up #1! followed by the CLONE at position #2 .
Now what should I do?!
1. Should I CHANGE my content to get back into Google SERPs and "decouple" from the CLONE sites?
2. Should I keep the content the same and try to persuade Google to consider me the ORIGINAL by adding more links to my site?
3. Would it do any good writing to Google (I pretty much expect a canned response from them)
... has anybody had similar experience - any advice for me ?
I have never bothered with any other SEs. Too much of a faff for the small amount of traffic they send to the affected site.
For some KeyWords my site has returned to SERPs, but at a lower position with my previous spot taken by the clone... My site has had minor changes, so I don't know if that is the cause or the extra links to the site I have added.
Still the question remains:
- Should I risk it and keep content unchanged and let Google decide in my favor. (and penalize the CLONE)
- Or should I totally change the site's content to play it safe and give a free "gift of content" to my competition which outranks me with my content :(
what are the chances of Google banning them now?
(they are online with the CLONE site for about 3 weeks now)
Google DMCA [google.com]
If it is your content, and the clone sites have indeed made use of it, G' will remove them from the SERPs/Index.
This thread is only one of the many examples:
[webmasterworld.com...]
I can report that in MY case the Gods of Google decided that My Original content was NOT original, and have now changed the CLONE1 and CLONE2 sites with their just slightly edited copy of my page to "ORIGINALS". Google Toolbar and SERP Cache both show [ This is Google's cache of CLONE ...] (before, while G was undecided, the Toolbar Cache for the CLONES showed [ This is Google's cache of ORIGINAL ...] and the SERP cache link showed the cache of CLONE).
The irony of this whole thing is that CLONE2 has Logo linking to MY index page and MY Affiliate links in there. CLONE1 has edited logo link out, but most other content is left unchanged including some Affiliate links with MY tracking code in them :)
>>> So what I did now is the only thing I could really do and changed MY PAGE :(
YES, I could have written DMCA complaints, but with the clones on offshore servers, how much good would it do? The time it would take is not worth it.
the lesson for me (and whoever else is reading this) is:
-- if a site with potentially HIGHER PR and more incoming links copies your content and you don't want to lose your SERP spots - YOU HAVE TO CHANGE the ORIGINAL content you had... (sux big time - I know)
-- You can complain and try to resolve it later. Place the ORIGINAL in a robots-forbidden folder, etc. etc.
-- if you DO NOT change, you can end up like me, losing some SERP positions to the clone (my page has completely disappeared from serps for some keywords / replaced by clone), just because they outranked you by a little bit of real PR (toolbar PR is the same)
It goes against logic, but after the losses I had during the hottest time of the year, if something like this repeats I don't think I will trust Google to sort it out and "identify the original".
All they identified was who has more incoming links and higher rank.
I will post here if after the changed page is indexed it returns to SERPs and how it does compared to CLONEs
If, for example, they are sales sites operating from a criminal services host then it's likely that visitors will lose money. If it's likely to be repeat business then it's possible you will lose business because they won't trust YOU as well as the clones.
Just thinking. :(
Just another reason for Google to release a <reg> tag that registers unique content with Google Webmaster Tools, and keeps copies of the original domain owner, even in scenarios where the page itself was not kept in the index because of lack of link juice, etc.
While I too wish there were something like a <reg> tag, I think Google feels that there are many possible loopholes... and it's something that could be easily taken advantage of. Not all webmasters would know about it nor have WMT, e.g.
Also, there's lots of material that's not on the web which spammers using this tag might effectively claim as original... in a sense complicating copyright.
That said, I'm in no way happy with the current situation.
To: dstiles
They use the page to display some contextual ads and advertise with POP-ups. I don't think they can do too much damage with that. They do not collect Credit Card info or anything like that.
Just to sum up my experience:
- Google got it WRONG when determining which site had the ORIGINAL content.
- As soon as MY SITE got new content and it was indexed it returned to serps, so Google had placed a duplicate penalty on the ORIGINAL site and promoted the CLONE to its spot.
(I thought they had a DB like Archive.org to check the original but apparently their algo is "smarter" - NOT!)
>>> Google DOES EVIL on Christmas !
>>> If in doubt if Your site (PRx) or Clone (PRx) will win in serps => CHANGE Your Site
Personally, I would have left my home page looking the same as it always had and would have gone the DMCA route to resolve the problem. You might have gotten your page back in the SERPs, but the clone is still there using YOUR content to beat you.
There is no such thing as a "duplicate content penalty". Penalties prevent pages from ranking well... but duplicate content OFTEN ranks on page 1... even position 1. Duplicate content FREQUENTLY outranks original content. They pick the copy (original or duplicate) that based on various signals appears to rank better for the keyword phrase, and typically eliminate all other copies (whether duplicate or original) from the SERPs.
Which do you think Google is going to show in the SERPs? An original post from a blog that ONLY has internal linking from pages on that same site pointing to it? or a duplicate copy of the post on another blog that has hundreds of external inbound links or possibly a few inbound links from highly authoritative sites?
Google does not have ESP. They can't know for sure what is original and what is a copy. They have to go on signals that they see from across the web.
Pages flagged as duplicate content do rank... They are just a little harder to get to rank than the original copy.
As soon as MY SITE got new content and it was indexed it returned to serps, so Google had placed a duplicate penalty on the ORIGINAL site and promoted the CLONE to its spot.
I'm afraid that this will keep happening to your site. You're taking evasive actions: the aggressor wins. File that DMCA with Google and keep adding worthwhile *new* content to your site.
if you check the Cached on Google SERPS page you see "Google's cache of CLONE1.com" BUT + if you enter CLONE1 and in Google Toolbar select [ Cached Snapshot of Page ] My site ORIGINAL is shown and the text reads "Google's cache of ORIGINAL" and I can see the CLONE1 url at the top and it is all on the same Google DC / IP ...
I'd implement a NoArchive directive across the entire site. Force Google to crawl your site and not rely on cache. That should thwart the cloning strategies. Just a guess. ;)
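For anyone wanting to try the NoArchive idea, here is a minimal sketch in Python. It assumes you can post-process your HTML before serving it; the helper name `add_noarchive` and the sample page are made up for illustration. The `noarchive` robots meta value and the `X-Robots-Tag` header are the standard ways to ask search engines not to show a cached copy. Whether that actually thwarts a cloner is, as said, just a guess.

```python
import re

# Standard robots meta tag asking search engines not to offer a cached copy.
NOARCHIVE_META = '<meta name="robots" content="noarchive">'

# Equivalent HTTP response header, useful for non-HTML files such as PDFs.
NOARCHIVE_HEADER = ("X-Robots-Tag", "noarchive")


def add_noarchive(html: str) -> str:
    """Insert the noarchive meta tag right after the opening <head> tag.

    Hypothetical helper: returns the page unchanged if it already
    mentions noarchive or has no <head> tag.
    """
    if "noarchive" in html.lower():
        return html
    # Matches <head> or <head ...attrs> (but not <header>), first occurrence only.
    return re.sub(
        r"(<head\b[^>]*>)",
        lambda m: m.group(1) + "\n" + NOARCHIVE_META,
        html,
        count=1,
        flags=re.IGNORECASE,
    )


page = "<html><head><title>ORIGINAL</title></head><body>...</body></html>"
print(add_noarchive(page))
```

The same directive can be sent as a header from the web server instead, which avoids touching the templates at all.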
Personally I feel Google (or any SE) Cache is the arch nemesis of SEO.
I'd also block IA (Archive.org) from keeping historical copies of the site.
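A rough sketch of the Archive.org block, assuming the archive's crawler still identifies itself as `ia_archiver` and honors robots.txt (that was the historically documented behavior; treat it as an assumption today). The Python part only checks that the rules parse the way you expect, using the standard library's `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser

# Candidate robots.txt: shut out the archive crawler, leave everyone else alone.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Sanity check before uploading the file to the site root.
print(parser.can_fetch("ia_archiver", "/index.html"))  # False
print(parser.can_fetch("Googlebot", "/index.html"))    # True
```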
Just another reason for Google to release a <reg> tag that registers unique content with Google Webmaster Tools, and keeps copies of the original domain owner, even in scenarios where the page itself was not kept in the index because of lack of link juice, etc.
I suggested to MC a couple of years ago that the first site using SiteMaps to register a page be considered the original owner and outrank any copies considering that page didn't occur anywhere else.
My logic for making sure competitors didn't steal the page and rank first was to use a simple concept that you only publish the page AFTER the Googlebot crawled it, so in that way it was secure and only known to your site and sitemaps until after the crawl.
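The "publish only AFTER Googlebot crawled it" idea could be sketched roughly like this in Python. Everything here is an assumption for illustration: the server writes Combined Log Format, the unpublished URL is reachable only through the sitemap, and the function name, sample path, and sample log line are made up. Note that a user-agent string can be spoofed, so a real gate would also want to verify the crawler's IP.

```python
import re

# Combined Log Format:
# ip - - [time] "METHOD path HTTP/x" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) .*"(?P<agent>[^"]*)"$'
)


def crawled_by_googlebot(log_lines, path):
    """Return True if some log line shows a Googlebot user agent
    fetching `path` with a 200 response."""
    for line in log_lines:
        m = LOG_LINE.search(line.strip())
        if (
            m
            and m.group("path") == path
            and m.group("status") == "200"
            and "Googlebot" in m.group("agent")
        ):
            return True
    return False


# Made-up sample log line (documentation IP range, not a real crawler address).
sample_log = [
    '192.0.2.10 - - [26/Dec/2009:10:00:00 +0000] "GET /new-page.html HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

if crawled_by_googlebot(sample_log, "/new-page.html"):
    print("crawled: safe to link the page from the site and the RSS feed")
else:
    print("not crawled yet: keep the page unlinked")
```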
Wouldn't Archive.org be an easy way to prove whose content was original and not cloned?
Not really because they don't crawl fast enough and the cloned content could easily end up in the archive before the actual content.
I've blocked them for 4 years now, good riddance.
I suggested to MC a couple of years ago that the first site using SiteMaps to register a page be considered the original owner and outrank any copies considering that page didn't occur anywhere else.
It might be a good idea... if all owners knew about it and used Google WMT. Otherwise, as I noted above regarding the suggestion of a Google <reg> tag, you'd have a situation that would be complicating copyright.
Additionally, in large organizations or on sites with lots of fresh content (newspapers, eg), it's perhaps not practical to update SiteMaps as fast as new content is added. Everyone would have to develop CMS modules to update SiteMaps at the moment of publishing to the web.
[edited by: Robert_Charlton at 9:23 pm (utc) on Dec. 26, 2009]
>>> Google DOES EVIL on Christmas !
I am a little confused here. The action you should take - file a DMCA with Google - was suggested on December 11. Now it's December 26th and you still haven't filed that DMCA complaint and are instead complaining about Google ruining your Christmas.
If you had filed your DMCA immediately the clone would have been removed already.
So file a DMCA complaint. At least if it's really your original content and you are not running a clone yourself and the clone is a clone of a clone.
Additionally, in large organizations or on sites with lots of fresh content (newspapers, eg), it's perhaps not practical to update SiteMaps as fast as new content is added. Everyone would have to develop CMS modules to update SiteMaps before publishing to the web.
Pinging sitemaps and waiting for the bot to respond as a simple closed loop is trivial and easy to implement, esp. in CMS systems.
The process wouldn't complicate copyright at all because copyright has nothing to do with first to publish and first to index, especially in this day of RSS feeds and licensed content.
For instance, the issue you're trying to stop is RSS feeds republishing the data prior to a crawl. That is how a more popular site ends up being credited with your content in the first place: they get crawled more often than you do, so their republishing of your RSS feed comes back to bite you. Deferring the new content's appearance in the feed until after the crawl may fix that.
Anyway, the OP really needs the DMCA route, and should perhaps try a canonical link in their version of the content in the short term.
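To illustrate the canonical link suggestion, a tiny Python sketch. The helper name and the example.com URL are placeholders. The thinking is that a cloner who copies the page source verbatim also copies a tag pointing back at the original; that only helps if the scraper leaves the head untouched, and search engines treat rel="canonical" as a hint rather than a command.

```python
from html import escape


def canonical_link(url: str) -> str:
    """Build a rel="canonical" tag for the page's <head>, pointing at the original URL."""
    return f'<link rel="canonical" href="{escape(url, quote=True)}">'


# Placeholder URL; use the absolute URL of your own page.
print(canonical_link("https://www.example.com/"))
# <link rel="canonical" href="https://www.example.com/">
```

Use an absolute URL including the domain, since a relative one would resolve to the clone's own domain once copied.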
Like their algo picking Clone content as the more important one and the one to display in SERPs. I would call it evil -- and I bet it happened to hundreds of webmasters on Christmas, like it happened to me during this month (they were undecided for quite some time)
DMCA - I am outside of the USA and the Clone sites are offshore too. I do not want to give the competitor my name and address plus contacts. Does anybody have any advice on how I can send a DMCA notice to Google without getting my name forwarded to my competitor? (and to keep the costs down - hiring a US lawyer is not an option for me)
Personally, I would have left my home page looking the same as it always had and would have gone the DMCA route to resolve the problem.
Copyright law is a federal law. Be prepared to spend 30-50k just to file your lawsuit in a federal court.
So as Matt Cutts told me in person at Pubcon a few years ago... "Just ignore them and focus on your site. Google is pretty good at determining the original vs the clone."
I believed him but I realize you seem to be an exception. Here is my advice. Find out how they are outranking you. Most likely some black hat tactics are being used. Document it and report them to the G spam team.
Good luck. I know how frustrating it must be for you. Very sorry to hear you are going through all this during the holidays.
Get a lawyer to do it.
But the DMCA is a two-edged sword. Claiming a copyright you don't own is as evil as publishing someone else's material -- and the penalties are as great. And that's the way it ought to be. Nobody can send a DMCA notice to YOU without becoming subject to your countercharge. (Which means, you have to know who to countercharge at ... and if you send a notice to someone else, THEY have to know how to countercharge you.)