Content theft -- how could this happen?

stormshield

12:35 pm on Jul 5, 2006 (gmt 0)

Hi all,
The other day I came across a post in which someone complained that someone else had stolen almost all of his articles. I checked and, indeed, the articles on that site were identical. So I thought, "Well, maybe he stole those articles, but they will be treated as duplicate content."

So I did a Google search for a snippet of text from one of the articles. And guess what? It turned out that the copies on the stealer's site are the only ones indexed.

Do you have any idea how this could happen?

stapel

12:40 pm on Jul 5, 2006 (gmt 0)

The Googlebot "found" the site with the allegedly stolen content first, or hasn't "found" the original site at all.

The Googlebot just follows links. It does not establish precedence nor determine authorship or ownership.

Eliz.

stormshield

1:03 pm on Jul 5, 2006 (gmt 0)

Thanks Eliz,

What can a webmaster do about such a situation? Is there any way to prove your ownership?

stapel

3:12 pm on Jul 5, 2006 (gmt 0)

A registered copyright is a good start, and many hosts will accept a copy of one's registration certificate as proof of ownership. (I have yet to encounter a site-scraper who went to the trouble and expense of registering, so there has never been a counter-filing.) Of course, a court of law would be the final arbiter of ownership, but one hopes that things never get that far.

Another avenue would be The Internet Archive, assuming one's site is listed. (You can submit your site to be spidered and included, but you should allow at least a year for it to show up, if it isn't listed already.)

If the case lands in court, then old copies of documents could be helpful. ("See? I was working on this two years before he even registered his domain name.")

Formatting has sometimes been helpful. I've had people who scraped my site and claimed ownership of my work. But, funny thing, the stuff they scraped from my site matched the formatting of my site, not theirs. And the stuff they scraped from Site A, Site B, Site C, etc, matched the formatting of those respective sites. Either there was a "vast right-wing conspiracy" to pick on the guy, or else he'd scraped from a dozen sites and hadn't bothered to reformat at all. (Contacting those other victims can be helpful, too, by the way.)

You can also hide your information in your pages. My pages have white backgrounds. I hide my copyright notice in white text, so it's invisible. My graphics have transparent backgrounds with my copyright notice in white, so that also is invisible. You'd be amazed how many site-scrapers don't even notice that my copyright notice is right there, bigger than life, on their non-white-background pages.
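For anyone who wants to check a suspect page for a hidden text notice like this, here is a minimal sketch in Python. It only covers a notice that exists as text in the HTML, not one drawn into a graphic. The sample page, the notice wording, and the function names are made up for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect every text node in an HTML document, visible or not."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def contains_notice(html, notice):
    """Return True if the notice text appears anywhere in the page text.

    Whitespace and letter case are normalised first, so line breaks or
    extra spaces introduced by the scraper do not hide a match.
    """
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    page_text = " ".join(" ".join(parser.chunks).split()).lower()
    needle = " ".join(notice.split()).lower()
    return needle in page_text


# Hypothetical scraped page: dark background, notice still in white text.
scraped_page = (
    '<body bgcolor="#003366"><p>Article text.</p>'
    '<p style="color:#ffffff">Copyright  2006\nExample Author</p></body>'
)

print(contains_notice(scraped_page, "Copyright 2006 Example Author"))
```

A match is only a hint that the page was copied wholesale. It is not proof of ownership in the way a registration would be.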

But registering your copyright is probably the most-secure method of "proof".

Eliz.

econtent

2:55 am on Jul 7, 2006 (gmt 0)

I'm sorry that happened to you. Is the "offender" in the same country or a different one? You can:
1) Politely contact them and tell them you own the content. It could be that they hired someone to write content for them, and the writer stole the content. If this is the case, they should be more than willing to take the content down.
2) Send a simple cease and desist letter stating that you own the content and they are violating your copyright.
3) Copyright your information.
In any case, I would "rework" your content until the situation is resolved so you don't lose your ranking.

Good luck!
Tina

PS...here are sites with content to help you in the interim.

EzineArticles.com (free)
BuyContentOnline.com (free and fee-original and prewritten)
Constant-Content.com (free and fee)
Go Articles (free)
Elance.com and GURU.com (project boards with article writers)
AssociatedContent (free and fee--RSS feeds)

netchicken1

3:11 am on Jul 7, 2006 (gmt 0)

Couldn't the Googlebot easily establish age from the age of the page, or from when that page was first indexed in Google?

I am probably wrong, but wasn't there something along those lines about hiding duplicate content in the recent "Big Daddy" change?

stapel

4:11 pm on Jul 7, 2006 (gmt 0)

The "age" of a page would, presumably, be the server date on the file. So say you upload your original article on 01 January, and the plagiariser scrapes it and posts it to his own site on 01 February. If you then, say, correct a typo on the page on 01 March, your original content could appear to be "younger" than the infringing copy.
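To make the dates concrete, here is a small sketch in Python of that scenario. It assumes "age" is judged purely by the last-modified date. The year (2006) and the names are placeholders, not anything Google is known to use.

```python
from datetime import date


def apparent_age_order(pages):
    """Return page names sorted oldest-first by last-modified date."""
    return [name for name, modified in sorted(pages.items(), key=lambda item: item[1])]


# Original uploaded 01 January, scraped copy posted 01 February.
before_fix = {"original": date(2006, 1, 1), "copy": date(2006, 2, 1)}

# The author corrects a typo on 01 March, which resets the file date.
after_fix = {"original": date(2006, 3, 1), "copy": date(2006, 2, 1)}

print(apparent_age_order(before_fix))
print(apparent_age_order(after_fix))
```

After the typo fix, the copy sorts ahead of the original, even though nothing about who wrote the article has changed.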

The Googlebot, to my understanding, just spiders pages. It doesn't archive sites in perpetuity. One relies on The Internet Archive and its Wayback Machine for that.

If the infringer's site happens to be spidered before the original site, then relying on the Googlebot's "spidered on" date would result in the infringer having claim to the author's original content -- presumably not a result of which the author would be in favor.

Eliz.