| 2:03 am on Aug 26, 2011 (gmt 0)|
I have to agree with huskypup: if you need to start over, do it offline in some real-world venture far away from anything run by a search engine.
In case you haven't noticed, the internet has made search engines extremely rich and Wall St. demands they continue. When Panda rolled out, Google's own traffic increased considerably according to many metrics sites like compete.com.
THAT is sure to continue as the long arm of Google reaches into more places than it does now, ultimately considering itself to BE the internet and making most other sites obsolete. Guarantee they'll try, so do not depend on them anymore.
| 2:43 am on Aug 26, 2011 (gmt 0)|
@wheel - to clarify the second time my site was "hack proof" - the second time my content was stolen, so "hacked" isn't the proper term for the second time.
The shocking part for me is how blatantly obvious the content theft is and how uncooperative Google has been. I've sent DMCAs and other complaints. They took the content word for word (actually my ENTIRE site) and put it on about 20 Blogspot accounts. You can imagine what that does to your rankings :P
I like the idea of the C&D letters instead of DMCA's. Time to go nuclear on these scumbags.
| 2:47 am on Aug 26, 2011 (gmt 0)|
@driller41 sometimes I have to laugh at the responses and reread what I originally posted to make sure I haven't gone insane. To clarify, I have had success since 1995. I even made it in the Wall Street Journal on one occasion. But anyone that has been on the Internet long enough knows how volatile it is... and I'd guess most people here have had a venture or two go up in flames as quickly as it skyrocketed to success.
I think? :)
| 5:38 am on Aug 26, 2011 (gmt 0)|
Did Google not act on the DMCA because they got a counter-notification? Then it's time to sue. The problem will be enforcing a judgement, but you may be able to do something like seizing their AdSense earnings from the plagiarised content, or control of the blogs with the content. Have you asked a lawyer?
| 10:49 am on Aug 26, 2011 (gmt 0)|
Hack Proof? No such thing exists
It's also a no win situation - sue or C&D someone from China? BWAHAHAHA
Lots of things you can do to limit your exposure to copying but it's a lot of work
Even if you block China they can still get your site via a proxy, SE cache, Archive.org, and most likely your own feed being distributed
DMCA to Google is your best bet, I'd make some noise to find out why that isn't working
| 1:56 pm on Aug 26, 2011 (gmt 0)|
|I run some content sites and we spend 20+ hours a week on DMCAs. |
If you have good content, getting copied is an everyday event. We’re not being hacked. Just copied. Many of the sites that copy us don't understand the concept of intellectual property. They're small blog sites who just want to share great content. We work with these folks so that they can keep a bit and provide us quality links.
About 4 times a year we find 1,000 pages wholesale copied onto another domain. Funny thing is that they don’t even remove our canonical tags that point back to the original content. LOL. We do use a service to identify the plagiarism.
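For anyone wanting to check a suspect page for a leftover canonical tag, here is a rough sketch using only the Python standard library. The function name `canonical_points_to` and the example domain are made up for illustration, and it assumes you have already fetched the suspect page's HTML yourself; it is not the plagiarism service mentioned above.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel can hold several space-separated tokens, so split before checking
        rel = (attr.get("rel") or "").lower().split()
        if "canonical" in rel and attr.get("href"):
            self.canonical = attr["href"]


def canonical_points_to(html, my_domain):
    """Return True if the page's canonical URL is on my_domain or a subdomain."""
    finder = CanonicalFinder()
    finder.feed(html)
    if finder.canonical is None:
        return False
    host = (urlparse(finder.canonical).hostname or "").lower()
    my_domain = my_domain.lower()
    return host == my_domain or host.endswith("." + my_domain)
```

A page on someone else's domain that returns True here is a strong hint of a wholesale copy, and the tag itself is handy evidence to attach to a DMCA notice.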
I'm genuinely happy with the DMCA process. Most all hosts comply with the requests. Google, Yahoo and MSN do a pretty good job of responding – though it takes time. We tend to serve the domain owner, host, and search engines DMCA notices at the same time. The next round is usually a C&D letter (boilerplate).
It would be difficult to outsource the process because our paper files (yes, an old fashioned file folder) contain the original research, drafts with handwritten notes, original publication date and many of the edits on evergreen content pages. Sometimes we need PDFs of the original content to support the DMCA notice. We run into situations where writers for other sites copy us and the site owner challenges our claim. A PDF of the original draft... always does the trick. We've gotten many a freelance editor fired.
So it would be difficult to outsource DMCAs. This is just a cost of running a content site.
| 2:48 pm on Aug 26, 2011 (gmt 0)|
What I don't get is I thought Google was super advanced and could detect the original source of content? In my case the sickening part is the Chinese spam Blogspot accounts outranked me with my own content!
I'm thinking... the only way to win is to do the same thing that the Chinese are doing! :)
[edited by: phranque at 1:18 pm (utc) on Aug 27, 2011]
| 3:27 pm on Aug 26, 2011 (gmt 0)|
Uhh . . . the what? :/
Lost everything . . . not just on the Internet, and the epiphany is you never owned what you lost, it owned you, and once it's gone you are free. It's a message to figure out what you're doing wrong on the most basic levels (see previous question) and an opportunity to get it right.
| 7:04 pm on Aug 26, 2011 (gmt 0)|
@rocknbil - very deep ;)
| 10:27 pm on Aug 26, 2011 (gmt 0)|
Check this out... EVIL Google put a form up where you can report scraper sites...
| 10:42 pm on Aug 26, 2011 (gmt 0)|
Interesting, thanks for that.
However, I gotta tell ya, I still like DMCA's. I've had a couple of these folks have their hosting accounts pulled over a DMCA. That beats anything Google can do.
I won't submit, but it might be a good thing to give these folks data, let them build an algo that prevents the scrapers from benefiting.
| 10:50 pm on Aug 26, 2011 (gmt 0)|
@wheel - that's gotta be satisfying :)
I'm submitting the scraper sites I know about... there are at least 30 of them. Probably more by now
| 10:50 pm on Aug 26, 2011 (gmt 0)|
|URL of specific scraper page: (Required)* |
Anyone ( apart from Google ) have a list to hand of about half the URLs of the pages in Blogger..
They could just run their "algo" over their own properties ..and those running adsense ..to begin with ..and wipe out 90% of the scraper problem, without any form filling needed.
| 11:05 pm on Aug 26, 2011 (gmt 0)|
|EVIL Google put a form up where you can report scraper sites... |
just to be clear, this form is not really for "reporting" scraper sites.
it is specifically to supply test data.
|Google is testing algorithmic changes for scraper sites (especially blog scrapers). We are asking for examples, and may use data you submit to test and improve our algorithms. |
This form does not perform a spam report or notice of copyright infringement.
| 11:13 pm on Aug 26, 2011 (gmt 0)|
Also to be clear it is from a legit Google source - that slacker Matt Cutts. Amazing how long it has taken Google to figure out that scrapers are a problem. WTF
| 2:24 am on Aug 27, 2011 (gmt 0)|
I'm sure this has been said already but -
Good content will always be copied or stolen. Always. If your business model (if you have one) relies on this not happening then you're not going to be a happy bunny.
You need to be invested in what you do but detached enough to see when you're fighting a losing battle or possibly making excuses for failure.
| 3:19 am on Aug 27, 2011 (gmt 0)|
@woofwoof - what business model would you suggest?
| 8:49 am on Aug 27, 2011 (gmt 0)|
The commonest form of copying from my site is on sites like scribd and slideshare. People also copy and paste into forum posts.
If anyone deserves to be prosecuted for facilitating copyright infringement, it's Scribd - it's full of plagiarised stuff.
It's interesting that some people keep getting hit by scrapers, and others are never affected. Certain niches, types of sites?
| 6:32 am on Aug 30, 2011 (gmt 0)|
@graeme_p - I've noticed the same thing... some of these guys have been blogging for years and it's as if they are "blessed" and the rest of us are cursed. It would be interesting to know what kind of content theft and hacking issues high profile blogs like Boing Boing, Problogger, ICanHasCheezburger, etc. have.
I have one blog that is on a wide variety of topics and doesn't seem to attract criminals. The ones that do attract them seem to be on narrow niche subjects.
| 12:07 pm on Aug 30, 2011 (gmt 0)|
I have a niche site that has been scraped very often (including a complete copy in Russia), but I have never seen a scraper outrank me.
High profile sites are probably too strong to be easily outranked by a scraper, but I think some more modest sites (like mine) escape because Google somehow manages to identify the source - we may be doing something right (but I do not know what), or we may be just lucky, or (most likely) the competent scrapers target certain niches.
It makes sense that a competent scraper (one good enough to outrank a reasonable original site) will target certain niches, where they can make money fast enough to make a reasonable profit before they are taken down.
| 3:03 am on Sep 1, 2011 (gmt 0)|
To correct a comment made earlier - blocking China will not block Australia unless one takes shortcuts and blocks at the Class A (/8) level. On one site I use ZBBLOCK to block various countries, not for any serious concerns, but to see if it makes any difference.
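To show the shortcut problem in rough terms, here is a small sketch with Python's `ipaddress` module. The ranges and addresses are placeholders picked for illustration (203.0.113.0/24 is a reserved documentation range), not real country allocations, and this is not how ZBBLOCK works internally.

```python
import ipaddress

# Placeholder ranges for illustration only, not real country allocations.
coarse_block = ipaddress.ip_network("203.0.0.0/8")     # the "Class A" shortcut
narrow_block = ipaddress.ip_network("203.0.113.0/24")  # one specific range


def is_blocked(ip, blocklist):
    """Return True if ip falls inside any network in blocklist."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in blocklist)


target = "203.0.113.10"   # inside the range we actually want to block
bystander = "203.55.1.1"  # arbitrary unrelated address sharing the same first octet

print(is_blocked(target, [narrow_block]))     # narrow rule catches the target
print(is_blocked(bystander, [narrow_block]))  # and leaves the bystander alone
print(is_blocked(bystander, [coarse_block]))  # the /8 shortcut blocks the bystander too
```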
For one content site we use a backup cron task to mail the backup to a Gmail account. Saved bacon once.
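For anyone wanting to try the same thing, a minimal sketch of a script that cron could run is below. It assumes the backup archive already exists on disk and that the Gmail account has an app password set up; the function names and the file name are hypothetical, and this is not the poster's actual script.

```python
import smtplib
from email.message import EmailMessage
from pathlib import Path


def build_backup_message(backup_path, sender, recipient):
    """Build an email with the backup archive attached."""
    path = Path(backup_path)
    msg = EmailMessage()
    msg["Subject"] = f"Site backup: {path.name}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("Automated site backup attached.")
    # Attach as a generic binary file so any archive format works
    msg.add_attachment(
        path.read_bytes(),
        maintype="application",
        subtype="octet-stream",
        filename=path.name,
    )
    return msg


def send_backup(msg, username, password):
    """Send the message through Gmail's SMTP server over SSL."""
    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as smtp:
        smtp.login(username, password)
        smtp.send_message(msg)
```

Keep in mind that mail providers cap attachment size, so this only suits smaller sites or compressed database dumps.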
| 5:39 am on Sep 2, 2011 (gmt 0)|
Sounds like a lot of good learning from this thread for you in a short amount of time.
If you've been to Wall Street Journal once, the second time will be even sweeter.
Never count on Google to do anything for you. In the end they couldn't care less about your website (or mine for that matter).
You need to be proactive about your online presence. Snoop often. File complaints when you can.
Always have a direction. Staying in one place often leads to being picked off on the internet.