Help with content scraping websites.
I often find external websites that scrape our complete website (content, template, nav, everything) and host it somewhere else, just replacing the brand name with another one. They are not real websites, just a photocopy of ours.
I am very worried about duplicate content issues.
How do I stop them, and what kind of action can one take against them?
Thank you so much!
It is a sign of success. I get worried when I don't see a lot of other websites scraping me. Most of the time these scrapers have little to no impact on me because my ranking signals are much stronger than theirs. Make sure that Google can find the content published on your website first, before it gets scraped. If you are really worried, you can start filing DMCA requests.
Copyscape helped me find these types of websites. I still have no clue what their intention is; they even have the demo and "buy now" links in them, without linking to it.
If Copyscape can detect it as a duplicate, Google can too, right? That's what I was worried about. Anyway, our site has strong domain authority and page authority, being a well-known brand in its niche.
I get scraped a lot, and by some pretty high authority sites too. I have so far never found one outranking me, though.
Scrapers gonna scrape. There's not a lot you can do about it. You'll want to keep working on your authority signals - making it eminently clear why you're the authority in your niche, and like Goodroi says, make sure Google finds it before anyone else does.
Don't forget Webmaster Tools: "Crawl", "Fetch as Google", then "Submit" lets you submit your page(s) to the index the moment they're published. That helps Google establish you as the owner.
Also using Google's authorship may help:
It's one more thing a scraper will probably want to remove.
Using absolute links can be helpful to a limited extent.
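For anyone who wants to try the absolute-links idea, here is a rough sketch of rewriting relative links before a page is published. It is only an illustration: `BASE_URL` and the `absolutize` helper are made-up names, and the regex only handles double-quoted `href`/`src` attributes, so treat it as a starting point rather than a complete HTML rewriter.

```python
import re
from urllib.parse import urljoin

# Hypothetical site root (assumption); replace with your own canonical URL.
BASE_URL = "https://www.example.com/"

# Simplification: only matches double-quoted href="..." and src="..." values.
_ATTR = re.compile(r'\b(href|src)="([^"]*)"')


def absolutize(html: str, base_url: str = BASE_URL) -> str:
    """Rewrite relative href/src values so they point at base_url."""

    def fix(match: re.Match) -> str:
        attr, value = match.group(1), match.group(2)
        # urljoin leaves URLs that are already absolute unchanged.
        return f'{attr}="{urljoin(base_url, value)}"'

    return _ATTR.sub(fix, html)


if __name__ == "__main__":
    print(absolutize('<a href="/pricing">Pricing</a>'))
```

A scraper that copies the markup verbatim then ends up linking back to the original domain, which is the limited benefit mentioned above.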
Also put a unique string on every page.
For example, XCVBARTAV. Choose it carefully so that it has no hidden meanings and contains no words, including foreign words, that might have negative connotations. Then just search for the string to find the thousands of sites copying your content.
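If you would rather generate the string than invent one, something like the sketch below could work. The consonants-only alphabet is my own assumption (leaving out vowels makes it less likely the result spells a real word, though it is still worth checking by eye), and `make_marker` and `search_url` are hypothetical helper names.

```python
import secrets
from urllib.parse import quote_plus

# Consonants only (assumption): with no vowels the marker is unlikely
# to spell an actual word in any language.
CONSONANTS = "BCDFGHJKLMNPQRSTVWXZ"


def make_marker(length: int = 10) -> str:
    """Return a random uppercase marker string of the given length."""
    return "".join(secrets.choice(CONSONANTS) for _ in range(length))


def search_url(marker: str) -> str:
    """Build a Google search URL for an exact-match (quoted) query."""
    return "https://www.google.com/search?q=" + quote_plus(f'"{marker}"')


if __name__ == "__main__":
    marker = make_marker()
    print(marker)
    print(search_url(marker))
```

Generate the marker once, embed it in your templates, and keep it the same on every page so that a single search finds the copies.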
I have a unique string in every title. I just love how sites copy titles, file paths, and descriptions. Many analysis sites are getting excellent rankings because they have copied my titles! At least this way I can find them.
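Once you have the HTML of a suspect page, checking its title for your marker can be automated. This is a minimal sketch using only the standard library; `TitleParser` and `title_has_marker` are hypothetical names, and fetching the page is left out on purpose.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text inside the page's <title> element."""

    def __init__(self) -> None:
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        # Only keep text that appears between <title> and </title>.
        if self.in_title:
            self.title += data


def title_has_marker(html: str, marker: str) -> bool:
    """Return True if the marker appears in the page title (case-insensitive)."""
    parser = TitleParser()
    parser.feed(html)
    return marker.lower() in parser.title.lower()


if __name__ == "__main__":
    page = "<html><head><title>Blue Widgets XCVBARTAV</title></head></html>"
    print(title_has_marker(page, "XCVBARTAV"))
```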
I don't tend to get scraped site-wide because my sites are personalised to the user's location. Not all users are willing to enter their location, but most do. Without the server-side code that delivers content based on the user's location, the scrapers have only half the site's functionality at most, and that renders their activities a bit useless.
Any code of that type will do the job. It doesn't need to be location based; user-preference based will do the same.
Thanks for the suggestions all!
There are several ways to detect a scraper, such as GWT, CopyScape, trackbacks, cookie enforcement, and anti-scraping services. You may read the full guide here: [scrapesentry.com...]
Thanks for answering @peterdavidson99
I was only looking for ways to stop such things from happening or, if they do happen, to make sure that we are not penalized by Google for the scrapers' work.