

Google SEO News and Discussion Forum

This 37 message thread spans 2 pages.
Stop Over-Analyzing Scraped Duplicate Content
Fern



 
Msg#: 4312293 posted 7:17 am on May 14, 2011 (gmt 0)

I'm starting to get to the point where I think people are talking too much about scrapers stealing their content.

Seriously, are scrapers (dup content) killing you?

Scenario:

1) Yes, they are killing your website because you don't know how to do on-page optimization.
a) If a scraper with a backlink profile weaker than yours can really outrank you, then that's your fault. Blame yourself on that one.

I have a site that has THIN, THIN CONTENT. It ranks PERFECT. I have SPAM SITES and I have Quality Sites that PEOPLE LOVE.... yes, I mingle with darth vader with some properties. None of my sites have been hit by panda.

If you really think duplicate content is your problem, it means your site is an ugly duckling when it comes to optimization.

Wikipedia stays on top for its content: "Ethanol fermentation, the production of ethanol for use in food, alcoholic beverage, fuel and industry"

Think long and hard

 

netmeg

WebmasterWorld Senior Member Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



 
Msg#: 4312293 posted 2:03 pm on May 14, 2011 (gmt 0)

I don't entirely disagree with this.

dickbaker

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 4312293 posted 2:43 pm on May 14, 2011 (gmt 0)

I have a site that has THIN, THIN CONTENT. It ranks PERFECT. I have SPAM SITES and I have Quality Sites that PEOPLE LOVE.... yes, I mingle with darth vader with some properties. None of my sites have been hit by panda.


Another new member untouched (for now) by Panda, dropping by to tell the rest of us what Panda does and does not punish. This despite the fact that the most experienced SEOs are still just theorizing about Panda.

Andem

5+ Year Member



 
Msg#: 4312293 posted 2:51 pm on May 14, 2011 (gmt 0)

>> If a scraper with a backlink profile weaker than yours can really outrank you, then that's your fault.

Scrapers often use underhanded techniques to game PageRank, let alone on-site tactics. My site and others are/were victims of relentless spamming by automated programs like xrumer [webmasterworld.com...]. Scrapers ranking above original content aren't even buying links anymore, they're stealing them.

aristotle

WebmasterWorld Senior Member 5+ Year Member Top Contributors Of The Month



 
Msg#: 4312293 posted 3:02 pm on May 14, 2011 (gmt 0)

Maybe some sites don't have a problem with scrapers because their content is so worthless that nobody wants to scrape it.

ErnestHemingway



 
Msg#: 4312293 posted 3:26 pm on May 14, 2011 (gmt 0)

Fern, Can't agree there my man. Been here since 1996 ins and outs. I never show my frustration unless it is at this level which happened during Panda.

chrisv1963

5+ Year Member



 
Msg#: 4312293 posted 3:48 pm on May 14, 2011 (gmt 0)

None of my sites have been hit by panda.


Facts:

1. It is mostly websites with good content, copied and scraped over and over again, that have been hit by Panda.

2. Junk sites have not been hit by Panda. That's why there's so much junk in the serps now.

tristanperry



 
Msg#: 4312293 posted 3:50 pm on May 14, 2011 (gmt 0)

I agree overall to the sort of gist of your post. Having said that, I still track down content scrapers (even though they've never outranked me) simply because it's my content and I'll protect myself from theft.

Regarding 1)a) though - I wouldn't agree with this. Writing great content and not focusing at all on backlink building is essentially what Google preaches. (Granted, if you have great content you'll pick up natural links.) However, if someone scrapes your content, then automatically gets tonnes of backlinks and possibly outranks you, it's not really your fault; it's the scraper's fault.

So yep, I'd agree overall; even though I disagree that Panda is a triviality and that being outranked by scrapers is a black-or-white issue.

brotherhood of LAN

WebmasterWorld Administrator Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



 
Msg#: 4312293 posted 4:09 pm on May 14, 2011 (gmt 0)

Agreed with the salient point that 'scrapers stole my content' seems to be a very frequent thread.

With some technical know-how, you can hide your content until you are sure the likes of Google and Bing have spidered your 'great and unique content' and then make it available to the web at large (ala cloaking). Backlinks and on-page optimization can re-affirm your new contribution to the net.

Google may be broke if it can't then determine you were the first to produce the content. I doubt this is the case though.

There are a myriad of spiders out there randomly looking to take content/make snippets of content/find e-mail addresses etc, with some care (Incredibill's white listing of bots for example) you can minimise your chances of having your content stolen. If your site is being specifically targeted then a banning is in order.

If your content is valuable enough to visitors and you alike, the problem should really implode on itself, though the above should help minimise any potential 'issues'.
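
For anyone wondering what the bot whitelisting mentioned above might look like in practice, here is a minimal sketch. It is not Incredibill's actual setup; the token lists and the function name classify_user_agent are made up for illustration. It only looks at the User-Agent string, which is trivially spoofed, so treat it as a first filter and verify claimed search engine bots separately (for example by reverse DNS).

```python
# Hypothetical allowlist: lowercase substrings of crawlers we choose to admit.
ALLOWED_BOT_TOKENS = ("googlebot", "bingbot")
# Hypothetical markers that suggest an ordinary browser.
BROWSER_TOKENS = ("mozilla", "opera")


def classify_user_agent(user_agent: str) -> str:
    """Return 'allowed_bot', 'browser', or 'blocked' for a User-Agent string."""
    ua = (user_agent or "").lower()
    # Check bots first: Googlebot's UA also contains "Mozilla".
    if any(token in ua for token in ALLOWED_BOT_TOKENS):
        return "allowed_bot"
    if any(token in ua for token in BROWSER_TOKENS):
        return "browser"
    # Anything else (scripts, unknown spiders) is refused by default.
    return "blocked"


if __name__ == "__main__":
    print(classify_user_agent("libwww-perl/5.8"))
```

The default-deny branch is the whole point of a whitelist: an unknown spider has to be added explicitly, rather than you chasing each new scraper with a ban.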

indyank

WebmasterWorld Senior Member



 
Msg#: 4312293 posted 4:17 pm on May 14, 2011 (gmt 0)

1)Yes, they are killing your website because you don't know how to do on page optimization.


I am all ears to you. Has anything changed w.r.t on page optimization after panda? I guess most people complaining are those who saw their pages ranking well before panda. Has anything changed since Feb 24 or did you make any change to adjust to panda?

I am all ears to people who really want to help by sharing their experiences with some specific details rather than being generic.

indyank

WebmasterWorld Senior Member



 
Msg#: 4312293 posted 4:25 pm on May 14, 2011 (gmt 0)

Of course you don't have to reveal sites, but you can simulate a few examples.

If you feel that you don't have to share such details, then I am sorry for bothering you. But then there is no point in having a thread in a webmasters forum, as people come here for help or to learn by listening to others' experiences.

tedster

WebmasterWorld Senior Member Top Contributor of All Time 10+ Year Member



 
Msg#: 4312293 posted 4:36 pm on May 14, 2011 (gmt 0)

Has anything changed w.r.t on page optimization after panda?

I'd say yes. Factors like scan-ability and read-ability, typography, aesthetics and so on now matter, because for Google, the user's experience of quality is now part of the ranking formula.

I appreciate Fern's opening post very much. He didn't say stop analyzing the scraper issue, he said to stop OVER-analyzing it. The fact that he has been scraped but not Pandalyzed, and that's across a variety of websites, is an important data point.

Yes, Panda apparently has a significant "unique content" factor - but that cannot be the whole of it by a long way. So I agree, do not over-analyze the scraper factor.

indyank

WebmasterWorld Senior Member



 
Msg#: 4312293 posted 5:08 pm on May 14, 2011 (gmt 0)

Thanks Tedster.

I think I know what you mean by scan-ability and read-ability and I don't see any faults with the sites I manage on that front.

Typography and aesthetics are a good topic to discuss, as I guess we haven't focused much on them in this forum. If I am not wrong, we were talking about design but not much about typography.

Are there any web standards for typography and aesthetics? Did the one or two sites that recovered do anything on this front?

tedster

WebmasterWorld Senior Member Top Contributor of All Time 10+ Year Member



 
Msg#: 4312293 posted 5:41 pm on May 14, 2011 (gmt 0)

I don't know of a Panda rebound that made such changes, but I certainly know of sites that greatly improved their user stats and lowered their bounce rate by improving the typography - especially making sure that lines of type were not overly long and that line-height was adequate for easy reading. Right or wrong, that all makes for a heightened perception of quality.

There's some good research on this topic at the Software Usability Research Laboratory of Wichita State University [surl.org].

martinibuster

WebmasterWorld Administrator Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



 
Msg#: 4312293 posted 5:50 pm on May 14, 2011 (gmt 0)

Seriously, are scrapers (dup content) killing you?


Not 100% certain about scrapers who take a portion of content. However I have no doubt that those who take entire pages at a time absolutely harm my rankings, thus causing me to lose money. I know this for a fact because my rankings improve after those thieves take down my content. This is something that I have experienced for many years.

londrum

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 4312293 posted 6:15 pm on May 14, 2011 (gmt 0)

scraping is one thing, but what are we supposed to do about rss feeds?
as far as google is concerned there can't be much of a difference between a scraper taking your stuff and someone reproducing your feed. and there's only so much information that you can take away from it before it becomes useless.

and even if you've only got a two sentence excerpt in it, you could still end up with it duplicated 100 times over the web within an hour of publishing it.

people fire off DMCAs at the first sight of a scraper, but don't give much thought to feeds. are we better off without feeds? it's difficult to know after panda.

Sgt_Kickaxe

WebmasterWorld Senior Member Top Contributor of All Time



 
Msg#: 4312293 posted 6:56 pm on May 14, 2011 (gmt 0)

Scrapers are dubbed lazy however they do spend an extraordinary amount of energy to be able to be lazy.

A simple trick I picked up that puts scrapers to work for me was a change in my rss feeds that embeds a link to my site as well as my site written out in text (www.example.com) from the start of every post. Scrapers dislike my feed because of this but some still use it and Google does indeed count some of the incoming links as backlinks.
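
A rough sketch of the feed trick described above, for anyone who wants to try it. The function name add_attribution, the notice wording, and the parameters are assumptions for illustration; the original post does not say how the change was implemented. The idea is just to prepend a linked and a plain-text mention of the source site to each item body before it goes into the feed.

```python
from html import escape


def add_attribution(item_html: str, site_url: str, site_name: str) -> str:
    """Prepend a source notice (link plus plain-text URL) to a feed item body."""
    # Escape the values so a stray quote or ampersand cannot break the markup.
    link = '<a href="{0}">{1}</a>'.format(
        escape(site_url, quote=True), escape(site_name)
    )
    notice = "<p>Originally published at {0} ({1})</p>".format(
        link, escape(site_url)
    )
    # The notice goes first so it survives if a scraper truncates the item.
    return notice + item_html


if __name__ == "__main__":
    print(add_attribution("<p>Hello</p>", "http://www.example.com/", "Example"))
```

Whether search engines count links picked up this way is a separate question; the sketch only covers the feed side.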

Play_Bach

WebmasterWorld Senior Member Top Contributor of All Time 5+ Year Member



 
Msg#: 4312293 posted 6:57 pm on May 14, 2011 (gmt 0)

> are we better off without feeds?

@londrum
I've gone back and forth on this. As I recall, JenSense advocated RSS years ago here as a good thing and so I happily followed her lead. Now I'm not so sure.

Planet13

WebmasterWorld Senior Member Top Contributor of All Time Top Contributors Of The Month



 
Msg#: 4312293 posted 7:13 pm on May 14, 2011 (gmt 0)

I would like to follow up on the statements made by Fern, the original poster:


...they are killing your website because you don't know how to do on page optimization.


Has anyone successfully regained their ranking from scraper sites by improving their on-page optimization?


If a scraper with a backlink profile weaker than yours can really outrank you, then that's your fault.


Has anyone successfully regained their ranking from scrapers by improving their backlink profile?

Please note, these are NOT rhetorical questions (and I apologize if they might seem like that). I would truly like to know if anyone has successfully regained lost positions from scrapers through either on-page optimization or link building.

londrum

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 4312293 posted 8:22 pm on May 14, 2011 (gmt 0)

A simple trick I picked up that puts scrapers to work for me was a change in my rss feeds that embeds a link to my site as well as my site written out in text (www.example.com) from the start of every post. Scrapers dislike my feed because of this but some still use it and Google does indeed count some of the incoming links as backlinks.


this always worked in the past, but i'm not so sure that it does now, because the link will always be alongside duplicate content (unless the words on your feed are 100% different to your site, which is unlikely).
google doesn't seem to be able to tell (or care) where the info came from in the first place because the scrapers/rss feeders are outranking us.
so i'm thinking that we need to get rid of ALL this duplicate stuff, even the feeds

scooterdude



 
Msg#: 4312293 posted 8:25 pm on May 14, 2011 (gmt 0)

actually,

The OP has NOT stated that his sites have been scraped and survived,

The OP has simply stated that

OPs sites are unaffected by panda

OP has very thin sites

OP has quality sites

OP mentions a star wars character with a particular reputation

OP suggests that optimisation is key

OK;

My 2 cents

###########

[edited by: scooterdude at 9:20 pm (utc) on May 14, 2011]

tedster

WebmasterWorld Senior Member Top Contributor of All Time 10+ Year Member



 
Msg#: 4312293 posted 8:37 pm on May 14, 2011 (gmt 0)

First, I want to get a little bit nit-picky about language, and then explain my nuanced "yes" answer. No site is exactly going to "regain" their former situation, and that's because Panda is integrated into the algorithm. It's not going away.

So the situation is that a site might see traffic levels that approach or pass their former levels, because the Panda-part now scores their quality as good. It's not impossible (theoretically) that a site could get the Panda boost that was introduced in Panda 2.

Good backlinks were one of the factors that a site I mentioned earlier achieved. And I don't mean a couple of runs with xrumer, either. They shifted the layout so that content was obvious right when the page loaded, they eliminated some questionable UGC, and they reached out to other sites in various ways (including social media) and let them know about good new content - and that got them some new, stronger backlinks.

They are a heavily scraped site. Not only that, but their UGC had (and probably still has) some copy/paste duplicate content from other sites. And with Panda 2.1, their total traffic is now right around pre-Panda levels, after dropping by about 40%. Alexa, Compete and similar sources of competitive data do not currently show that traffic rebound, but their own server logs do, and so do their ad impressions.

The traffic increase has a different configuration or shape than their pre-Panda traffic did, but it's real traffic, real people.

Did the new backlinks "cause" the improvement? Cause and effect are hard to pin down when you make diverse changes all at once. But the new backlinks are a real part of the picture in this case of a real recovery of traffic.

Tallon

5+ Year Member



 
Msg#: 4312293 posted 11:44 pm on May 14, 2011 (gmt 0)

In my niche there are so many anomalies that I'm now wondering if only a portion of the web is being rated by Panda, this could explain why we see so many scrapers outranking "real" sites.

What I'm seeing ranking fine:

--a site with 90% (at least) thin content (image and one or two sentences per page). Site has thousands of pages.
--sites with 10+ ads per page
--sites with only the title above the fold (the rest are ads and theme elements)
--scrapers (no original content)

My main, biggest site was hit by Panda 2.0, my other smaller sites (most super thin, even thin affiliate sites) have either risen or remained flat throughout the pandas. This doesn't make sense (my main site has the primo backlinks, completely organic, social activity, yada yada).

What I'm wondering is if it's possible that once a site reaches a certain threshold, panda is applied, otherwise a site stays under the radar? I don't think site size has anything to do with it, it's more about traffic amount. Something along the lines of:

--Once google sends a specific amount of traffic to a site, panda evaluation occurs.

--Or, if a site's overall traffic has more than x% from Google, panda evaluation occurs.

--(added thru edit) If a % of a site's content is ranking for high volume keywords, panda evaluation occurs.

Thoughts?

[edited by: Tallon at 11:58 pm (utc) on May 14, 2011]

Swanson

10+ Year Member



 
Msg#: 4312293 posted 11:49 pm on May 14, 2011 (gmt 0)

Weak backlink profile = problem
Weak on-page content = problem

It isn't now really up for debate - you need both to make sure you are fine in any Google update.

However, the main problem is going to be with big sites and ecommerce sites with quite a lot of dupe/weak and thin content pages. Smaller sites and blogs should be fine.

It seems the "Panda" portion of the algo is calculated sporadically - which means if you are caught in it now then you won't get out for quite a while so any changes you make to your site are wasted.

All of you guys with one site or a panda affected site need to start creating new sites and fast. Forget "good practice" you really need to not be a victim and start treating Google like a game that you are going to win. Create loads of sites - even scrape your own content on new domains, try anything.

Swanson

10+ Year Member



 
Msg#: 4312293 posted 11:53 pm on May 14, 2011 (gmt 0)

Start thinking of Google as an enemy rather than an ally and you will be fine.

The gloves really are off now and anyone that continues to "play by the rules" is going to be posting on these threads month in month out moaning about their drop in traffic.

tedster

WebmasterWorld Senior Member Top Contributor of All Time 10+ Year Member



 
Msg#: 4312293 posted 3:12 pm on May 15, 2011 (gmt 0)

I never think of Google as either enemy or ally. I think of the whole playing field as a "competitive-cooperative" game. Too much focus on either aspect of the relationship eventually leads to a non-productive situation.

---------

There are different kinds of websites that were hit by Panda. Some are actual branded businesses - still small and not really household names for most people, but with enough of an established customer base that a regular flood of new domains may not be a feasible approach.

What makes sense for an affiliate marketer might not make sense for every member here - and we really do have a wide variety of members: in-house SEOs, consultants, hands-on contractors, business owners, affiliate marketers, passionate hobbyists, scholars, etc, etc.

suggy

10+ Year Member



 
Msg#: 4312293 posted 4:22 pm on May 15, 2011 (gmt 0)

"I never think of Google as either enemy or ally"

I 2nd that tedster. Google is benign. It doesn't hate you or love you. It doesn't even know you exist!

"It's the algorithm, stupid!"

diberry

WebmasterWorld Senior Member



 
Msg#: 4312293 posted 5:29 pm on May 15, 2011 (gmt 0)

I agree with Andem. Scrapers CAN hurt you despite your making smart moves and optimization. I do agree it's possible to focus too much on them, and it's wise to consider which ones are really capable of hurting you before spending effort on C&Ds (not that it takes much time to do C&Ds - I can fire off 5 in twenty minutes, and most of the time, they're complied with).

It's also important to consider other options for defeating scrapers. Since I switched from full to excerpt feeds, I've had no scrapers at all - just a handful of manual copies of my articles. Sometimes I C&D them - but sometimes I look at my article and see room for improvement, and just end up rewriting it significantly. (But I post both the original copyright date and the "updated" date on every page, just in case.) This tactic makes my site more valuable, which is what it's really all about, after all.

So yeah, they can hurt you, but drowning in C&D work isn't necessarily the best use of your time and energy.
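
For reference, the switch from full to excerpt feeds mentioned above could be sketched roughly like this. The function name make_excerpt and the 40-word default are assumptions, not anything stated in the post, and the tag stripping is deliberately crude (a real feed generator would use a proper HTML parser).

```python
import re


def make_excerpt(html_body: str, max_words: int = 40) -> str:
    """Return a plain-text excerpt of at most max_words words."""
    # Crude tag removal: replace anything that looks like a tag with a space.
    text = re.sub(r"<[^>]+>", " ", html_body)
    words = text.split()
    if len(words) <= max_words:
        # Short posts go out whole, with no trailing marker.
        return " ".join(words)
    # Longer posts are cut and marked so readers know there is more on the site.
    return " ".join(words[:max_words]) + " ..."


if __name__ == "__main__":
    print(make_excerpt("<p>one two three four</p>", max_words=3))
```

The trade-off is the one londrum raised earlier in the thread: the shorter the excerpt, the less there is to scrape, but also the less useful the feed is to real subscribers.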

scooterdude



 
Msg#: 4312293 posted 6:25 pm on May 15, 2011 (gmt 0)

The search engines used to be benign, now they are experimenting with 100% monetisation of their traffic flows, how much room does that leave for benign ness :)

While they was getting people into the habit of looking up stuff on their fav SE, they was benign

But why rank some blogger unknown to you, whose so called quality articles have not had rigorous peer review if you can direct enquiries to wikipedia or partner sites run by folk you know really well from way back or boardroom meetings with stellar attendees

suggy

10+ Year Member



 
Msg#: 4312293 posted 7:27 pm on May 15, 2011 (gmt 0)

Sorry Scooterdude, I don't buy the conspiracy theories. Don't confuse imperfection for intent. I suspect the overarching success of Wikipedia et al has more to do with flaws in the algo, than conspiracy.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved