| This 98 message thread spans 4 pages: < < 98 ( 1 2 3  ) || |
|Panda Loss Because of Scraped Content?|
My site was hit by Panda in April 2011. The site was created in 1999 - and all content is original, written by me.
The site ranked very well until April 2011. After 18 months of improving the site, I have seen no recovery. Zilch. As a matter of fact - after all the improvements - the site got hit again recently by Panda 20 on September 28th.
So I continued digging around and trying to figure out the issue since Panda is all about duplicate content and low-quality content.
This is what I found - and what I'm wondering could be the issue:
1. I started checking content in Google for every page on my site. The whole copy-and-paste-with-quotation-marks routine - taking a unique sentence from each page and then searching for it in quotes.
I've done around 45 pages and the results are mind blowing. My content has been copied SO many times - it's incredible. Especially my really old content - like anything written 1999-2006. But not exclusively.
Some posts/articles have been copied 20+ times.
My site does NOT rank at all or ranks at the bottom for my OWN content when I do these searches.
I submit DMCAs on everything I find and I am having some success. But I still have 400+ pages left to check.
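For anyone with hundreds of pages left to check, the quoted-sentence search described above could be prepared in bulk along these lines. This is only a rough sketch: the function names are made up for illustration, "longest sentence of a reasonable length" is an assumed stand-in for "a unique sentence", and it only builds the search URLs to open by hand rather than querying Google automatically.

```python
import re
from urllib.parse import quote_plus


def pick_probe_sentence(page_text, min_words=8, max_words=20):
    """Pick one sentence to search for (assumption: the longest sentence
    within the word limits is distinctive enough to act as a fingerprint)."""
    sentences = re.split(r"(?<=[.!?])\s+", page_text.strip())
    candidates = [
        s.strip() for s in sentences
        if min_words <= len(s.split()) <= max_words
    ]
    if not candidates:
        return None
    return max(candidates, key=len)


def exact_phrase_query(sentence):
    """Wrap the sentence in double quotes so it is searched as an exact phrase."""
    cleaned = sentence.replace('"', "").rstrip(".!?")
    return f'"{cleaned}"'


def search_url(sentence, base="https://www.google.com/search?q="):
    """Build a search URL to open manually in a browser."""
    return base + quote_plus(exact_phrase_query(sentence))
```

Feeding each page's text through pick_probe_sentence and then search_url gives a list of links to click through, which at least takes the copy-and-paste step out of the job.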
My question: IS THIS SOMETHING PANDA WOULD HIT MY SITE FOR?
I honestly don't know - because ultimately it's MY content - I didn't copy it. And I don't know if this falls under the Panda penalty's actions.
My second question: IF IT IS CAUSING A PANDA ISSUE - CAN MY SITE COME BACK FROM IT?
Thank you in advance - I appreciate your time.
MrBreakEven: What angers me is that the scrapers are not just ranking higher for entire passages of my text, they are ranked higher for three-word keywords too!
Imagine that. They rank on the 1st page for major keywords, but you, who wrote the original content, are nowhere to be found.
Just how do these scrapers acquire more authority than the originator?
I don't have a massive collection of backlinks, but how does a site with no backlinks using an old copy of my site manage to outrank me? Luckily I've managed to get the hosting company to suspend the offender's account, but Google still has 1300 cached pages, so the problem remains.
This is a huge waste of resources.
Google appears to be no different to a country that doesn't recognise copyright law. The only thing in our favour is the use of the DMCA, but by this stage it's too late, and the damage has been done.
I have a question about scraped content as well. Recently I have noticed that new articles on my site get outranked by scrapers. I still have a Google News entry pointing to my site at the top, but the actual article on my site is usually found on the second page, outranked by eight or nine sites, some of which copy only a paragraph and others the whole article. All link back to my content.
Is that because of the Google News entry? I checked past articles and they are ranking fine when I search for titles.
I'm wondering a couple things. Do you think it's just a matter of the person using your content having a better PR overall? Sounds simple but if they are viewed as a more important site, I'm considering whether it's being judged on that basis.
Are these situations happening on both search engines or on just one? I think if one search engine is struggling less than the other, then it needs some light shed on it. I don't think in the past we can say that the other search engine was doing a better job at anything. This may have changed in regards to scraped content. I think in that sense, if true, then make some noise about it via the scraper submission document. Communicate with Goog directly.
My situation is almost identical to Frost_Angel's, except I was hit with Panda 1 and have gone down ever since. Panda 20 completely took out all the rankings I had left. I then noticed in webmaster tools that the number of indexed pages is continuing to go down and exact searches for content on pages no longer show in the serps, where before Panda 20 they were at least there.
This is when I started to notice the scraper sites and feed sites ranking for my content. My internal pagerank must be extremely low because, for the pages that are indexed, a search for a few sentences usually shows my page at #5 or lower in the G serps. Everything in Binghoo is as expected.
Another issue is that I teamed up with a bigger site that has a blog full of snippets from other bloggers; basically they are an aggregator and always provide dofollow links back to the original article. Because they do this they always rank in place of my original work and my pages are not listed at all.
Now this part is interesting... this site, when shown in the serps, displays my authoring profile even though there is no rel=author on the page; there is a rel=publisher though. So somehow they are associating this page with my profile and showing it in the serps over the original article, for which I am the rel=author. So I'm thinking that if you have the same content with the same author out there, then google gives the credit to the 'higher internal pagerank' page, and dumps the rest.
This is very aggravating and I'm thinking that the url has been placed in the dustbin and there is no hope for it. Almost two years of content 'improvements' only to see it almost completely removed from the serps.
I think I'm at the point of 301'ing the entire site off to the .net of the same brand. From what I've read this practice can be seen as 'blackhat'. But how is it fair that your content can't rank, but scrapers of the same content can?
Anyone with experience on 301'ing an entire domain?
|Anyone with experience on 301'ing an entire domain? |
Yes. Less than 2 weeks ago. It went from being a -50 EMD to a first page EMD. Old domain name was EMD oldregionkeyword-servicenamekeyword.com -- migrated to newregionkeyword-servicenamekeyword.com
The move was due to the owner actually moving geographical locations to an adjacent region; we weren't trying to run away from the beast. Nonetheless, the beast doesn't know that. Admittedly, the site was ever so slightly over-optimized at the old domain name, but it had 1st place in Google Places on page one so I didn't bother messing with it.
Seeing as the move was going to cause a disruption anyway, I took advantage of it to update the 2-year-old technology to HTML5 and make use of schema structured markup. I also deoptimized what I knew was wrong. Within just under 2 weeks she has taken the top spots across almost all search engines -- #1, #2, or #3. Google has it at the bottom of page 1, spot #8, so far, but they tend to take longer getting things done these days. But considering it was at -50 previously that's not bad -- pulled out of the filter.
On the old domain root I left a 301 in place to send every page request on the old domain to the new one. No page names changed, only the root EMD.
RewriteEngine On
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
That's all that's left at the old address.
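Before flipping the switch on a move like this, it may help to write down where each old URL is supposed to land so the live redirects can be spot-checked afterwards. Here is a minimal sketch that models the catch-all rule above (same path and query string, new host). The old hostname in the usage note and the function names are purely hypothetical, and this only computes the expected targets; it does not talk to any server.

```python
from urllib.parse import urlsplit, urlunsplit

# Placeholder host taken from the rule above; swap in the real new domain.
NEW_ORIGIN = "http://www.example.com"


def expected_redirect(old_url, new_origin=NEW_ORIGIN):
    """Return the URL the catch-all 301 should point to for old_url:
    the path and query string are kept, only scheme and host change."""
    old = urlsplit(old_url)
    new = urlsplit(new_origin)
    path = old.path or "/"  # a bare domain request maps to the new root
    return urlunsplit((new.scheme, new.netloc, path, old.query, ""))


def redirect_plan(old_urls, new_origin=NEW_ORIGIN):
    """Map every old URL to its expected new location."""
    return {url: expected_redirect(url, new_origin) for url in old_urls}
```

For example, expected_redirect("http://old.example.net/services/page.html") gives the same page path on www.example.com, which can then be compared against the Location header the old server actually returns.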
And, whether or not it helps to distinguish the site from scraped content, I added the <link rel="canonical" href="http://www.example.com/+whatever-subfolders/+whatever-pagenames.html" /> canonical tag, which I had not previously used, to each page.
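If the pages come out of a template or build script, the canonical tag can be generated per page rather than typed by hand. A small sketch, assuming the site root shown above and a hypothetical helper name; adjust the origin to whatever the real domain is.

```python
from html import escape
from urllib.parse import urljoin


def canonical_tag(path, origin="http://www.example.com/"):
    """Build the canonical link tag for a page path relative to the site root.
    The origin default is the placeholder domain from the post above."""
    href = urljoin(origin, path.lstrip("/"))
    # Escape the URL so characters like & are valid inside the attribute.
    return f'<link rel="canonical" href="{escape(href, quote=True)}" />'
```

Calling canonical_tag("/services/page.html") yields the tag for that page, ready to drop into the head section.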
I also had a few backlinks to that site under my control, so I deoptimized them a bit and pointed them to the new domain. So far all looks good.
|I think I'm at the point of 301'ing the entire site off to the .net of the same brand. From what I've read this practice can be seen as 'blackhat'. |
Oh my gosh people are getting so paranoid. Just do whatever needs to be done as if the beast didn't exist.
This has been my very recent experience but of course your results may vary.
MrSavage: Actually my site is a 7-year-old site that has gained pagerank slowly over the years.
Over the years, we acquired many links from related bigger sites. I doubt that the scrapers (from China) have higher PRs than us.
It just seems to me that Google gives higher emphasis to newer sites.
|Do you think it's just a matter of the person using your content having a better PR overall? Sounds simple but if they are viewed as a more important site, I'm considering whether it's being judged on that basis. |
There's a thread around here that references a test that shows exactly that.
The test was done publicly and was written up reasonably well, including establishing some determining factors and limitations.
Here it is