Forum Moderators: Robert Charlton & goodroi

Panda related press releases

Johan007

6:16 pm on Apr 5, 2015 (gmt 0)

10+ Year Member Top Contributors Of The Month



I have spent the last year or so going through all my editors' articles, looking for posts that were given to them as press releases by distributors. The method I used to find these was to query a sentence from every article (in quote marks) and look for duplicates across the web.

I have now completed this; however, I am concerned that the odd press release may not show up as duplicated today even though duplicates existed historically. If no duplicates are found now using my method, Google may already have flagged them as Panda duplicates back in 2011; since then, all the other sites may have removed the content, leaving me with the only copy on the web. With perhaps a maximum of 1% of my articles being unique press releases, should I be worried?
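The quoted-sentence method described above could be sketched roughly like this (a minimal illustration, not the poster's actual tooling; the sentence-splitting heuristic and the eight-word minimum are assumptions):

```python
import re

def quoted_query(article_text, min_words=8):
    """Pick the first reasonably long sentence from an article and wrap
    it in quote marks, ready to paste into a search engine to check for
    exact-match duplicates elsewhere on the web."""
    # Rough sentence split on terminal punctuation followed by whitespace.
    sentences = re.split(r'(?<=[.!?])\s+', article_text.strip())
    for sentence in sentences:
        if len(sentence.split()) >= min_words:
            return '"%s"' % sentence
    return None  # article too short to yield a usable query

print(quoted_query(
    "Acme Corp today announced record results. "
    "The Roadrunner Hammer 3000 outsold every rival product "
    "in the third quarter of the fiscal year."
))
```

This prints the second sentence in quote marks, since the first falls below the word minimum; short, generic sentences make poor uniqueness probes.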

goodroi

11:15 am on Apr 6, 2015 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



I am confused. You mention that you have spent a year cleaning up the site and taking care of duplicates. You then say 1% of your articles are unique press releases. What are the other 99% since you have cleaned up the duplicates?

martinibuster

12:46 pm on Apr 6, 2015 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Has there been an improvement in ranking or traffic?

As far as the remaining press release content, the less you have the better. Maybe a better way to handle that content is to quote a small portion of a press release (Acme Industries just announced the release of their improved Roadrunner Hammer 3000...) and then use that as a foundation of original content (but only if it's useful or of interest to site visitors or corresponds directly with products the site sells).

tangor

3:27 pm on Apr 6, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Are these press releases from your distributors that you then sent out? Or are they press releases you sent out? Is your concern that there should be more hits for these releases at other sites and they are not there, per a Panda action?

A press release, by definition, is going to be duplicate content anywhere it appears ON THE WEB. Most sites/writers will use a press release to write their own story instead of just passing the release along unmodified.

If you are making the press releases and sending them out, and not seeing them proliferate (perhaps due to duplicate content Panda), then it is possible the other sites did make use of the press release and wrote their own story (or not)...

Is your concern that information is not out there, or that it has been removed?

Johan007

6:42 pm on Apr 6, 2015 (gmt 0)

10+ Year Member Top Contributors Of The Month



Apologies I did not explain this very well.

The press releases had been sent to my editors and mixed in with original content, so they were not labelled as press releases. Other websites would publish the same press releases, and those would show up as duplicates in Google's filtered results.

I have removed the duplicates from my site, identifying them by querying a sentence in quote marks on Google. I removed them even when they credited the source. The last one was removed this year.

However, some press releases still remain on my site: I found 2 articles (out of 3,000) that my editors did not write but which appear as unique articles in Google today. Since there are no other copies in Google's index, it is extremely difficult to find other press releases like this on my website.

These other press releases are my concern. Will Google credit me as the source because there are no other copies, or were there other copies in the past that no longer show up in Google because savvier webmasters removed them before I had a chance to identify them as dupes, so that I will be penalised for them under Panda? I know I have been hit by Panda, as the dates coincide with traffic drops.

@goodroi, Hi
What are the other 99% since you have cleaned up the duplicates?
1% unique to the web press releases, 99% unique content, 0% duplicate press releases (was 8%)

@martinibuster, Hi
Has there been an improvement in ranking or traffic?
No but there has been no Panda update this year and I removed the majority after Christmas.

@tangor, Hi
A press release, by definition, is going to be duplicate content anywhere it appears ON THE WEB
Not if everyone else removes their copy of the press release first and I am left with the only one. What I am dreading having to do is contact every editor and "force" them to go through their articles and find the press releases, as I am not that kind of person/dictator. Many articles are 7 years old and the editors have moved on.

tangor

7:00 pm on Apr 6, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I admit to still being confused. If you believe there is a problem with this content, then remove it. That simple. Otherwise, it sounds like you're afraid to remove it, and that doesn't make sense.

Johan007

7:14 pm on Apr 6, 2015 (gmt 0)

10+ Year Member Top Contributors Of The Month



tangor, I can't find the remaining press releases because I cannot identify them; they are not tagged as such. That is the issue. As they are unique today, I am hoping Google will not see them as press releases.

goodroi

8:39 pm on Apr 6, 2015 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Have you thought about using Copyscape? It would cost a little money, but you could run all your URLs through it to make sure they are unique.

Once you make sure they are unique, I would then look at file size. Very small files tend to have very little text and are often not very useful or valuable to users. Google tends to dislike these pages.
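The file-size check suggested here could be automated with a short script along these lines (a sketch only; the directory layout and the 500-byte threshold are illustrative assumptions, not recommendations):

```python
import os

def thin_pages(directory, min_bytes=500):
    """Walk a local copy of the site and flag files small enough that
    they probably carry very little text."""
    flagged = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            if os.path.getsize(path) < min_bytes:
                flagged.append(path)
    return sorted(flagged)
```

Anything this flags deserves a manual look; raw file size overstates the amount of real content on template-heavy pages, so treat it as a first filter, not a verdict.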

After file size I would look at usage metrics. I wouldn't worry what metrics will make Google happy. Worry about which metrics are a sign of unhappy users. Unhappy users don't return and don't refer friends to your site.

Johan007

7:58 am on Apr 7, 2015 (gmt 0)

10+ Year Member Top Contributors Of The Month



@goodroi, but if you're the last website to remove the content, then there are no duplicates left in Google to query against, unless I could query a historic Google index from, say, 2011/12.

So I am hoping any remaining "press articles" on my site have always been unique, rather than having been flagged as toxic by Panda. Perhaps being unique today means they are no longer a Panda concern.

martinibuster

1:12 pm on Apr 7, 2015 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Panda is a site quality algorithm. It's not the only site quality algorithm, and duplicate content is just one of many issues that can cause a site to be identified as low quality.

Is it possible there might be something else? For example, a pattern of content that is optimized for keywords might cause a site to be recognized as low quality.

Sand

2:07 pm on Apr 7, 2015 (gmt 0)

10+ Year Member



Is it possible there might be something else? For example, a pattern of content that is optimized for keywords might cause a site to be recognized as low quality.


Most definitely. I had a site hit by Panda that had 100% original content. However, the mistake I did make was creating most of the content for search engines. The content itself wasn't spammy by any means, but I realize now that my approach was.

Once I took a step back and focused on the aggregate rather than the individual pages, it was pretty clear what I had to do.

tangor

6:43 pm on Apr 7, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If you know the phrases you want to remove, why not search your own site instead of relying on a Google search? It now sounds like you wish to remove embarrassing content which might have spawned a Panda hit.

Look inward, not outward, for those answers.
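Searching the site's own files, as suggested above, only takes a few lines; this sketch assumes a local directory of article files (the names, paths, and phrase are placeholders):

```python
import os

def pages_containing(directory, phrase):
    """Return the paths of files whose text contains the phrase,
    matched case-insensitively."""
    needle = phrase.lower()
    hits = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            with open(path, encoding="utf-8", errors="ignore") as f:
                if needle in f.read().lower():
                    hits.append(path)
    return sorted(hits)
```

Feeding this distinctive phrases from known press releases finds every local copy without any reliance on what Google still has indexed.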

Johan007

8:51 am on Apr 8, 2015 (gmt 0)

10+ Year Member Top Contributors Of The Month



Thank you everyone. The replies did not get to the crux of the issue (the last person left to remove duplicate content has nothing to query against to identify it, because all the other copies have been removed by now); however, you have all provided a lot of food for thought and I have opened up the investigation again!

It's definitely a Panda issue (I used the barracuda-digital.co.uk Panguin tool to triple-check), and last year I moved the site from classic ASP to a WordPress one, being careful to use sites such as The Guardian for direction.

I am proud of the content on the website, which gets citations from Wikipedia, so I am not embarrassed (I get more traffic from Wikipedia than Google!). The odd rogue article, estimated at >1%, will have to be sought out manually and assessed. That could take another year, to be honest, due to having more than a couple of thousand pages. I am fatigued but have no choice but to keep moving forward.

Thank you all.

martinibuster

12:03 pm on Apr 8, 2015 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



moved the site from a classic ASP to a word press one


Same URLs or redirected old to new? Redirects lose some link equity.

Did you check 404 errors for clues?

>1% - Does that mean less than 1% duplicate content? You may be solving a problem that no longer needs a solution. You may want to explore the full extent of on-page and site-wide issues that may trigger a Panda issue because it's possible there are other issues affecting it.

Johan007

12:13 pm on Apr 8, 2015 (gmt 0)

10+ Year Member Top Contributors Of The Month



Same URLs or redirected old to new? Redirects lose some link equity.


I have 301 redirected to the new pages using .htaccess. Is there actual proof that 301 redirects lose some link equity? Anyway, it's an issue that has a solution.
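For reference, a classic-ASP-to-WordPress 301 in .htaccess typically looks something like this (the URL patterns here are invented for illustration, not taken from the poster's site):

```apache
RewriteEngine On
# Send an old .asp article URL to its new WordPress permalink
RewriteRule ^articles/(\d+)\.asp$ /articles/$1/ [R=301,L]
```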

Did you check 404 errors for clues?
Yes, I found the errors.

There is one page on the site dedicated to advertisers. The page is reached from a footer link that is "follow", but the external links on that page to the advertisers do have "nofollow". Could this be an issue? As far as I understand, this is not a Panda factor.