Google Penalizing Original Content - copy sites rank higher - Google Search and SEO forum at WebmasterWorld

Forum Moderators: Robert Charlton & goodroi



macman23

5:41 am on Feb 27, 2007 (gmt 0)

One of my websites contains a glossary of terms and definitions that I have written from scratch. When I copy a sentence from one of my definitions and put quotes around it in Google, there are 13 results. However, it only shows the top 11. My site doesn't even show up in the list unless I click the "Show omitted results" option. Then it ranks at the bottom.

No wonder my traffic from Google has tanked. I have invested hundreds of hours researching and writing these definitions and over 10 sites that have DIRECTLY COPIED my content rank higher than mine. And my website seems to be penalized somehow and filtered to the bottom. You can imagine that I am a little frustrated.

Any ideas on how I can remedy the situation?

nippi

12:56 am on Mar 11, 2007 (gmt 0)

I think you are fine to use the content of the manufacturer, and link back to them, as long as you add value to the original content.

soapystar

10:36 am on Mar 11, 2007 (gmt 0)

quadrille

I do not think - and did not say - that a SERP where writing original content gives that page a boost over a page scraping it for that content is 'worrying'. I said your call for Google to be the net police was worrying, as 90% of your posts suggest you do not like or trust Google - so no need to be perplexed, just read what I actually said.

I like Google. It's still the best search engine. This is the only forum I really post in consistently, since that's how much I care about Google. Discussing areas where you feel there are issues is different from not liking something. I have an interest in the search engine and that's why I'm motivated to participate in this forum.

Yes you did take a posting about original content, transpose that into a copyright issue and say it would be worrying for Google to do that.

I did not call for Google to be the net police or anything like it. I said I would like to see the original page for content rank above a page scraping it.

Quadrille

11:45 am on Mar 11, 2007 (gmt 0)

Yes you did take a posting about original content, transpose that into a copyright issue and say it would be worrying for Google to do that.

Sorry if I did not make it clearer; discussions about 'original content' are never far from the copyright issue. To call for Google to make decisions on 'original content' would be a legal minefield for Google; they could not do that without considering who 'owned' the content - ie the copyright issue.

I really do not see how the two issues can be kept apart. Surely the only protection the 'first user' has is through copyright law?

You are absolutely right that "In most circumstances pinning first cache to a site will accurately label the original content writers." - but Google would be on dicey ground making that assumption; on the occasions they got it wrong, they could (and would) be sued. For example, suppose I copied a report from a bricks'n'mortar magazine onto my site, and the magazine followed a week later? I could happily claim first use, but copyright law, correctly, would override that claim. And Google using cache dates would be quite wrong.

Such decisions are not Google's to make - unless you believe that Google has a role as net police - and I accept your statement that this is not your intention. But without such authority, Google cannot do it.

When there's a dispute now, Google only intervenes on the basis of "on pain of perjury" statements; they do not want to be involved in any court case on 'first use'.

But I'm happy to agree to differ on these matters. Until my content gets stolen ;)

soapystar

2:34 pm on Mar 11, 2007 (gmt 0)

To call for Google to make decisions on 'original content' would be a legal minefield for Google;

Not at all. The whole point of a search engine is to make decisions regarding how it ranks sites against each other. Deciding that a segment of the algo will rank sites that were first to show a particular snippet of content before a site that has the same content but where that content appeared at a later date is no different from any other element of the algo. It is not an attempt at identifying copyright; it's simply a decision on which factors will boost one page over another. Indeed, I'm sure it would have the side effect of being an effective spam filter. Having said that, my post was within a certain context, the context being that spammy sites with stolen content are ranking higher than the original site. That, I would suggest, shows an underlying problem. It's clear why this happens when Wikipedia shows above original content sites, but when it's quite common for spam sites to rank higher, something seems flawed within the current algo.

Quadrille

2:55 pm on Mar 11, 2007 (gmt 0)

Deciding that a segment of the algo will rank sites that were first to show a particular snippet of content before a site that has the same content but where that content appeared at a later date, is no different from any other element of the algo.

Except (as I've explained) Google could be writing copyright theft into its algo. Which would be "a legal minefield for Google".

It is not sufficient to recognise 'first web publisher', as that person may be a copyright thief, either by stealing from non-web sources (as I explained) or maybe because the original publisher had opted for no cache.

I really don't know how else to explain all this; it simply isn't a matter of 'recognising who was first'. Google, if they agreed to run such a scheme, just cannot ignore who 'owns the copyright'; that's what publishing is all about. Not just algos and who goes first, but who owns it.

Sorry, I recognise that I've not been able to explain this to you, but I have really tried :)

soapystar

5:59 pm on Mar 11, 2007 (gmt 0)

Except (as I've explained) Google could be writing copyright theft into its algo

But while you present looking for first cache of content as undeniably an attempt at identifying copyright, I just don't agree at all that that's the case. IMHO, and in the opinion of at least one other senior poster in this thread, everything I have mentioned has nothing to do with copyright. Clearly we disagree on a fundamental point on which everything else revolves.

Quadrille

7:37 pm on Mar 11, 2007 (gmt 0)

you present looking for first cache of content as undeniably an attempt at identifying copyright

Yup; that's a fact. I can't see any way to avoid it. You mention content theft yourself; you even mention intellectual property, so I really cannot understand why you attempt to separate copyright from your rights to original content.

No copyright, no right to prevent copying, surely?

But, as I've said before, I'm perfectly happy to agree to disagree ;)

And I have to opt out now, going on Hols. :)

Nimzovich

8:15 pm on Mar 11, 2007 (gmt 0)

The general feel is that this happens because your pages are under a demotion or penalty.

This is also my experience.

Marcia's case is different; I think it happens when the two domains have similar reputation.

soapystar

11:03 pm on Mar 11, 2007 (gmt 0)

you even mention intellectual property

Yes, of course; that was a totally different post. It was in response to some messages about DMCAs, and I said I try to solve stuff with pre-emptive emails. And because not everyone works under USA law, I throw in catch-all terms to cover a range of situations. It's a template email.

So boosting first cache pages and how you respond to sites copying your work are two separate issues I addressed in different messages.

CainIV

4:20 am on Mar 12, 2007 (gmt 0)

Having such a link in the body copy seems like one good practice to fight content theft, especially theft of the automated kind. That's not something I tend to do, but I'm starting to adapt. Maybe it's time to place a "permalink" on every page, whether it's a blog or not!

Bingo, and if you are concerned about increased anchors from those scraped websites, then only link to yourself using your URL as the anchor and not any keyword.

Theft of the automated kind accounts for a large percentage of stolen content.

May also help with other canonical issues as well...
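As a rough sketch of the self-link idea above, a page template could emit a permalink whose anchor text is the page's own URL rather than a keyword. The `permalink_html` helper and the example URL are hypothetical; adapt the output to whatever your CMS or templates expect.

```python
from html import escape

def permalink_html(url):
    """Build a self-referencing link whose anchor text is the URL itself."""
    # Escape the URL so characters such as & and " are safe inside HTML.
    safe = escape(url, quote=True)
    return f'<a href="{safe}">{safe}</a>'

print(permalink_html("https://www.example.com/glossary/widget"))
```

If an automated scraper copies the body copy verbatim, the link travels with it and points back to the original page.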

This 70 message thread spans 3 pages.