Quite a few of my clients' sites have disappeared from the sj and www3 datacenters... Although the update has not finished yet, I strongly believe their sites will not be included in it. So, I dug in to find what's causing it. After hours of exploration, I found one similarity among these troubled clients' sites... They all seem to have "copied" some content from elsewhere... However, some of them have the permission of the original authors of the articles (mostly new scientific research articles)... Over the past months, I have told their web editors to make sure not to use content directly from other web sites, and they have modified some of the content and layout quite nicely. And they had no problem with Google. Until this mysterious update... IT SEEMS THEY ARE CAUGHT BY A NEW FILTER! So, my guess is that Google has spent quite some time updating their algo, and this new algo contains a much more powerful duplicate-content detector...
For those of you who still can't find your sites on www3 and sj, maybe you want to make sure you do not have duplicated content, even if it is copyright "legit" and had no problem before...
Please do not flame me; this is only my theory, and it may or may not be true.
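Nobody outside the 'plex knows what the new filter actually does, but for illustration, one classic way a duplicate-content detector could work is w-shingling plus Jaccard similarity. A minimal sketch in Python (the function names and the threshold are my own invention, not anything Google has confirmed):

def shingles(text, w=5):
    """Break text into overlapping w-word 'shingles'."""
    words = text.lower().split()
    return {" ".join(words[i:i + w]) for i in range(len(words) - w + 1)}

def jaccard(a, b):
    """Fraction of shingles two pages share (0.0 to 1.0)."""
    return len(a & b) / len(a | b) if (a or b) else 0.0

def looks_copied(page_a, page_b, threshold=0.5):
    """Flag two pages as near-duplicates; the threshold is arbitrary."""
    return jaccard(shingles(page_a), shingles(page_b)) >= threshold

Two pages that share big blocks of text share most of their shingles no matter how the surrounding layout is shuffled, which would explain why just changing the template wasn't enough to escape this update.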
Question restated:
"If a site is banned for duplicate content, and the owner just gives up the domain and starts a fresh one, can he/she still use the clean (non-duplicated) content from the old site? Does he/she need to shut down the old domain (or delete that clean content from it)?"
But it's good to see many of you support my theory. :) I'm sure Google's new filter can help searchers (that's what they care about anyway), but the spammy results on sj just don't cut it... I hope GG is aware of this.
I was planning to concentrate more on content than on optimization "trends", but I guess I'll have to keep investing precious time in reading and following Google's "rules".
What about duplication within a site, e.g. template pages with small variations between them? Would this be considered duplication?
I would say yes. I have a domain that had 23K+ pages indexed, and after this update it looks like it's down to a mammoth 71 pages. All of the pages share snippets in the body text and are designed with the same template. I also went a bit overboard with some of the internal navigation (i.e. identical lists of links on many pages).
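That would fit the shingling idea from earlier in the thread: template pages that differ by only a few words share almost all their shingles. A self-contained, purely hypothetical demo:

def shingles(text, w=5):
    words = text.lower().split()
    return {" ".join(words[i:i + w]) for i in range(len(words) - w + 1)}

# Two "different" pages built from the same template text.
shared = ("this page is part of our product catalog where every item "
          "ships free and comes with a thirty day money back guarantee "
          "plus customer reviews and detailed specs for easy comparison")
page_a = shared + " red widget deluxe overview"
page_b = shared + " blue gadget mini overview"

a, b = shingles(page_a), shingles(page_b)
print(len(a & b) / len(a | b))  # roughly 0.8: the pages look like near-duplicates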
"National Anthem God Save The Queen"
1,280 pages show up that contain copies of the British national anthem. That's a lot of unpenalized duplication.
For example, say you have a page with a portion of copied text and a portion of unique content. Now if someone searched for a keyphrase out of the copied part, there's a fair chance your result may not be shown.
But what if someone did a search for a keyphrase out of the unique part? Would your page still be penalised for the copied part? Do these dupe filters work per page? Per site? Per search? Does this make any sense?
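No one here can answer that definitively, but a per-search filter could in principle work something like this. A purely hypothetical sketch (the function and the whole mechanism are my guess, not anything Google has confirmed):

def suppress_result(query, copied_text, unique_text):
    """Hide the page for this query if the keyphrase only matches
    the copied part of the page, not the unique part."""
    q = query.lower()
    return q in copied_text.lower() and q not in unique_text.lower()

# Reusing the anthem example from above:
copied = "god save our gracious queen long live our noble queen"
unique = "our history of brass band arrangements of the anthem"
print(suppress_result("god save our gracious queen", copied, unique))  # True: hidden
print(suppress_result("brass band arrangements", copied, unique))      # False: shown

Under a scheme like that, the page would vanish only for searches matching the copied text, which would explain pages that rank fine for some phrases but are invisible for others.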
It is a site with a thematic book list ...
For example there are:
A) duplicated structural information
- on each category page there is the same short paragraph duplicated from the homepage, which gives visitors brief information about the site's motivation...
- duplicated navigation elements
- a common footer on each page
B) different views of the same information
- several books are sorted into more than one category
==> the bibliographical information of a book is duplicated several times (at least twice, because there is also a page for each book)
- book reviews also exist twice: first on the page of the book being reviewed, and second on a page compiling all reviews from one reviewer...
I checked the Google index: there are no banned pages on my site, but several pages are omitted from the SERPs - which is not a problem for me, because at least one of those pages is listed for the relevant keyphrases.
It could become a problem if someone else copied my site's content.
Regarding the new indexes (indices?), we see in -sj and -fi that some well-performing sites containing dup content are still doing fine... but the thing is, I wouldn't call their dups spam...
This is tricky, because there can be very legitimate reasons for sites to dup content. "Reprinting with permission" is an old, well-established publishing convention, and I'm not sure that G could differentiate between dup content that is legit versus spammy dups... at least in their algos.
What I do believe is that G may be applying some sort of dampening filter on pages with duped content (older than the tagged original content), but that's nothing new.
The examples I gave were all perfectly legitimate reasons for content to appear on different websites. One of my clients is a very large movie preview site. We get all our reviews from a content management service. Should we be penalized for that?
What if a doctor writes an essay, has it on his site, and gives me permission to post it on my medical website? Should I be penalized?
That is why I see problems with a policy like the one you describe, IF they really are doing it. (Let's all remember no one can confirm it; it's just conjecture.)
My point was this:
If you have a movie review site (assuming the movie review niche is very spammy), and your reviews are similar to your competitors', I would suggest that Google would remove one of your sites from their database (i.e. gray bar).
It would be better for searchers not to find 2 sites in the top 10 that were very similar, thus it would be better for Google to do so.
That is my theory about the new Google strict algo.
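If that theory holds, the mechanics could be as simple as deduplicating at serving time: walk down the ranked results and drop any page too similar to one already kept. A hypothetical sketch (my own illustration, not Google's actual code):

def word_overlap(a, b):
    """Crude similarity: fraction of distinct words two pages share."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

def dedupe_serp(ranked_pages, threshold=0.8):
    """Keep a page only if it isn't a near-duplicate of a
    higher-ranked page that was already kept."""
    kept = []
    for page in ranked_pages:
        if all(word_overlap(page, k) < threshold for k in kept):
            kept.append(page)
    return kept

# Two sites running the same syndicated review; only the first survives.
review = "a gripping thriller with a twist ending and superb acting"
serp = [review + " reviewed at site one",
        review + " reviewed at site two",
        "an independent take on the film with original commentary"]
print(len(dedupe_serp(serp)))  # 2: the second copy of the review is dropped

From the searcher's point of view that's an improvement; from the webmaster's point of view, whichever copy Google happens to keep wins, and the other one gets the gray bar.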