
Forum Moderators: Robert Charlton & goodroi


Duplicate content filter

At which point does the filter get applied?

     
8:24 pm on Feb 21, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member essex_boy is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 19, 2003
posts:3212
votes: 17


If I have 10% duplicate content, for example reprints of famous interviews with sports stars, would the filter be applied?

Or does your duplicate content level have to be much higher, say over 50%?

11:36 am on Feb 22, 2005 (gmt 0)

New User

10+ Year Member

joined:Nov 18, 2004
posts:40
votes: 0


I don't think 10% or 20% duplication will affect you.

Pradeep SV

10:08 pm on Feb 22, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:July 14, 2003
posts:1783
votes: 0


My site has many, many pages of product descriptions and specifications from particular manufacturers. Thus, the pages are 37% to 59% similar to the pages on the manufacturers' sites.

I have another site on which I've done the same thing, although with fewer manufacturers' products. That site has been online for well over two years, and ranks in the top five for almost every manufacturer's product (ie, "Acme Hammers").

My site also has press releases which, of course, have an even higher rate of similarity.

If that rate of duplicate content/similarity were the cause for my site dropping like a rock in the rankings, then wouldn't it follow that the older site I have would be penalised as well?

My site was moving up nicely in the rankings until the editors at ODP decided that it wasn't in the correct category, and eliminated the site from the directory. On that very same day, my site disappeared from Google's directory, and on that same day my pages dropped dramatically in the SERPs.

I've emailed Google to ask if my site is being penalised for duplicate content. As soon as I get a reply, I'll post it.

10:21 pm on Feb 22, 2005 (gmt 0)

Senior Member

joined:Dec 29, 2003
posts:5428
votes: 0


"I've emailed Google to ask if my site is being penalised for duplicate content. As soon as I get a reply, I'll post it. "

You aren't going to get an answer; it's their policy not to comment.

11:25 pm on Feb 22, 2005 (gmt 0)

Preferred Member

joined:July 19, 2002
posts:415
votes: 0


I have a question for ya: if a duplicate content filter is tripped on the main page of your website, index.htm, will the "penalty" it causes affect all the additional pages in your site? I.e., will one filter penalty trickle down, like PR does, to all pages within your site, hurting them as well?
11:57 pm on Feb 22, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 4, 2002
posts:1687
votes: 0


I have a question for ya: if a duplicate content filter is tripped on the main page of your website, index.htm, will the "penalty" it causes affect all the additional pages in your site?

I would guess that if the PR of the index suffers in the process, then since most internal pages receive their PR via the index/default, it could do damage right across the board. This assumes that PR actually means something in the serps, of course.

On-topic: we have pages of our own original content that are textually 90% identical to pages used as articles (donated by us) posted on other sites. The page templates are totally different, though. Both their pages and ours show up fine, although ours do somewhat better, apparently because of higher PR.

12:32 am on Feb 23, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 22, 2001
posts:781
votes: 0


Toolbar PR in no way recognises the application of duplicate content filters, or any other filter/'penalty' for that matter, whatsoever.

In fact, TB PR is probably one of the most misleading indicators relied upon by the SEM industry today.

12:56 am on Feb 23, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 1, 2004
posts:1987
votes: 0


Tweb,

The dup index.htm is likely what killed me. ALL pages got penalized. Most visitors were coming to internal pages, so it was a shock.

I got rid of the duplicate content, got rid of the old homepage by renaming it to index.html, and am now waiting for it to be 'found' by Google. A by-product of this is that I've also got rid of many hundreds of spammers linking in.

I'm starting to see a great deal of google bot activity during the last 36 hours so I am hopeful.

4:03 am on Feb 23, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 4, 2002
posts:1687
votes: 0



Toolbar PR in no way recognises the application of duplicate content filters, or any other filter/'penalty' for that matter, whatsoever.

In fact, TB PR is probably one of the most misleading indicators relied upon by the SEM industry today.

You might be right. Now that you mention it, I recall having our index page disappear a couple of years ago thanks to an incoming link without the www (which I then fixed), while inner pages were still in the SERPs. All the same, G seems so neurotic these days that I'd rather not have it happen again.

7:28 am on Feb 23, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member essex_boy is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 19, 2003
posts:3212
votes: 17


So what you're saying, then, is that as long as the majority of the site is original content, having some duplicate content (press releases etc.) shouldn't get a penalty applied. Is that correct?

Another interesting point raised is that of only individual pages being banned; I assumed that a penalty or ban affected the entire site and not just a page.

3:41 am on Feb 25, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:July 14, 2003
posts:1783
votes: 0


walkman, I always value your comments on this forum.

Here's the reply I received today from Google:

<snip>

***************

Well, I've been following these guidelines for years. In fact, I still have sites that are in the top five for search results, all thanks to Brett's 26 rules primer.

If going from #22 to #495 for a particular keyword search isn't a penalty, I'd like to know what is.

Nevertheless, I'm just going to stand pat and see what develops.

[edited by: Woz at 7:49 am (utc) on Feb. 25, 2005]
[edit reason] Sorry, no emails, please paraphrase - See TOS#9 [/edit]

4:35 am on Feb 25, 2005 (gmt 0)

Senior Member

joined:Dec 29, 2003
posts:5428
votes: 0


dickbaker,
I actually meant that they will not give you a reason unless you're banned (manually), which is very rare. If you're in the SERPs, even on page 4251454, they will send you that type of e-mail. If you're banned, they'll tell you it's because they couldn't assign PageRank, and to see the guidelines.

I think G has tightened the dupe penalty a LOT. If you have a huge template and only a few paragraphs of unique text, I think you're toast because your pages are too similar. Google apparently doesn't realize that not every page can contain 2,000 or more words. Not everyone has a forum or news stories.

7:00 am on Feb 25, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:July 14, 2003
posts:1783
votes: 0


walkman, I really value your input in these discussions as we all try to figure out what the heck is going on.

Perhaps I might raise a few more points.

My site isn't huge: it's roughly 1,500 pages, of which approximately 1,000 are pages that show descriptions, specifications and photos of specific manufacturers' products. As I said before, the similarity varies from 37% to 58%.

Another section of my site is devoted to legislative issues regarding my product niche. As I said, these are reprints of press releases.

And this is where it gets interesting. There's a well-known site out there that sends out your press releases nationwide....for a fee: $495 for nationwide distribution. The site has a PR of 8, and the individual press release pages are PR8 as well.

I ran a comparison between the "My Political Group" press releases on my site, the group's press releases on their own sites, and the press releases on this PR8 site. The content similarity was highest for the PR8 site, at 90%.

Another WW member stickied me with a link to his competitors for his phrase "Acme Model F850 Digital Wonder."

I was surprised that the top two results for that phrase were sites that were 88% similar in content to other sites, including the manufacturer's.

If I have to go and do re-writes of roughly 1,000 pages, then I guess I'll have to. But, seeing the Google results for pages that are nearly identical, I have to wonder if that's where the penalty lies.

8:13 am on Feb 25, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Nov 7, 2004
posts:139
votes: 0


Just my 2 cents...

I also believe it's kind of a duplicate content issue, and I have a good argument for that. Some weeks ago I changed my navigation to no longer include a TopicID. I run a software website with thousands of product listings; each one contained a ProductID and a TopicID. The TopicID was used to open up the navigation, e.g. TopicID=234 was Internet and TopicID=543 was Tools.

I had very nice rankings, but discovered that the bot had to work a lot: it was possible to have the same product in several categories, which led to URLs like these:

product.asp?TopicID=234&ProductID=400
product.asp?TopicID=543&ProductID=400

So the bot read the same product twice and wasn't able to tell which products received which links internally. I changed the URL generally to:

product.asp?ProductID=400

The positive effect? The bot is now able to read in all the products, and the internal link structure is much cleaner. The negative effect? I dropped from the rankings, and I assume Google is no longer able to recognize my pages as unique. Without an opened navigation for each page, the weight of the pages has somewhat gone, because they now all look kind of similar; only the product description differs.

Well, looks like I shot myself in the foot...

itloc
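itloc's TopicID example above is a classic URL canonicalization problem: the same product reachable at several addresses. A minimal sketch of collapsing such duplicates to one address (in Python, purely illustrative; the parameter name TopicID is taken from the post, and your own list of navigation-only parameters would differ):

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Navigation-only parameters to drop when building the canonical URL.
# "TopicID" comes from the post above; adjust for your own site.
NAV_PARAMS = {"TopicID"}

def canonical_url(url: str) -> str:
    """Strip navigation-only query parameters so that
    product.asp?TopicID=234&ProductID=400 and
    product.asp?TopicID=543&ProductID=400
    collapse to the single address product.asp?ProductID=400."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in NAV_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), parts.fragment))

print(canonical_url("http://example.com/product.asp?TopicID=234&ProductID=400"))
# http://example.com/product.asp?ProductID=400
```

Serving a 301 redirect from the old TopicID URLs to the canonical one, rather than only changing the internal links as the post describes, would also let the bot consolidate the old addresses instead of finding them orphaned.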

8:27 am on Feb 25, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Feb 11, 2005
posts:195
votes: 0


walkman
just to reply to
"You aren't going to get an answer, it's their policy not to comment."

I emailed Google about the same issue 2 weeks ago. Many of my listings disappeared with this last update. I sent them an email (at their help@google address) and had a reply within 24 hours stating that my domain was not banned or penalized at all. They were just telling me that their results change all of the time.
They didn't give me any insight as to why my site got affected, but it was an answer that I truly appreciated.

8:56 am on Feb 25, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Nov 7, 2004
posts:139
votes: 0


sandyeggo...

I did the same ... and received probably the same email. I have to admit, their automated messages look great and give you the warm feeling that the world's number 1 search engine cares... :-)

itloc

9:50 am on Feb 25, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member essex_boy is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 19, 2003
posts:3212
votes: 17


itloc, that's an interesting point. Did the filter kick in before or after the latest update?
10:20 am on Feb 25, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Nov 7, 2004
posts:139
votes: 0


Hi Essex_Boy

After Allegra, I assume. However, it's difficult to say what caused the drop exactly. The cleaned listings ranked very well for around 2 weeks (page 1); then the site count began to fall, the backlinks decreased, and Allegra started to really kick in.

The idea was just to expose the content that I have a little better, to decrease the necessary load for the bot. I thought that was a good idea from a search engine's perspective.

I will wait for around 4 weeks and then see what's happening. So far I've lost thousands of page 1/2 keywords...

Regards

itloc

11:19 am on Feb 25, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Nov 7, 2004
posts:139
votes: 0


and the latest trend:

After Gbot hammered my site and read around 20,000 pages in two days, I lost 7,000 in the index. Could be a temporary thing... but if it sticks, I'm not amused...

itloc

2:17 pm on Feb 25, 2005 (gmt 0)

Senior Member

joined:Dec 29, 2003
posts:5428
votes: 0


"They didnt give me any insight as to why my site got affected, but it was an answer that I truley appreciated. "

see message 12. I Should've clarified it back then

5:06 pm on Feb 25, 2005 (gmt 0)

New User

10+ Year Member

joined:Feb 2, 2005
posts:14
votes: 0


Hi,

I didn't want to start a new thread, so I'll post my question in here.

If G! or Y! has given you a duplicate penalty, how do you know?
My site is new, but after reading the posts in here on duplicate stuff, I'm thinking maybe my site is classed as duplicate content.

In the Google toolbar I'm 0/10. If I had some kind of penalty with G!, would that go grey?

Welshy

5:51 pm on Feb 25, 2005 (gmt 0)

Preferred Member

10+ Year Member

joined:July 5, 2004
posts:470
votes: 0


I'm wondering what sort of standards Google might apply to dupe content within a site. About 100 of my pages show about 54% similarity. My site has been sandboxed since Sept. 23rd. The strange part is, my competitors' pages (all similarly constructed) are upwards of 90% similar, with just a few keywords being replaced on each page, and they continue to be ranked #1, #2, etc. When is dupe content an issue?

Is 54% still too high? How are these sites escaping the dupe content filter?

6:07 pm on Feb 25, 2005 (gmt 0)

Senior Member

joined:Dec 29, 2003
posts:5428
votes: 0


"When is dupe content then an issue? "

Honestly, we don't know and we all just speculating. At least the people posting here. GoogleGuy knows but he isn't stupid to post that here. 54% causing a dupe penalty? I doubt it but what do I know. If this is true, then G has to say that siteX is original and everyone else is a dupe. If you have too many "dupes", you'll not even rank for yourname. Maybe those sites are "authority" sites or something. Technically, Yahoo News and 99.99% of newspaper sites should be penalized too since the same story (I'd guess that 80+% of them are AP, Reuters, AFP, UPI, NYT etc.) appears on 1000+ other newspaper sites.

If true, G has opened a pandora's box. I can think of many sites that have short movie, script, website or product reviews, jokes, quotes, pictures with captions, definitons or information which can be just a few sentences (doesn't have to be 500 words long to be useful).

7:46 pm on Feb 25, 2005 (gmt 0)

Full Member

10+ Year Member

joined:July 11, 2003
posts:276
votes: 0


After being in the supplemental hole in late 2003, I changed all pages that had duplicate content on them to reflect unique content in the first paragraph, and, well, no more supplemental hole for the product pages. From reading Google's site, they state that pages containing duplicate content within a site are placed in their supplemental index and called forth when no other results are available. At least, that's my understanding of what they do with the duplicate content pages within a site.
8:38 pm on Feb 25, 2005 (gmt 0)

New User

10+ Year Member

joined:Feb 2, 2005
posts:14
votes: 0


But how does everyone know if they have duplicate content?
Is there some kind of tool to check?
10:15 pm on Feb 25, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:July 14, 2003
posts:1783
votes: 0


welshy, do a search for something like "similar content checker" or words to that effect.
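For anyone curious what such a checker actually measures: one common rough approach (an assumption on my part; any given tool may work differently) is word-shingle overlap, scoring two pages by the fraction of short word sequences they share. A minimal sketch:

```python
def shingles(text: str, k: int = 4) -> set:
    """Return the set of k-word sequences ("shingles") in the text."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def similarity(a: str, b: str, k: int = 4) -> float:
    """Jaccard overlap of the two texts' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)
```

Two identical pages score 1.0; two pages with no four-word phrase in common score 0.0, which maps roughly onto the percent-similarity figures quoted throughout this thread.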
10:21 pm on Feb 25, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:July 14, 2003
posts:1783
votes: 0


webtress, if pages that are too similar are placed in the supplemental index, then why would my home page--which has zero duplicate content--also be in the supplementals? In fact, every single page on my site is in the supplemental index, similar content or not.
10:49 pm on Feb 25, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Jan 14, 2005
posts:94
votes: 0


What about running a news site with press releases from companies and institutions, where you have almost more than 90% duplicate content? Do you think that kind of site will be dropped from Google's SERPs?

Thanks

12:09 am on Feb 26, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:July 14, 2003
posts:1783
votes: 0


perfectlover, check out one of my last posts (#13) on this topic. There are many sites that do nothing but republish press releases and have 90% similar content.

If you like, and if it's allowed under Webmaster World's terms of service, sticky me and I can give you an example.

Just about all of the posts about sites disappearing during Allegra are conjecture. But there seems to be some commonality.

Some have high duplicate content percentages. Many, if not most, are new sites.

Could it be that the dupe content filter is being applied only to new sites?

1:10 am on Feb 26, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member zeus is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Apr 28, 2002
posts:3468
votes: 18


I just did a search on Google which reported 11,000,000 results, but only 900 were shown; the rest were omitted. That's just a sign that they don't have control over their duplicated and near-duplicated content, as many original sites have been filtered out because of redirecting sites.

I then ran the same search with filter=0, but it still stopped at 900, so I could not see the rest of the 11 million results. What's going on? I said show the omitted results.

This 34-message thread spans 2 pages.
 
