Forum Moderators: Robert Charlton & goodroi
I have another site on which I've done the same thing, although with fewer manufacturers' products. That site has been online for well over two years, and ranks in the top five for almost every manufacturer's product (i.e., "Acme Hammers").
My site also has press releases which, of course, have an even higher rate of similarity.
If that rate of duplicate content/similarity were the cause for my site dropping like a rock in the rankings, then wouldn't it follow that the older site I have would be penalised as well?
My site was moving up nicely in the rankings until the editors at ODP decided that it wasn't in the correct category, and eliminated the site from the directory. On that very same day, my site disappeared from Google's directory, and on that same day my pages dropped dramatically in the SERPs.
I've emailed Google to ask if my site is being penalised for duplicate content. As soon as I get a reply, I'll post it.
You aren't going to get an answer; it's their policy not to comment.
I have a question for you: if a duplicate content filter is tripped on the main page of your website, index.htm, will the "penalty" it causes affect all the additional pages in your site?
I would guess that if the PR of the index suffers in the process, then since most internal pages receive their PR via the index/default, it could do damage right across the board. This assumes that PR actually means something in the serps, of course.
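That guess - PR flowing from the index page down to internal pages - can be illustrated with a toy power-iteration PageRank over a made-up three-page link graph. This is only the textbook formula, not whatever Google actually runs:

```python
def pagerank(links, d=0.85, iters=50):
    """Toy power-iteration PageRank; links maps page -> list of outbound pages."""
    pages = list(links)
    n = len(pages)
    pr = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        new = {p: (1 - d) / n for p in pages}
        for p, outs in links.items():
            if outs:
                share = pr[p] / len(outs)
                for q in outs:
                    new[q] += d * share
            else:
                # dangling page: spread its rank evenly over everything
                for q in pages:
                    new[q] += d * pr[p] / n
        pr = new
    return pr

# hypothetical site: index links to two internal pages, which link back
ranks = pagerank({"index": ["a", "b"], "a": ["index"], "b": ["index"]})
```

In this toy graph the index page ends up with the largest share, so anything that knocked its rank down would starve the internal pages too - which is the poster's point.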
On-topic: we have pages of our own original content that are textually 90% identical to pages used as articles (donated by us) that are posted on other sites. The page templates are totally different, though. Both their pages and ours show up fine, although ours do somewhat better, apparently because of higher PR.
The dup index.htm is likely what killed me. ALL pages got penalized. Most visitors were coming to internal pages so it was a shock.
I got rid of the duplicate content, got rid of the old homepage by renaming it to index.html, and am now waiting for it to be 'found' by Google. A by-product of this is that I've also got rid of many hundreds of spammers linking in.
I'm starting to see a great deal of Googlebot activity over the last 36 hours, so I am hopeful.
Toolbar PR in no way recognises the application of duplicate content filters, or any other filter/'penalty' for that matter, whatsoever. In fact, toolbar PR is probably one of the most misleading indicators relied upon by the SEM industry today.
You might be right. Now that you mention it, I recall having our index page disappear a couple of years ago thanks to an incoming link without the www (which I then fixed), and inner pages were still in the serps. All the same, G seems so neurotic these days that I'd rather not have it happen again.
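The www/non-www problem mentioned above is a canonicalization issue: the same page is reachable under two hostnames, which a search engine can treat as duplicates. The usual fix is to 301-redirect everything onto one preferred host. A minimal sketch of the URL rewrite (the hostname is made up):

```python
from urllib.parse import urlsplit, urlunsplit

def canonical_host(url, preferred="www.example.com"):
    """Rewrite any URL onto the single preferred hostname."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, preferred, parts.path,
                       parts.query, parts.fragment))

# a server would answer the non-www request with a 301 to this URL
canonical_host("http://example.com/page.htm")  # "http://www.example.com/page.htm"
```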
Another interesting point raised is that of only individual pages being banned; I had assumed that a penalty affected the entire site, not just a page.
Here's the reply I received today from Google:
<snip>
***************
Well, I've been following these guidelines for years. In fact, I still have sites that are in the top five for search results, all thanks to Brett's 26 rules primer.
If going from #22 to #495 for a particular keyword search isn't a penalty, I'd like to know what is.
Nevertheless, I'm just going to stand pat and see what develops.
[edited by: Woz at 7:49 am (utc) on Feb. 25, 2005]
[edit reason] Sorry, no emails, please paraphrase - See TOS#9 [/edit]
I think G has tightened the dupe penalty a LOT. If you have a huge template and only a few paragraphs of unique text, I think you're toast because your pages are too similar. Google apparently doesn't realize that not every page can contain 2,000 or more words. Not everyone has a forum or news stories.
Perhaps I might raise a few more points.
My site isn't huge: it's roughly 1,500 pages, of which approximately 1,000 are pages that show descriptions, specifications and photos of specific manufacturers' products. As I said before, the similarity varies from 37% to 58%.
Another section of my site is devoted to legislative issues regarding my product niche. As I said, these are reprints of press releases.
And this is where it gets interesting. There's a well-known site out there that sends out your press releases nationwide... for a fee: $495 for nationwide distribution. This site has a PR of 8, and the individual press release pages are PR8 as well.
I ran a comparison between the "My Political Group" press releases on my site, the groups' press release on their own sites, and the press release on this PR8 site. The content similarity was highest for the PR8 site at 90%.
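The similarity percentages being thrown around in this thread can be approximated with a word-shingle comparison. The posters don't say what tool produced their figures, so this is only one plausible way to arrive at such a number:

```python
def shingles(text, k=4):
    """Set of overlapping k-word shingles from a text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def similarity(a, b, k=4):
    """Jaccard similarity of two texts' shingle sets, as a percentage."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa or not sb:
        return 0.0
    return 100.0 * len(sa & sb) / len(sa | sb)
```

Identical texts score 100%, unrelated texts near 0%, and a press release lightly edited by each outlet lands somewhere in between - roughly what the 37-90% figures in this thread describe.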
Another WW member stickied me with a link to his competitors for his phrase "Acme Model F850 Digital Wonder."
I was surprised that the top two results for that phrase were sites that were 88% similar in content to other sites, including the manufacturer's.
If I have to go and do re-writes of roughly 1,000 pages, then I guess I'll have to. But, seeing the Google results for pages that are nearly identical, I have to wonder if that's where the penalty lies.
I also believe it's kind of a duplicate content issue, and I have a good argument for that. Some weeks ago I changed my navigation to no longer include a TopicID. I am running a software website with thousands of product listings - each one contained a ProductID and a TopicID. The TopicID was used to open up the navigation, e.g. TopicID=234 was Internet and TopicID=543 was Tools.
I had very nice rankings - but discovered that the bot had to work a lot... it was possible to have the same product in several categories, which led to URLs like these:
product.asp?TopicID=234&ProductID=400
product.asp?TopicID=543&ProductID=400
So the bot read the same product twice and wasn't able to tell which products received which links internally. I changed the URL across the board to:
product.asp?ProductID=400
The positive effect? The bot is now able to read in all the products, and the internal link structure is much cleaner. The negative effect? I dropped from the rankings, and I assume Google is no longer able to recognize my pages as unique. Without an opened navigation for each page, the weight of the pages has somewhat gone... because they now all look rather similar - only the product description differs.
Well, looks like I shot myself in the foot...
itloc
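The cleanup itloc describes amounts to collapsing every TopicID variant of a product URL onto one canonical form. His site is ASP, so this Python sketch is purely illustrative of the mapping (and the old URLs would still need a 301 to the new ones to avoid looking like duplicates):

```python
from urllib.parse import urlparse, parse_qs, urlencode

def canonical_product_url(url):
    """Drop the navigation-only TopicID so each product has exactly one URL."""
    parts = urlparse(url)
    params = parse_qs(parts.query)
    params.pop("TopicID", None)  # navigation state, not content identity
    query = urlencode({k: v[0] for k, v in sorted(params.items())})
    return f"{parts.path}?{query}" if query else parts.path

# both old variants now map to the same canonical URL
canonical_product_url("product.asp?TopicID=234&ProductID=400")  # "product.asp?ProductID=400"
```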
I emailed Google about the same issue two weeks ago. Many of my listings disappeared with this last update. I sent them an email (at their help@google address) and had a reply within 24 hours stating that my domain was not banned or penalized at all. They were just telling me that their results change all of the time.
They didn't give me any insight as to why my site got affected, but it was an answer that I truly appreciated.
After Allegra, I assume. However, it's difficult to say exactly what caused the drop. The cleaned listings ranked very well for around two weeks (page 1) - then the indexed page count began to fall, the backlinks decreased, and Allegra really started to kick in.
The idea was just to expose the content that I have a little better - to decrease the necessary load for the bot. I thought that was a good idea - from a search engine's perspective.
I will wait around four weeks and then see what's happening - so far I've lost thousands of page 1/2 keywords...
Regards
itloc
See message 12. I should've clarified it back then.
Didn't want to start a new thread, so I'll post my question in here.
If G! or Y! has given you a duplicate penalty, how do you know?
My site is new, but after reading the posts in here on duplicate content, I'm thinking maybe my site is being classed as duplicate content.
In the Google toolbar I'm 0/10; if I had some kind of penalty with G!, would the bar go grey?
Welshy
Is 54% still too high? How are these site escaping the dupe content filter?
Honestly, we don't know, and we're all just speculating - at least the people posting here. GoogleGuy knows, but he isn't stupid enough to post that here. 54% causing a dupe penalty? I doubt it, but what do I know. If this is true, then G has to decide that siteX is the original and everyone else is a dupe. If you have too many "dupes", you won't even rank for yourname. Maybe those sites are "authority" sites or something. Technically, Yahoo News and 99.99% of newspaper sites should be penalized too, since the same story (I'd guess that 80+% of them are AP, Reuters, AFP, UPI, NYT etc.) appears on 1,000+ other newspaper sites.
If true, G has opened a Pandora's box. I can think of many sites that have short movie, script, website or product reviews, jokes, quotes, pictures with captions, definitions or information that can be just a few sentences (it doesn't have to be 500 words long to be useful).
If you like, and if it's allowed under Webmaster World's terms of service, sticky me and I can give you an example.
Just about all of the posts about sites disappearing during Allegra are conjecture. But there seems to be some commonality.
Some have high duplicate content percentages. Many, if not most, are new sites.
Could it be that the dupe content filter is being applied only to new sites?
I just ran the same search with filter=0, but it stopped at 900, so I could not see the rest of the 11 million results. What's going on? I mean the search with the omitted results included.