MikeNoLastName - 11:52 pm on Nov 23, 2011 (gmt 0)
tedster: agree 100%
In our case this is only one portion of a large site, but G has indicated that with Panda, relatively small misbehaving sections can affect the ranking of an entire site.
After studying the forums here and in a couple of other places (Google, SEO--), the consensus seems to be (and I agree) that a lot of people who think they have been pandalized aren't being hit for the reason they think. I also think a lot of the fault is G's; at least I'm pretty sure it was in our case. We had dropped from pg 1 to pg 30+ for ALL our primary keywords, which G previously had us 6-packed for. After a week of research I determined the reason, and we are now climbing daily and are back up to pg 3-6 after only a week.
In our case we discovered G had indexed duplicate copies of some of our pages, but in all the cases I could find it wasn't our fault. In one, they had indexed example.com/abc.htm AND example.com/abc.htm/ pointing to the same page, so of course they were identical. As far as I know the latter is not even a valid URL, and it certainly didn't come from our site links. I 301 redirected it in the htaccess and resubmitted it, so it got removed. In another case they were indexing a copy of our home page which we use as a landing page just for Adwords. It IS an exact copy of the home page (minus Adsense), but it has had <META NAME="robots" content="noindex"> in it (Google's own documented preferred way of removing from the index) from the first day it was uploaded! They apparently picked it up from our sitemap, since it is not linked anywhere else except in Adwords searches, and I had to remove it through the webmaster tools page. In most cases these duplicates were nearly impossible to find, since even when you do a site: search G literally HIDES them way at the end in the supplemental results, and I only managed to find most of them by sheer luck. They often don't even appear when you search for a full unique paragraph restricted to site: only! I'm a little afraid of these new G generated titles being mistaken for duplicate pages by their own algo, as I have seen our same URL coming up thrice in the same search under both titles on different pages. G REALLY, REALLY needs a duplicated content report.
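For anyone hunting the same kind of trailing-slash twin, here is a rough Python sketch of one way to spot them in a list of URLs. It assumes you have already collected the indexed URLs by hand (from site: searches, your sitemap, or server logs); the function names canonical and find_url_variants are made up for illustration, and treating a trailing slash as insignificant is only a heuristic that fits the abc.htm vs abc.htm/ case above.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def canonical(url):
    """Lowercase scheme and host, and strip trailing slashes from the path.

    Assumption: a trailing slash never changes which page is served.
    """
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


def find_url_variants(urls):
    """Return {canonical_url: [variants]} for URLs that collapse together."""
    groups = defaultdict(list)
    for url in urls:
        groups[canonical(url)].append(url)
    # Keep only groups with more than one distinct spelling.
    return {key: found for key, found in groups.items() if len(set(found)) > 1}


if __name__ == "__main__":
    indexed = [
        "http://example.com/abc.htm",
        "http://example.com/abc.htm/",
        "http://example.com/other.htm",
    ]
    for key, variants in find_url_variants(indexed).items():
        print(key, "<-", variants)
```

Anything it flags is only a candidate; you still have to confirm both spellings are really in the index before you 301 one of them.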
Anyway, my point is that the syndication/copying issue may not be as big an issue as thought, and what G was referring to as "duplicating content" may in fact apply ONLY to content "on the same site", which we CAN control (although G doesn't make it easy with errors like this). I DO believe that Panda has drastically tightened the penalties on internal copying, which is why we never saw any declines in Jan/Feb on our site (in fact we went up 35% then), but we (and likely others affected because of accidental or deliberate internal duplication) have seen drastic declines with each reported Panda algo change. The difference is that in the past, if you had a duplicate page by accident, pretty much only that one page dropped out of sight; now one or two key pages duplicated in the index can put your ENTIRE SITE off the SERPs, with no easy way to figure out which page did it! At least in my experience. If anyone can't figure out why they have been pandalized to page 30+, I would recommend they look first at G's indexing errors, and then at their own directories for accidentally miscopied duplicate files.
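Checking your own directories for miscopied files can be automated. Below is a minimal Python sketch, assuming you have a local copy of the site and that a "duplicate" means a byte-for-byte identical file; the function name find_duplicate_files and the default *.htm / *.html patterns are my own choices, not anything G documents.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicate_files(root, patterns=("*.htm", "*.html")):
    """Return lists of paths under root whose contents are byte-identical."""
    by_hash = defaultdict(list)
    for pattern in patterns:
        for path in Path(root).rglob(pattern):
            if path.is_file():
                # Hash the raw bytes so identical copies share one key.
                digest = hashlib.sha256(path.read_bytes()).hexdigest()
                by_hash[digest].append(str(path))
    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]


if __name__ == "__main__":
    # "." is a placeholder; point it at your local copy of the site.
    for group in find_duplicate_files("."):
        print("Identical files:", ", ".join(group))
```

This only catches exact copies. A page that differs by a single character (an Adsense block, say) gets a different hash, so it will not show up here.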
Also, for the sake of possibly helping someone else stymied by their own stupid mistakes: we realized on another affected domain that when we removed expired information, we were routinely keeping the URL (which was heavily linked from other sites) and replacing the content with a 90% identical template stub, each one pointing to the same primary info page... Uh-Duh! We weren't intending to spam/cheat, just trying to help other webmasters who had linked to us and to retain backlinks, and in general NOT thinking (the stubs add up over time)! Now we 301 redirect all the expired pages in that area to one single "Expired.htm" page to retain the backlinks while avoiding duplication.
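Near-identical stubs like these will not be caught by an exact-match check, so here is a rough Python sketch for flagging them using the standard library's difflib. It assumes you have already loaded each page's text into a dict keyed by URL; the name near_duplicates is hypothetical, and the 0.9 threshold simply mirrors the "90% identical" figure above rather than any number G has published.

```python
from difflib import SequenceMatcher
from itertools import combinations


def near_duplicates(pages, threshold=0.9):
    """pages maps URL -> page text. Return (url_a, url_b, ratio) tuples
    for every pair whose similarity ratio is at or above threshold."""
    hits = []
    for (url_a, text_a), (url_b, text_b) in combinations(pages.items(), 2):
        # autojunk=False avoids difflib's heuristic that can badly
        # underestimate similarity on long, repetitive texts.
        ratio = SequenceMatcher(None, text_a, text_b, autojunk=False).ratio()
        if ratio >= threshold:
            hits.append((url_a, url_b, round(ratio, 2)))
    return hits


if __name__ == "__main__":
    stub = "This offer has expired. See our main info page for current details. "
    pages = {
        "/a.htm": stub + "Widget A",
        "/b.htm": stub + "Widget B",
        "/info.htm": "Completely unrelated text about something else entirely.",
    }
    for url_a, url_b, ratio in near_duplicates(pages):
        print(url_a, url_b, ratio)
```

It compares every pair of pages, so it gets slow on large sites; running it only over the section you suspect (the expired pages, in our case) keeps it manageable.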
Good luck all, will report if/when we get back to prior rankings.