Sgt_Kickaxe - 10:12 am on Jan 13, 2013 (gmt 0)
My site was never hit by Panda or Penguin either, but I was lax in how I handled image hotlinkers and content scrapers, so I did see an increase in links to my site using broken urls that, unfortunately, resolved. Eventually I noticed that some pages were being outranked by these scrapers, so I took DRASTIC action.
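For context, the standard fix I'd been skipping is referer-based hotlink protection. A minimal .htaccess sketch, assuming Apache with mod_rewrite (example.com is a stand-in for the real domain):

    RewriteEngine On
    # Allow empty referers (direct requests, some proxies) and the site itself;
    # any other site embedding an image gets a 403.
    RewriteCond %{HTTP_REFERER} !^$
    RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/ [NC]
    RewriteRule \.(jpe?g|png|gif)$ - [F,NC,L]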
- I turned off the CMS function that was allowing badly formed urls to redirect, and then, using htaccess, I made sure that if you didn't visit an EXACT url you got a 404 on purpose. Overnight I created 20,000+ 404s, which signified that I had a bigger scraper problem than I anticipated. ALL manner of broken urls started showing up in GWT and traffic fell 50% immediately.
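The mechanics depend on the CMS, but the whitelist approach in .htaccess looks roughly like this; a minimal sketch assuming Apache with mod_rewrite, with hypothetical URL patterns standing in for the site's real ones:

    RewriteEngine On
    # Whitelist the exact url shapes the site actually serves;
    # anything that doesn't match gets a hard 404, no redirect.
    RewriteCond %{REQUEST_URI} !^/$
    RewriteCond %{REQUEST_URI} !^/articles/[a-z0-9-]+/$
    RewriteCond %{REQUEST_URI} !^/(images|css|js)/
    RewriteRule ^ - [R=404,L]

With R=404 the substitution is dropped and Apache serves its normal 404 page, so scraped variations like /articles/My-Post or /articles/my-post.html stop resolving.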
The thing is, my ranked pages never faltered: they held their rank, the number of indexed pages hasn't changed, and neither have the pages that were indexed. How I lost 50% of my Google traffic without my rankings changing is still beyond me.
Recovery - I'm still down 40% two months later, meaning 10% has returned. I don't care, either; I rather like knowing that my site has zero low quality pages, and no unwanted urls that resolve or redirect. It's super clean now.
What I did to fix the 404s in GWT was to block Googlebot from visiting the invalid pages using robots.txt. I don't recommend this: Google will still index the urls you've blocked and will add a "due to robots.txt we can't tell you what's here" type description. If the page is gone and you are returning a 404 error, that's 100% what you want to be doing. No redirects, no blocking, just 404s. Perfect.
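For reference, the robots.txt blocking looked something like this; the paths are made-up examples, since the actual invalid patterns weren't posted:

    # Googlebot supports * wildcards in Disallow rules.
    # Hypothetical "badly formed url" shapes:
    User-agent: Googlebot
    Disallow: /*?replytocom=
    Disallow: /*/page/
    Disallow: /index.php/

Again, the tradeoff is as described above: blocked urls can still sit in the index as URL-only entries with that empty description, whereas a plain 404 eventually drops them.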