True to the original idea that earned it the "Farmer" update name, Google wanted to devalue pages that copy content from other sites and aggregate it purely for the rank + ads = money formula (the classic MFA model), with no benefit to the surfer...
I want to give the AdSense team some kudos. While searching for snippets from my homepage, I came across a junk scraper site. They weren't just scraping MY content; they were auto-scraping sentences and paragraphs from hundreds of sites and creating montage pages that made no sense whatsoever. A single paragraph might start out talking about mortgage rates and end with animal hygiene. Total gibberish. Nevertheless, AdSense ads were covering the site. So I filled out the AdSense violation form and pointed out the scraped nonsense that "violated the Google Webmaster Guidelines". The ads were gone within 24 hours.
This was a clear-cut case, which made the task easy for the AdSense team. Not all scrapes are so easy to verify without a formal DMCA investigation. Nevertheless, if you spot such junk running AdSense, fill out their form.
Finally, I have a theory about keeping a redirect file (the one that houses your 302 affiliate redirects) blocked from Googlebot via robots.txt. Over the years, my affiliate links came and went, and Google kept finding the redirect URLs throughout my site but couldn't follow them because of the robots.txt denial. Even affiliate links that I deleted from the file years ago were still showing up last week in a site: search. I only have 4 active affiliate links, but Google was showing 30+. So, to eradicate those, I allowed Googlebot access in robots.txt, 410'd the dead parameters, and submitted the URLs through the Google URL removal tool. Now Google only shows the 4 active redirect links. It is POSSIBLE that, because they were blocked, Google thought they were thin content (similar to tag pages).
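For anyone wanting to try the same cleanup, here is a minimal sketch. It assumes an Apache server with mod_rewrite and a hypothetical /redirect/go.php?id=... scheme for the affiliate links; your paths and parameter names will differ:

```apache
# robots.txt (before) -- this is what kept Googlebot out of the redirects:
#
#   User-agent: *
#   Disallow: /redirect/
#
# Step 1: remove the Disallow line so Googlebot can actually crawl the
# redirect URLs and see their real status codes.

# Step 2 (.htaccess in the document root): return "410 Gone" for the
# dead affiliate IDs so Google drops them from the index for good.
# "deadpartner1" and "deadpartner2" are made-up example values.
RewriteEngine On
RewriteCond %{QUERY_STRING} ^id=(deadpartner1|deadpartner2)$
RewriteRule ^redirect/go\.php$ - [G]
```

Step 3, submitting the now-410'd URLs through the URL removal tool, just speeds up what the 410 would eventually accomplish on its own.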
We'll see what happens. For what it's worth, 5 out of 6 Pandalized affiliate sites that I examined were doing the same type of redirects, blocked to Googlebot. Many of the top remaining sites do not, including two of my own sites that actually gained position after Panda. By the way, I am not suggesting that the robots.txt denial in and of itself is bad, but the accumulation of dead/blocked URLs may increase the "thinness" of the site. Make sure Google isn't indexing old/dead redirects that Googlebot can't follow.
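An easy way to audit this is a search-operator check. Assuming your redirects live under a hypothetical /redirect/ path, something like:

```
site:example.com inurl:redirect
```

Anything that shows up there beyond your live affiliate links is a dead or blocked URL that is a candidate for the 410 treatment above.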