incrediBILL - 9:45 pm on May 6, 2013 (gmt 0) [edited by: incrediBILL at 2:02 am (utc) on May 7, 2013]
If you think your site got PENGUINized, as opposed to mine that got BettyDavisEyesed, and you think a link scheme was involved, let me put forth this scenario of how Google probably finds those link schemes, because they're trivial to spot if you're Google.
Think of Google like the USGS, but instead of seismic monitors they have traffic monitors installed all over the web, including but not limited to the bandwidth backbones themselves.
1. Google knows where searchers go from Google by tracking their clicks.
2. Google knows where any traffic came from landing on sites with AdSense/Analytics/etc.
3. Google now has monitors all over the web for G+ "likes" or whatever.
4. Google 'safe surf' and suggestions used in several browsers tell them where you're going even when you're not in Google.
5. Google PR in the toolbar or the SEO tool you always use also tells Google what sites you visit or investigate for SEO purposes. SEO tools most likely out an SEO's backlink buying because of the UA they use; more likely it's odd, easily spotted activity, so Google already knows where you're buying links from this activity alone, but I'm getting ahead of myself.
Now, given just what I've mentioned above, Google has a pretty good idea which directories on the web have traffic landing on them and leaving them for other sites, so they know if it's a sham site or not, and there are many.
They also know if there are sites out there nobody uses except to build cross linking for PR: nobody visits these sites, they generate no traffic, and the links generate no traffic.
Basically, Google is like Santa Claus: they've been making a list, they know if you're being naughty or not, and they know if your site is an SEO sham or not.
I would guess they err on the side of caution and even sites with a reasonable trickle of traffic don't fall into the bad link list, but the tens of thousands of SEO trash sites stick out like a sore thumb when you run a report comparing all sites around the world that generate traffic in descending order of volume and all these sites are at the very rock bottom.
Forget bounce rate, you would have to have someone bouncing from them in the first place.
This is all I would do, as described above, to find all the link scheme sites out there buying and selling links, because no matter how hard you try to hide your activity, you can't hide a lack of activity.
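To make the "run a report" idea concrete, here is a rough Python sketch of what such a report could look like. Everything in it is my own assumption for illustration: the SiteTraffic record, the min_activity cutoff of 10 and the example domains are all made up, and nobody outside Google knows what they actually run.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SiteTraffic:
    domain: str
    visits_in: int   # observed visits landing on the site (hypothetical signal)
    clicks_out: int  # observed clicks leaving the site through its links


def traffic_report(sites: List[SiteTraffic],
                   min_activity: int = 10) -> Tuple[List[SiteTraffic], List[str]]:
    """Rank sites by total observed activity, descending, and list the
    rock-bottom ones whose activity falls below min_activity."""
    ranked = sorted(sites, key=lambda s: s.visits_in + s.clicks_out, reverse=True)
    suspects = [s.domain for s in ranked
                if s.visits_in + s.clicks_out < min_activity]
    return ranked, suspects


# Made-up sample data, purely illustrative.
sites = [
    SiteTraffic("link-farm-001.example", 2, 0),
    SiteTraffic("busy-directory.example", 52000, 31000),
    SiteTraffic("link-farm-002.example", 0, 0),
    SiteTraffic("small-hobby-blog.example", 140, 35),
]

ranked, suspects = traffic_report(sites)
for s in ranked:
    print(s.domain, s.visits_in + s.clicks_out)
print("suspects:", suspects)
```

Note the small hobby blog with a trickle of traffic stays off the suspect list, which is the "err on the side of caution" part: only the sites with essentially nothing going in or out get flagged.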
Now comes the fun part, if they have any way of tracking the people going to and from those bad sites, most likely just the people looking for links and buying them, they've also tracked you and all your other traffic sources assuming you left them a trail to follow and didn't use a TOR proxy as you did your link buying.
Additionally, a lot of these link scheme sites had unrelated links on the same page, and pages with scrambled link topics make no sense, since crochet and paintball don't mix, for example, so those kinds of things are easy to spot. Some sites always put the sold links in the same spot at the bottom of thousands of sites, and they're always off topic, etc. Lots of profiles to identify: build a list of those profiles, click <RUN> and VOILA! Penguin.
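Two of those profiles are simple enough to sketch in Python. This is only a guess at the shape of such a check, not anything Google has published: the Link and Page records, the topic labels, and the max_topics and min_share thresholds are all hypothetical, and it assumes some other system has already assigned a topic to every page and link target.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Link:
    target_topic: str  # topic of the page being linked to (assumed known)
    position: str      # "body", "sidebar" or "footer"


@dataclass
class Page:
    url: str
    topic: str
    links: List[Link]


def has_scrambled_links(page: Page, max_topics: int = 2) -> bool:
    """Profile 1: links on one page span more unrelated topics than max_topics."""
    off_topics = {l.target_topic for l in page.links if l.target_topic != page.topic}
    return len(off_topics) > max_topics


def has_off_topic_footer_links(page: Page) -> bool:
    """True if the page has at least one off-topic link in its footer."""
    return any(l.position == "footer" and l.target_topic != page.topic
               for l in page.links)


def matches_footer_profile(pages: List[Page], min_share: float = 0.8) -> bool:
    """Profile 2: most pages carry off-topic links in the same footer spot."""
    if not pages:
        return False
    hits = sum(1 for p in pages if has_off_topic_footer_links(p))
    return hits / len(pages) >= min_share


# Made-up example: a crochet site with sold links in every footer.
sold = [Link("paintball", "footer"), Link("casino", "footer"),
        Link("payday-loans", "footer")]
pages = [Page(f"http://crochet.example/p{i}", "crochet",
              [Link("crochet", "body")] + sold) for i in range(5)]

print(has_scrambled_links(pages[0]))   # three unrelated topics on one page
print(matches_footer_profile(pages))   # every page has off-topic footer links
```

A real system would need fuzzier topic matching than a straight string comparison, but the point stands: once you have the profile written down, running it over the whole web is just a batch job.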
Then they give everyone a disavow link tool and like a bunch of scared kids in school when one guy gets busted for pot, everyone rats on everyone and even the sites they didn't suspect of selling links are being outed. Nice.
People in the past have scoffed that Google could know about all their tricks but I contend Google is a good angler and instead of reeling in all the misbehaving SEO fish too early, they've given them lots of line to run and run before now reeling them in.
More likely they didn't have a coherent platform to assimilate everything properly, and the sheer volume of data in disparate formats made it unmanageable. But as we've noticed on the front end with all the privacy merges and more associated with G+, the same has probably been going on internally on the back end, and this is the result.
If history has taught us nothing else, it was BEWARE OF GREEKS BEARING GIFTS, and all those freebies, including AdSense money, turn out to possibly be one big elaborate trap to control how the rest of the world behaves on the web, to make sure AdWords revenue is maximized by eliminating all possible competition.