1script - 7:28 am on Sep 8, 2011 (gmt 0)
So, I decided to check which pages they looked at before banning (de-indexing) one of my sites. I was hoping they had found something really, really bad on my site, and that if I could find it by following their tracks and then remove it, my subsequent reconsideration requests would be more successful. One such request has already been rejected.
The tool used in the study: awk [the-art-of-web.com]. Got some great samples from that site.
Anyhow, after much awking, I worked out a command that seems to grab the very log entries I'm looking for: visits from the Plex (by IP) made with an actual browser rather than a bot. I also used superclown2's idea from here [webmasterworld.com] that the raters are coming in on Macs.
So I ran this on my August logs (both ban and recon request were last month):
awk -F\" '($6 ~ /Macintosh/)' *.com | awk '($1 ~ /^(70\.90\.219|70\.89\.39|70\.32|64\.233|216\.239|209\.85|199\.87|173\.194|74\.125|72\.14|66\.249|66\.102)\./)' > ~/google_visits_on_Mac_most_IPs.txt
(It should be a one-line command. Run it in your ~/access-logs directory; the resulting selection of human visits from Google will be in your home directory: ~/google_visits_on_Mac_most_IPs.txt )
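For reading the resulting file, something like the following rough sketch may help. It assumes the logs are in the standard Apache combined format with no spaces in the user field; the function name summarize_visits is made up for illustration.

```shell
# Hypothetical helper: print "timestamp IP requested-path" for each line
# of a combined-format access log (assumed layout, see note above).
summarize_visits() {
    awk '{
        ts = $4               # looks like [08/Aug/2011:07:28:00
        sub(/^\[/, "", ts)    # drop the leading bracket
        print ts, $1, $7      # $7 is the path in "GET /path HTTP/1.1"
    }' "$1"
}

# Usage, with the file produced by the command above:
# summarize_visits ~/google_visits_on_Mac_most_IPs.txt
```

This only lists which page was requested and when. It says nothing about how long the visitor stayed on it.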
So, anyhow, I was able to see a visit about 1 day (approx 25 hrs) before each of the events: the ban and the response to my recon request. Both were from the 216.239.x.x subnet, although earlier there were hits from other Google networks, too.
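To see at a glance which networks the filtered hits came from, a tally by the first two octets is one option. This is only a sketch: it assumes the IP is the first whitespace-separated field on each line, and count_by_subnet is a made-up name.

```shell
# Hypothetical helper: count hits per x.y.*.* network, busiest first.
count_by_subnet() {
    awk '{
        split($1, octet, ".")               # $1 is the client IP
        hits[octet[1] "." octet[2]]++       # key on the first two octets
    }
    END {
        for (net in hits) print hits[net], net ".x.x"
    }' "$1" | sort -rn
}

# Usage, with the file produced by the command above:
# count_by_subnet ~/google_visits_on_Mac_most_IPs.txt
```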
I was rather disappointed to see that before banning the site the rater visited a very drab and ordinary page on my site. Not the smoking gun I was looking for, like incriminating evidence of a hacker break-in or some such. Also disappointing is the fact that they visited one page only. I can't tell how long they stayed on the page, but can you make such a drastic decision about a 400,000+ page site by looking at just one of those pages?
Probably even more disappointing is the way they treated the reconsideration request. A person came in and, again, looked at only a single page. This time it was simply the homepage. My site is a forum, so the homepage contains pretty much only a list of the most recent threads - not much else to see there. At least the page they looked at before banning was representative of the layout (including the ads layout, which I hear they hate so much now). The only conclusion they could possibly have drawn by looking at the homepage while weighing my reconsideration request was that the site's still up. Apparently, that was enough to reject the request.
Anyway, hard data confirmed: your livelihood is in the hands of a typical overworked, uninterested American (IP geo) corporate employee. No surprise here...
Anyone want to chime in about what raters are looking at on your site(s)?