Pfui - 11:55 pm on Dec 11, 2010 (gmt 0)
After more pragmatic (heh:) head-banging, two startling updates about:
1.) Google Web Preview (GWP)
To date, this UA continues to hit pages, files, etc. without sending any referer. Then today, exactly 45 minutes (to the second) after hitting one page and its files, GWP came back with a ref, a FAKE one:
Mozilla/5.0 (en-us) AppleWebKit/525.13 (KHTML, like Gecko; Google Web Preview) Version/3.1 Safari/525.13
Fake Ref? YES: http://www.freewebsitereport.org/www.mydomainhere.com
What the--? Anybody else seeing refs of any kind from GWP? Or the same kind?
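For anyone who wants to check their own logs for the same thing, here is a rough Python sketch, not a finished tool. It assumes your server writes Apache's "combined" log format. The function name preview_hits, the sample IPs (from the 192.0.2.x documentation range), the timestamps, and the /page.html path are all made up for illustration; only the UA string and the referer come from the hits described above.

```python
import re

# Apache "combined" format (assumed):
# ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

GWP_UA = ("Mozilla/5.0 (en-us) AppleWebKit/525.13 (KHTML, like Gecko; "
          "Google Web Preview) Version/3.1 Safari/525.13")

def preview_hits(lines, marker="Google Web Preview"):
    """Return (time, request, referer) for every hit whose UA contains marker.

    referer is None when the log shows an empty or "-" referer.
    """
    hits = []
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m and marker in m.group("agent"):
            ref = m.group("referer")
            hits.append((m.group("time"), m.group("request"),
                         None if ref in ("", "-") else ref))
    return hits

# Invented sample lines: IPs, times, and paths are placeholders.
SAMPLE_LOG = [
    '192.0.2.10 - - [11/Dec/2010:10:00:00 +0000] "GET /page.html HTTP/1.1" '
    '200 5120 "-" "' + GWP_UA + '"',
    '192.0.2.10 - - [11/Dec/2010:10:45:00 +0000] "GET /page.html HTTP/1.1" '
    '200 5120 "http://www.freewebsitereport.org/www.mydomainhere.com" "'
    + GWP_UA + '"',
    '192.0.2.77 - - [11/Dec/2010:10:46:00 +0000] "GET / HTTP/1.1" '
    '200 900 "-" "Mozilla/5.0 (compatible; ExampleBrowser)"',
]

for when, request, referer in preview_hits(SAMPLE_LOG):
    print(when, request, referer or "(no referer)")
```

To run it on a real log, pass an open file instead of SAMPLE_LOG, e.g. preview_hits(open("access.log")), with the path adjusted to wherever your logs live.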
Also, on what I presume must be a related front --
2.) Google Webmaster Tools (GWT)
If it's been a while since you checked your site(s) via Google's "site:yourdomainhere.com" feature (alt: "site:yourdomainhere.com yourdomainhere"), DO. IT. NOW.
This morning I discovered literally thousands of thou-shalt-never-crawl files in a score of thou-shalt-never-crawl directories wide open in the SERPs, despite their being disallowed in robots.txt for years, and despite my blocking Googlebot, numerous other Google UAs, googlebot.com, and google.com via dir-level .htaccess for years, just in case. That leaves only one UA from bare Google IPs...
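Before blaming the crawler, it may be worth confirming that the robots.txt rules really say what you think they say. Here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt content, the paths, and the helper name disallowed_paths are hypothetical, and Python's parser only approximates how any particular crawler reads the file, so treat the result as a sanity check on the rules as written, nothing more.

```python
from urllib.robotparser import RobotFileParser

def disallowed_paths(robots_txt, user_agent, paths):
    """Return the paths that robots_txt forbids for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse text directly, no network
    return [p for p in paths if not parser.can_fetch(user_agent, p)]

# Hypothetical rules and paths, for illustration only.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
Disallow: /cgi-bin/
"""

PATHS = ["/private/a.html", "/index.html", "/cgi-bin/x"]

for path in disallowed_paths(SAMPLE_ROBOTS, "Googlebot", PATHS):
    print("disallowed for Googlebot:", path)
```

Paste in your real robots.txt and a handful of the URLs that turned up in the site: results. Any URL missing from the returned list is one the file does not actually disallow for that UA.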
The first domain I checked was a big one. Then I checked a much smaller one and again found loads of robots.txt-disallowed files and dirs. (I'll tackle other domains when my eyes aren't crossed from cross-checking to and from too many windows and tabs.)
It took me over an hour to systematically use GWT's URL Removal Tool for the www. form of the domains. Here's hoping I don't have to duplicate my efforts for the non-www version of sites. Here's hoping you don't find the mess I did!