... a real or imaginary practice of Google in caching web content in certain locales to improve access speed, ...
So if you're concerned about scrapers hitting your site via the Google cache that's linked from the SERPs, you can use the noarchive meta tag.
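For anyone who wants to confirm the tag is really going out on their pages, here is a minimal sketch in Python. It assumes the usual form of the tag, `<meta name="robots" content="noarchive">` (or `name="googlebot"`). The class and function names are made up for illustration and not taken from any particular tool.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives found in <meta name="robots"> / "googlebot" tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name in ("robots", "googlebot"):
            content = attr.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, noarchive".
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def has_noarchive(html):
    """Return True if the page carries a noarchive robots directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noarchive" in parser.directives


# Hypothetical sample pages, for illustration only.
PAGE_WITH_TAG = (
    '<html><head><meta name="robots" content="noarchive">'
    "</head><body>text</body></html>"
)
PAGE_WITHOUT_TAG = "<html><head><title>x</title></head><body>text</body></html>"

print(has_noarchive(PAGE_WITH_TAG))
print(has_noarchive(PAGE_WITHOUT_TAG))
```

This only checks the HTML itself. The same directive can also be sent as an X-Robots-Tag response header, which this sketch does not look at.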
https://webcache.googleusercontent.com/
and-that's-all. https://webcache.googleusercontent.com/search?q=cache:mBTWPMwosrkJ:https://example.com/ebooks/title/+&cd=1&hl=en&ct=clnk&gl=us as referer. (The obfuscated part is in fact the page the supporting files belong to.)

Off the top of my head I believe a good chunk of 34/8 or 35/8 would be an example of that IP space.

I have some parts of both 34 and 35 marked as bad_range, meaning that it can be unset for certain distributed robots. But I don't think googleusercontent is from the specific ranges I block; at least I didn't see anyone getting a 403. The two requests I found in logs were both from ordinary human ISP ranges.
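If you want to pull these requests out of your own logs, something along these lines might do as a starting point. It is only a sketch: it assumes the referer looks like the sample above (`cache:` followed by an ID, a colon, and the page URL), and it uses the whole of 34.0.0.0/8 and 35.0.0.0/8 purely as illustration, since only parts of those ranges were mentioned as blocked. The function names and `WATCHED_RANGES` are hypothetical.

```python
import ipaddress
from urllib.parse import parse_qs, urlsplit

# Illustration only: the thread mentions "some parts" of 34/8 and 35/8,
# not the full blocks. Substitute the specific ranges you actually care about.
WATCHED_RANGES = [
    ipaddress.ip_network("34.0.0.0/8"),
    ipaddress.ip_network("35.0.0.0/8"),
]


def cached_page_from_referer(referer):
    """Return the page URL embedded in a Google cache referer, or None.

    A bare https://webcache.googleusercontent.com/ referer carries no
    page URL, so it yields None.
    """
    parts = urlsplit(referer)
    if parts.hostname != "webcache.googleusercontent.com":
        return None
    # parse_qs decodes the trailing "+" to a space; strip() removes it.
    q = parse_qs(parts.query).get("q", [""])[0].strip()
    if not q.startswith("cache:"):
        return None
    fields = q.split(":", 2)  # ["cache", "<id>", "<page url>"]
    if len(fields) < 3 or not fields[2].strip():
        return None
    return fields[2].strip()


def in_watched_range(ip):
    """Return True if the address falls inside one of WATCHED_RANGES."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in WATCHED_RANGES)


SAMPLE_REFERER = (
    "https://webcache.googleusercontent.com/search?q=cache:mBTWPMwosrkJ:"
    "https://example.com/ebooks/title/+&cd=1&hl=en&ct=clnk&gl=us"
)

print(cached_page_from_referer(SAMPLE_REFERER))
print(cached_page_from_referer("https://webcache.googleusercontent.com/"))
print(in_watched_range("34.1.2.3"))
```

Running both checks on each log line shows whether the cache-referred requests come from the cloud ranges or, as in the two cases found here, from ordinary ISP space.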