And Now Google's Doing It. JS Stats Show GoogleBot


TheMadScientist - 12:05 am on May 15, 2011 (gmt 0)


There are worse bots to worry about - ones that offer no potential benefits.

I wouldn't have noticed or even cared if they weren't totally messing up my stats ... It's not that I care about the page in the index, or visits to the page; what I care about is the stats having some semblance of accuracy, and I would rather not have to 'code around' companies that CLAIM to run compliant bots, and do SEEM to when they put 'bot' in the UA string, but otherwise just do what suits them.

If they had just come out and said, 'Hey, GoogleBot is compliant, but not all of our bots are (what is a preview fetcher if it's not a bot?), so make sure you account for them in your stat keeping,' that would be one thing. But they don't even send an X-Forwarded-For header to indicate it's not them specifically making the request. For all I know, they may be 'pre-fetching' the page whenever someone with previews on visits a SERP that includes the page, or even when someone clicks on a preview above or below it and their data shows that person is 'more likely' to click on the one they fetched, so they get it early ...

IDK how they decide which page to fetch or when, and I really don't want to take the time to research it and find out, because I have plenty of other things I could be doing. I don't even really know how to count a 'preview' except as a preview, and who knows WTF that means, except that Google decided they wanted the information on the page for a preview. That could mean it's used right away, or cached and used in the future, or kept for internal use only if no one actually clicks on the preview.

Either way, they pre-fetch on some occasions, and there are a HUGE number of 'non-visits' in the JS stats now. So I have to rework either the stat system or the information displayed to work around their interpretation of what defines a bot, the robots exclusion protocol, and web standards compliance, which is obviously a bit different than mine...
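One way the 'rework the stat system' option might look is to tag each hit before counting it, so previews end up in their own bucket instead of inflating visits. This is only a rough sketch: the 'Google Web Preview' token is an assumption about what the fetcher puts in its UA string, and `classify_hit` and `tally` are hypothetical helper names, not part of any real stats package.

```python
from collections import Counter

# Assumption: the preview fetcher identifies itself with this UA token.
PREVIEW_TOKENS = ("google web preview",)
# Generic tokens that self-declared bots tend to include in the UA string.
BOT_TOKENS = ("bot", "crawler", "spider")


def classify_hit(user_agent):
    """Return 'preview', 'bot', or 'visit' for one hit's User-Agent string."""
    ua = (user_agent or "").lower()
    # Check preview tokens first so a preview is never counted as a visit.
    if any(token in ua for token in PREVIEW_TOKENS):
        return "preview"
    if any(token in ua for token in BOT_TOKENS):
        return "bot"
    return "visit"


def tally(user_agents):
    """Count hits per bucket so previews can be reported separately."""
    return Counter(classify_hit(ua) for ua in user_agents)
```

Keeping 'preview' as its own bucket avoids having to decide whether a preview is a visit or a bot hit; it is simply reported as a preview.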

I EXPECT rogue bots to do stuff like this, but not companies like Google, where they are actually fairly honest (brazen? lol) about what they're doing and actually seem to follow standards and protocol for the most part ... What got me is the 'WTF are they doing in here?' when IMO I should not have to worry about them being in there.

ADDED:
And why would it reverse to a GoogleBot IP Address if it's not a bot?
And either way, why not give it its own range of IPs and call it what it is for a reverse lookup?
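
For anyone who wants to see what the reverse lookup check amounts to, here is a minimal sketch of a forward-confirmed reverse DNS test using Python's standard `socket` module. The hostname suffixes are an assumption based on Google's published guidance for verifying its crawler, and `verify_google_ip` is a hypothetical helper name. Note that, per the above, a pass only says the IP belongs to Google; it would not tell a preview fetch apart from a crawl.

```python
import socket

# Assumption: genuine Google crawler IPs reverse-resolve to these domains.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_google_host(hostname):
    """True if the PTR hostname ends in one of the assumed Google domains."""
    return hostname.rstrip(".").lower().endswith(GOOGLE_SUFFIXES)


def verify_google_ip(ip, reverse=socket.gethostbyaddr,
                     forward=socket.gethostbyname_ex):
    """Forward-confirmed reverse DNS: IP -> hostname -> IPs must include ip.

    The resolver functions are parameters so the check can be exercised
    without touching the network.
    """
    try:
        hostname = reverse(ip)[0]          # PTR lookup
    except OSError:                        # covers socket.herror / gaierror
        return False
    if not is_google_host(hostname):
        return False
    try:
        addresses = forward(hostname)[2]   # forward lookup of that hostname
    except OSError:
        return False
    # The reverse name only counts if it resolves back to the same IP.
    return ip in addresses
```

Calling `verify_google_ip(remote_addr)` on the address from a logged hit performs two DNS lookups, so in practice the result would probably want caching per IP.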


Thread source: http://www.webmasterworld.com/search_engine_spiders/4312058.htm