Msg#: 4307351 posted 3:00 pm on May 3, 2011 (gmt 0)
One entry in my robots.txt is shown below, yet today two Google IP addresses (22.214.171.124, 126.96.36.199) visited [u]only[/u] two files in a directory named /jscript/.
There was no Google-related U-A in the log entry (Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:188.8.131.52) Gecko/20060909 Firefox/184.108.40.206), but shouldn't Google be adhering to the below rules regardless of the IP or U-A that they use?
Msg#: 4307351 posted 5:44 pm on May 3, 2011 (gmt 0)
[quote]but shouldn't Google be adhering to the below rules regardless of the IP or U-A that they use?[/quote]
No, actually, and you cannot tell whether it was a human or a bot just because the IP is allocated to Google. Robots.txt rules are only "guidelines": nothing on your server enforces them, and there are ways to get even the popular spiders to go through restricted folders and scripts. There are also various Google services that regular visitors can use to retrieve content from your site (e.g. translation tools), and those requests can even be automated.
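To show what "guidelines" means in practice, here is a rough sketch using Python's standard urllib.robotparser. The rule, the example.com URLs and the user-agent names are made up for illustration (a plain Disallow on /jscript/ is assumed, not the poster's actual entry). The point is that the check runs on the client side: a well-behaved crawler asks the question before fetching, and anything that skips the question can request the file anyway.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule for illustration only; not the original poster's entry.
ROBOTS_TXT = """\
User-agent: *
Disallow: /jscript/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_fetch(user_agent: str, url: str) -> bool:
    """Return what a compliant client would decide for this URL.

    Nothing here blocks a request; a client that never calls this
    function is not restricted by robots.txt in any way.
    """
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/jscript/menu.js",
                "http://example.com/index.html"):
        print(url, may_fetch("Googlebot", url))
```

So the file only tells polite clients what you would like them to do. A translation proxy or a browser running on a Google-owned IP never consults it.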
One way to get around it, to a certain extent, is to set up a cookie and check it on the server end, using a redirect or something along those lines. If no cookie is present, don't allow access to these scripts. If they're .js files, you may have to wrap them in a server-side script that checks the cookie value before sending the file.
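A minimal sketch of that wrapper idea, written as a plain WSGI app with only the Python standard library. The cookie name and value, the jscript directory and the function names are all placeholders I picked for the example, and it assumes your normal pages set the cookie before the browser asks for the scripts. Treat it as a starting point, not a finished access control.

```python
from http.cookies import CookieError, SimpleCookie
from pathlib import Path

# Placeholder names (assumptions for this sketch, not from the poster's site).
COOKIE_NAME = "seen_page"
COOKIE_VALUE = "1"
SCRIPT_DIR = Path("jscript")


def has_gate_cookie(cookie_header: str) -> bool:
    """True when the Cookie header carries the expected name and value."""
    jar = SimpleCookie()
    try:
        jar.load(cookie_header or "")
    except CookieError:
        return False  # treat a malformed header as "no cookie"
    morsel = jar.get(COOKIE_NAME)
    return morsel is not None and morsel.value == COOKIE_VALUE


def app(environ, start_response):
    """WSGI wrapper: hand out a .js file only if the gate cookie is present."""
    if not has_gate_cookie(environ.get("HTTP_COOKIE", "")):
        start_response("403 Forbidden", [("Content-Type", "text/plain")])
        return [b"Forbidden"]

    # Keep only the file name so "../" tricks cannot leave SCRIPT_DIR.
    name = Path(environ.get("PATH_INFO", "")).name
    target = SCRIPT_DIR / name
    if not name.endswith(".js") or not target.is_file():
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not found"]

    start_response("200 OK", [("Content-Type", "application/javascript")])
    return [target.read_bytes()]
```

Keep in mind this only turns away clients that don't carry cookies. Anything that keeps the cookie from an earlier page view (a real browser, or a bot that handles cookies) will still get the files.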