Google, robots, disallow


enigma1 - 5:44 pm on May 3, 2011 (gmt 0)


"but shouldn't Google be adhering to the below rules regardless of the IP or U-A that they use?"

No, actually, and you cannot tell whether it was a human or a bot just because the IP is allocated to Google. Robots.txt rules are "guidelines", and there are ways to force even the popular spiders to go through restricted folders and scripts. There are also various Google services that regular visitors can use to retrieve content from your site (e.g. translation tools), and those can even be automated.

One way to get around it, to a certain extent, is to set up a cookie and check it on the server end, with a redirect or something along those lines. If no cookie is present, don't allow access to these scripts. If they're JS files, you may have to wrap them with a server script that checks the cookie value.
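
A rough sketch of the cookie check, written as a small Python WSGI app purely for illustration (the thread does not name a server language). The cookie name `site_visit`, its value, and the `/scripts/` folder are made-up placeholders, and a plain 403 is used here where a redirect would work just as well. Ordinary pages set the cookie; the wrapped scripts are only served when it comes back.

```python
from http.cookies import SimpleCookie

COOKIE_NAME = "site_visit"       # hypothetical cookie name
COOKIE_VALUE = "1"               # hypothetical expected value
PROTECTED_PREFIX = "/scripts/"   # hypothetical restricted folder


def has_cookie(environ):
    """Return True if the request carries the expected cookie."""
    cookie = SimpleCookie(environ.get("HTTP_COOKIE", ""))
    morsel = cookie.get(COOKIE_NAME)
    return morsel is not None and morsel.value == COOKIE_VALUE


def app(environ, start_response):
    path = environ.get("PATH_INFO", "/")

    if path.startswith(PROTECTED_PREFIX):
        # Wrapper around the restricted scripts: no cookie, no access.
        if not has_cookie(environ):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        start_response("200 OK", [("Content-Type", "application/javascript")])
        return [b"/* protected script body goes here */\n"]

    # Any ordinary page sets the cookie, so real visitors pick it up
    # before their browser requests the protected scripts.
    start_response("200 OK", [
        ("Content-Type", "text/html"),
        ("Set-Cookie", f"{COOKIE_NAME}={COOKIE_VALUE}; Path=/; HttpOnly"),
    ])
    return [b"<html><body>Welcome</body></html>"]
```

To try it locally, `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` will serve it. This only works "to a certain extent", as noted: any client that stores and returns cookies will still get through.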


Thread source: http://www.webmasterworld.com/robots_txt/4307351.htm