phranque - 1:09 am on Jul 24, 2012 (gmt 0)
the best way to stop a bot from crawling AND indexing a requested resource is to respond with a 401 Unauthorized status code.
10.4.2 401 Unauthorized
The request requires user authentication. The response MUST include a WWW-Authenticate header field (section 14.47) containing a challenge applicable to the requested resource. The client MAY repeat the request with a suitable Authorization header field (section 14.8). If the request already included Authorization credentials, then the 401 response indicates that authorization has been refused for those credentials. If the 401 response contains the same challenge as the prior response, and the user agent has already attempted authentication at least once, then the user SHOULD be presented the entity that was given in the response, since that entity might include relevant diagnostic information. HTTP access authentication is explained in "HTTP Authentication: Basic and Digest Access Authentication".
HTTP Authentication: Basic and Digest Access Authentication (RFC 2617)
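to make the quoted exchange concrete, here is a minimal sketch in Python using only the standard library. it is an illustration, not a production setup: the realm name, the credentials and the port are hypothetical values picked for the example, and a real site would normally let the web server (Apache, IIS) handle this instead. a request without the expected Authorization header gets a 401 plus a WWW-Authenticate challenge, so a crawler never receives the protected content.

```python
import base64
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical realm and credentials, for illustration only.
REALM = "private"
EXPECTED = "Basic " + base64.b64encode(b"user:secret").decode("ascii")


class AuthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.headers.get("Authorization") != EXPECTED:
            # No credentials or wrong credentials: answer 401 and include
            # the WWW-Authenticate challenge that the spec requires.
            body = b"Authentication required.\n"
            self.send_response(401)
            self.send_header("WWW-Authenticate", f'Basic realm="{REALM}"')
        else:
            body = b"Protected content.\n"
            self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet; remove to see request logs


if __name__ == "__main__":
    # Hypothetical address and port for local testing.
    HTTPServer(("127.0.0.1", 8000), AuthHandler).serve_forever()
```

a bot that requests any URL from this server without credentials only ever sees the 401 response and the short diagnostic body. note that Basic credentials are only base64-encoded, not encrypted, so this should sit behind HTTPS in practice.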
here's the Authentication and Authorization How-To for the Apache HTTP Server:
and the Windows Server documentation to Configure Basic Authentication (IIS 7):
and the IIS configuration reference for Basic Authentication <basicAuthentication>: