-- Search Engine Spider and User Agent Identification
---- Stopping scrapers from the get-go
encyclo - 2:31 am on Feb 16, 2011 (gmt 0)
What's the best way to tell bots that a page hasn't changed, so there's no need to re-crawl it? ETags? I think that requires changing the response headers, and that's tough to do with static HTML pages.
I'll leave the other items for others to answer, but if you are dealing with static content, Apache should already be set up to handle this automatically: it assigns ETags and Last-Modified dates to static files and answers conditional requests with 304 Not Modified. You can also use .htaccess directives (for example via mod_expires) to define more specific expiry times if needed.
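To make the 304 mechanism a bit more concrete, here is a rough Python sketch of the kind of check a server performs for a static file. It is only an illustration, not Apache's actual code: the function names `make_validators` and `is_not_modified` are made up for this example, and the ETag format (size plus modification time) is an assumption rather than Apache's exact default.

```python
import os
from email.utils import formatdate, parsedate_to_datetime


def make_validators(path):
    """Build an ETag and a Last-Modified value from the file's metadata.

    Assumption: the ETag is derived from size and mtime. Real servers
    use their own formats.
    """
    st = os.stat(path)
    etag = '"%x-%x"' % (st.st_size, int(st.st_mtime))
    last_modified = formatdate(st.st_mtime, usegmt=True)
    return etag, last_modified


def _strip_weak(tag):
    # If-None-Match uses weak comparison, so a W/ prefix is ignored.
    return tag[2:] if tag.startswith("W/") else tag


def is_not_modified(request_headers, etag, last_modified):
    """Return True if the request can be answered with 304 Not Modified."""
    inm = request_headers.get("If-None-Match")
    if inm is not None:
        # If-None-Match takes precedence over If-Modified-Since.
        candidates = [_strip_weak(t.strip()) for t in inm.split(",")]
        return "*" in candidates or _strip_weak(etag) in candidates

    ims = request_headers.get("If-Modified-Since")
    if ims is not None:
        try:
            return parsedate_to_datetime(last_modified) <= parsedate_to_datetime(ims)
        except (TypeError, ValueError):
            # An unparseable date is ignored and the full page is sent.
            return False

    # No conditional headers: send the full response.
    return False
```

A bot that saved the ETag or Last-Modified value from its previous visit sends it back in If-None-Match or If-Modified-Since. If the file hasn't changed, the check returns True and the server can reply with an empty 304 instead of the whole page. With static files none of this has to be written by hand, since the server does it for you.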