Hi All,
Last summer I got beat up a bit (and rightly so) in this thread because my robot wasn't yet checking the robots.txt file.
[webmasterworld.com...]
I shelved the project for a while; it's back now, and my crawler checks robots.txt for permission before fetching anything.
I want to make robots.txt visits separate from the crawl. What's the norm for how long I can appropriately cache the permissions found in a site's robots.txt?
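For what it's worth, the common convention is to cache a site's robots.txt for up to 24 hours (that's the ceiling suggested by RFC 9309 when no Cache-Control headers say otherwise). Here's a minimal sketch of how that separation could look, using Python's standard `urllib.robotparser` with a TTL cache; the `RobotsCache` class and the injected `fetch` callback are just illustrative names, not anything standard:

```python
import time
import urllib.robotparser

# 24 hours is the conventional ceiling for caching robots.txt
ROBOTS_TTL = 24 * 60 * 60  # seconds

class RobotsCache:
    """Caches the parsed robots.txt per host, refetching after the TTL."""

    def __init__(self, fetch, ttl=ROBOTS_TTL):
        self.fetch = fetch    # fetch(host) -> robots.txt body as a string
        self.ttl = ttl
        self._cache = {}      # host -> (fetched_at, parser)

    def allowed(self, host, path, user_agent):
        entry = self._cache.get(host)
        if entry is None or time.time() - entry[0] > self.ttl:
            # Cache miss or stale entry: refetch and reparse robots.txt
            parser = urllib.robotparser.RobotFileParser()
            parser.parse(self.fetch(host).splitlines())
            entry = (time.time(), parser)
            self._cache[host] = entry
        return entry[1].can_fetch(user_agent, path)
```

Injecting the fetch function keeps the robots.txt visits decoupled from the crawl loop, so they can be rate-limited or scheduled independently.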
thanks - jeff