They need an info page that tells site owners and webmasters: 1) who they are, 2) what they're after on our sites, 3) what they will do with the data they retrieve, and 4) why we should allow them to take our property. How does it benefit us?
lucy24
7:56 pm on Aug 28, 2017 (gmt 0)
To refresh my memory: In these posts, "robots.txt" means only that they asked for it, not necessarily that they honor its contents, or even ask before their first page request?
keyplyr
8:16 pm on Aug 28, 2017 (gmt 0)
Yes, robots.txt was requested, period.
As far as User Agent Documentation goes, IMO anything further would get into vague scenarios and interpretations. Whether the bot followed the intent of the robots.txt directives would have to be validated case by case.
This is why I personally think robots.txt is outdated. The web has moved on, and most agents have no use for it since it doesn't apply to them. It was never a standard, despite the efforts to make it one.
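For anyone who wants to run that kind of case-by-case check against their own logs, here is a rough Python sketch. It assumes you have already pulled one bot's requested paths out of your access log, in order. The function name `audit_bot`, the sample rules, and the sample paths are all made up for illustration; only `urllib.robotparser` comes from the standard library. It separates three questions: was robots.txt requested at all, was it requested before the first page, and were any disallowed paths fetched anyway.

```python
from urllib.robotparser import RobotFileParser


def audit_bot(robots_txt, user_agent, requested_paths):
    """Compare one bot's requests (in log order) against robots.txt rules.

    audit_bot is a hypothetical helper, not part of any library.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    # "Requested" only means the file was asked for at some point.
    asked_at_all = "/robots.txt" in requested_paths
    # Did the bot ask for robots.txt before its first page request?
    asked_first = bool(requested_paths) and requested_paths[0] == "/robots.txt"
    # Paths the bot fetched even though the rules disallow them.
    violations = [
        path
        for path in requested_paths
        if path != "/robots.txt" and not parser.can_fetch(user_agent, path)
    ]
    return {
        "asked_at_all": asked_at_all,
        "asked_first": asked_first,
        "violations": violations,
    }


# Made-up sample data: a bot that asks for robots.txt late and then ignores it.
sample_rules = "User-agent: *\nDisallow: /private/\n"
sample_paths = ["/index.html", "/robots.txt", "/private/a.html"]
report = audit_bot(sample_rules, "ExampleBot", sample_paths)
print(report)
```

A bot can come out of this with "asked_at_all" true and still have violations listed, which is exactly the difference between requesting robots.txt and honoring it.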