To Bot Runners
If you are using a robot, bot, spider, crawler, etc. to gather data from websites, it is highly recommended (and proper netiquette) to tell the site owner who is crawling and why.
An acceptable User Agent (UA) string identifies your bot and includes a link to your robot info page.
To verify ownership, this info page should be hosted on your own domain, the same domain where you do business.
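As a minimal sketch, a UA string carrying an info-page link could be built and attached to a request as follows. The bot name ExampleBot, its version, and the URL https://www.example.com/bot.html are hypothetical placeholders, and the "Mozilla/5.0 (compatible; ...; +URL)" layout is a convention many crawlers follow, not a requirement stated here.

```python
from urllib.request import Request

# Hypothetical values, for illustration only.
BOT_NAME = "ExampleBot"
BOT_VERSION = "1.0"
INFO_URL = "https://www.example.com/bot.html"


def build_user_agent(name: str, version: str, info_url: str) -> str:
    """Return a UA string that names the bot and links to its info page."""
    # The "+URL" form is a widely used convention, assumed here.
    return f"Mozilla/5.0 (compatible; {name}/{version}; +{info_url})"


USER_AGENT = build_user_agent(BOT_NAME, BOT_VERSION, INFO_URL)

# Attach the UA to an outgoing request (built here, but not sent).
request = Request("https://www.example.org/", headers={"User-Agent": USER_AGENT})
```

Keeping the info URL on the same domain as the business lets a site owner tie the bot to its operator at a glance.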
The robot info page should explicitly describe:
• Who you are
• Whether your bot supports robots.txt
• Which UA should be used in robots.txt
• Your verified crawl IP range(s)
• What data you want from our websites
• What you will do with the data you retrieve
• Why we should allow you access to our property, and how it will benefit the content owner
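To illustrate the robots.txt items in the list above, here is a minimal sketch of a bot checking permissions under its published UA token, using Python's standard urllib.robotparser. The token ExampleBot, the robots.txt content, and the URLs are all hypothetical; a real bot would fetch the site's actual robots.txt instead of parsing an inline string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a site owner might publish.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /private/

User-agent: *
Disallow: /
"""

# Hypothetical UA token, as it would be listed on the robot info page.
UA_TOKEN = "ExampleBot"


def make_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without touching the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = make_parser(ROBOTS_TXT)

# ExampleBot may fetch public pages but not /private/.
allowed_public = parser.can_fetch(UA_TOKEN, "https://www.example.org/articles/1")
allowed_private = parser.can_fetch(UA_TOKEN, "https://www.example.org/private/x")

# A bot with no matching entry falls under the "*" rules and is denied.
allowed_other = parser.can_fetch("OtherBot", "https://www.example.org/articles/1")
```

Publishing the exact token matters: a site owner can only write a rule for your bot if they know which name it answers to.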
If these things are not clearly presented to site owners, you risk being blocked by default.
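On the site owner's side, a published crawl IP range makes it possible to verify a visitor that claims to be your bot before deciding whether to block it. The following is only a sketch under assumptions: the ranges are documentation-reserved example networks, and the function name is hypothetical.

```python
import ipaddress

# Hypothetical published crawl ranges (documentation-reserved networks).
PUBLISHED_RANGES = ["192.0.2.0/24", "2001:db8::/32"]


def ip_in_published_ranges(ip: str, ranges: list[str]) -> bool:
    """Return True if the address falls inside any published range."""
    address = ipaddress.ip_address(ip)
    networks = (ipaddress.ip_network(r) for r in ranges)
    # An IPv4 address is never "in" an IPv6 network (and vice versa),
    # so a mixed list of ranges can be checked uniformly.
    return any(address in network for network in networks)
```

A request that presents the bot's UA from an address outside these ranges can then be treated as unverified.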