shaunm - 7:47 am on Sep 6, 2012 (gmt 0)
And why do I need to keep them off my pages/files when they can simply ignore robots.txt and still index those pages/files in their SERPs via external or internal links pointing to them?
You use robots.txt to keep Google from crawling your pages. It stops them reading the content; it doesn't stop a URL from appearing in the index if other pages link to it. That's it.
Real-world reasons for employing it include, but are not limited to:
- Preserving Crawl budget (CSS files might not need crawling)
- Blocking file directories (/images/)
- Creating bad spider lists (block a directory, link to it in a hidden link, ban anything that finds its way there)
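The three uses above can be sketched in a single robots.txt. This is just an illustrative example; the directory names (/css/, /images/, /trap/) are placeholders, not paths from this thread:

```
# robots.txt at the site root
User-agent: *
Disallow: /css/     # preserve crawl budget: stylesheets don't need crawling
Disallow: /images/  # block a whole file directory
Disallow: /trap/    # honeypot: linked only via a hidden link; any client
                    # that requests URLs under it is ignoring robots.txt
                    # and can be added to a ban list
```

Note that Disallow is advisory: well-behaved crawlers honor it, which is exactly why the /trap/ trick works for spotting the ones that don't.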
I know I got it wrong, but why am I still not getting the context?