What follows is a "back to the basics" on getting good rankings.
"Having a robots.txt to include the pages that you want the search engines to include"
My understanding is that you can ban unwanted bots, and any bots you don't even mention will crawl anyway. What I read there is different: "include those you want to crawl your site". Is that really necessary?
Note also that regular expressions are not supported in either the User-agent or Disallow lines. The '*' in the User-agent field is a special value meaning "any robot". Specifically, you cannot have lines like "Disallow: /tmp/*" or "Disallow: *.gif".
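To make the point above concrete, here is a minimal robots.txt that is valid under the original exclusion standard. Note that each Disallow line is a plain path prefix, not a pattern (the paths shown are just examples):

```
# Applies to every robot
User-agent: *
Disallow: /tmp/      # blocks /tmp/ and everything under it (prefix match)
Disallow: /cgi-bin/

# A per-bot record overrides the '*' record for that bot
User-agent: BadBot
Disallow: /
```

A robot matching "BadBot" uses only its own record, so it is barred from the whole site; everyone else only skips /tmp/ and /cgi-bin/.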
robots.txt is useful if you don't want bots requesting certain URLs. Reasons for banning bots from certain resources include:
1) They would use up too much bandwidth, e.g. images
2) They would cause problems in your logging, tracking or counting of users
3) They serve content you don't want indexed (Note: robots.txt is not ideal for this use, as search engines can still index the URL itself)
4) They would trigger events that you don't want (e.g. CGI script calls, shopping baskets, etc.)
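For the cases above, here is a sketch of how a well-behaved bot actually evaluates those rules, using Python's standard-library robots.txt parser (the blocked paths are made-up examples):

```python
from urllib.robotparser import RobotFileParser

# Example rules blocking the kinds of resources listed above:
# images (bandwidth) and CGI scripts (unwanted side effects)
rules = """
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)  # normally rp.set_url(...) + rp.read() against a live site

# Disallow lines are prefix matches, so anything under /cgi-bin/ is blocked
print(rp.can_fetch("*", "/cgi-bin/search"))  # False
print(rp.can_fetch("*", "/about.html"))      # True
```

Note this only works for bots that choose to honor robots.txt; it is a politeness convention, not access control, which is why point 3 warns it can't keep a URL out of the index.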
In my opinion, the only use I see for this robots.txt is if you are an online drug dealer or firearm dealer or whatever in this area, and you don't need to show up on Google because you have your own buyer network - in this case robots.txt is pure gold.