User-agent: Googlebot
User-agent: Slurp
User-agent: Teoma
User-agent: twiceler
Disallow: /cgi-bin

User-agent: *
Disallow: /
However, Twiceler apparently could not parse a record with multiple User-agent lines, and as a result believed that it was governed by the second policy record (User-agent: * / Disallow: /) in the example above.
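For reference, this is how a conforming parser should treat that file. The sketch below uses Python's standard urllib.robotparser (my choice of tool here, not anything Twiceler itself used) to show that all four named bots share the single Disallow: /cgi-bin rule, while everyone else falls under the catch-all record:

```python
import urllib.robotparser

# The robots.txt policy from the post, reproduced as a string.
ROBOTS_TXT = """\
User-agent: Googlebot
User-agent: Slurp
User-agent: Teoma
User-agent: twiceler
Disallow: /cgi-bin

User-agent: *
Disallow: /
"""

parser = urllib.robotparser.RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A correct parser groups all four User-agent lines into one record,
# so twiceler is allowed everywhere except /cgi-bin ...
print(parser.can_fetch("twiceler", "/some/page.html"))   # True
print(parser.can_fetch("twiceler", "/cgi-bin/script"))   # False

# ... while agents matching only the catch-all record are shut out entirely,
# which is what Twiceler mistakenly concluded about itself.
print(parser.can_fetch("SomeOtherBot", "/some/page.html"))  # False
```

A parser with Twiceler's old bug would behave like the last line for its own user-agent, too.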
Twiceler seems to have started correctly parsing multiple-user-agent policy records as of November 18th, 2009: on that date it stopped simply 'going away' after fetching robots.txt (which allows it).
No change to the Twiceler user-agent string ("Mozilla/5.0 (Twiceler-0.9 http://www.cuil.com/twiceler/robot.html)") was apparent when this behavior change was noted.
It appears that the change was backed out on December 22nd, 2009, as Twiceler reverted to its former "fetch robots.txt and leave" behavior; but then on January 22nd, 2010, it began fetching pages from my site again, and some of those pages started to appear in the Cuil.com search results (albeit with the common mis-ascribed "sample images").
I'm not sure what they're up to over there, but I take this as a welcome improvement and a sign of life at Cuil -- at least they're evidently working on their infrastructure.
Jim