For example, if you have changed some file-naming conventions, you can use this technique to steer your traffic from the old names to the new ones. Meanwhile, you change your links to use the new names. Then you hope that someday the bots will get tired of the old names and start asking for the new names directly. (Yeah, sure, don't hold your breath. In my experience it will take years.)
By the way, I once used a doorway like this, written in C, to redirect my entire site. (I sold one domain and needed the buyer to redirect my specific filenames to my other domain, so I stipulated in the contract that they run my program for a few months.) Written in C, it didn't even make a dent in the server load when I ran tests on it.
1. Delete your robots.txt
2. Put an ErrorDocument 404 directive in your .htaccess that hands the request to a script.
3. The script picks up the REDIRECT_URL from the environment table.
4. If the request was for robots.txt, consult your black list or white list (step 5 covers what to send). If it wasn't, issue your usual custom 404 page blurb. You will probably have to issue a Status: 404 Not Found header on the line just before the Content-type: text/html header, followed by a blank line; only then send your blurb. Check all your headers after you've finished; your mileage may vary with the headers sent out by different versions of Apache, etc.
5. If it was for robots.txt, issue a Content-type: text/plain header plus a blank line, followed by the robots.txt lines you want that particular bot to see. Apache probably sends the Status: 200 by itself in this case. As far as I can tell, if it's done properly (be sure to compare and contrast your headers when you've finished!), there is no clue in the headers that your robots.txt was not a static file.
If you are flooded by bots from Korea or Japan or China that actually look for robots.txt, and you know that you never get any decent traffic from those places anyway, then I can see the value of doing this. But it also discriminates against good, little bots that are trying to do the right thing, and therefore it simply perpetuates the monopoly of the big bots that are on your white list. Is there any such thing as a "good, little bot" these days, or am I hallucinating because it's too late for the good guys?
Question: How many of the dudes who use personal bots are in the habit of asking for robots.txt? I guess the better way of asking this question is, "Of all the personal bots available, how many of them check robots.txt by default?" I know you can defeat this check on almost all of them, but I'm curious about the default settings.