We are using an Oracle portal implementation, so things are a little different. We use Google as an internal search engine for our site search. It works really well, except for one thing: if the keyword appears in the side nav bar, then all pages carrying the nav bar are returned. For example, a search for "jobs" will return all the correct listings PLUS other articles that happen to contain the nav bar, because there is an entry for "jobs" in the nav bar. I wanted to include the meta tag, but the nav bar code is a separate piece of HTML that bolts onto the template, so the </head> has already passed by the time the nav bar code starts. From what I have read, the robots.txt file is unreliable. Any advice would be gratefully received.
Well, this might be a legitimate reason to use cloaking:
Detect the HTTP User-Agent header, and if it's Googlebot, do not include the nav bar. You'll also want to add the robots "noarchive" meta tag so these modified pages don't appear in Google's cache.
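A minimal sketch of that idea, assuming a simple server-side templating step where you control page assembly (the function names here are hypothetical, not part of the Oracle portal API):

```python
# Hypothetical sketch: omit the nav bar and add the noarchive robots
# meta tag when the request appears to come from Googlebot.

def is_googlebot(user_agent):
    """Return True if the User-Agent string looks like Googlebot."""
    return "googlebot" in (user_agent or "").lower()

def render_page(body_html, nav_html, user_agent):
    """Assemble the page; for Googlebot, drop the nav bar so its
    keywords aren't indexed, and disable the cached copy."""
    if is_googlebot(user_agent):
        head_extra = '<meta name="robots" content="noarchive">'
        nav = ""  # nav bar omitted for the crawler
    else:
        head_extra = ""
        nav = nav_html
    return "<html><head>%s</head><body>%s%s</body></html>" % (
        head_extra, nav, body_html)
```

Obviously you'd hook this into wherever your template bolts the nav bar on; the point is just that the branch happens per-request, based on the User-Agent.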
It also might be a good idea to leave an HTML comment in each page briefly explaining why the page served to Google differs from what a user sees, in case you're in a competitive market where a competitor might report you for cloaking... Just a thought.
Robots.txt won't be of much help here, for the reasons you cite and others. Robots.txt works at the URL level, not at the "file" level, so disallowing the include files that get merged into each page on the server side accomplishes nothing.
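To illustrate (paths here are made up): a rule like the one below only blocks the include file's own URL from being fetched directly. It has no effect on the nav bar markup once it's been merged into each page server-side, because Google only ever sees the assembled pages.

```
User-agent: Googlebot
Disallow: /includes/navbar.html
```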