enigma1 - 3:12 pm on Apr 26, 2011 (gmt 0)
Once googlebot finds a non-canonical version of a url, it will continue to crawl it forever
Only if it finds it within your site. And that points back to the application that is somehow generating it.
There should be no problem with incoming URLs, even if they include sessions, trackers or any other parameters, simply because you never regenerate these parameters and expose them in the HTML. That's my point.
If a user wants to rearrange the parameters, it makes no difference, because nothing exposed in your site's pages constitutes a duplicate link or duplicate content. You should always parse only the parameters your scripts are aware of and ignore the rest.
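To illustrate the "parse only what you know" idea, here is a rough sketch in Python. The whitelist KNOWN_PARAMS, the function name canonical_url and the example.com URL are all made up for the example; your own scripts will have their own parameter names. The point is only that links written into the HTML are rebuilt from known parameters in a fixed order, so sessions, trackers and reordered parameters on an incoming request never get echoed back.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode, urlunsplit

# Hypothetical whitelist: the only parameters this imaginary script understands.
KNOWN_PARAMS = ("category", "page")

def canonical_url(url):
    """Rebuild a URL from known parameters only, always in the same order."""
    parts = urlsplit(url)
    supplied = dict(parse_qsl(parts.query))
    # Unknown parameters (session ids, trackers, ...) are silently dropped.
    kept = [(name, supplied[name]) for name in KNOWN_PARAMS if name in supplied]
    # The fragment is discarded as well; it never reaches the server anyway.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

if __name__ == "__main__":
    print(canonical_url("http://example.com/list.php?page=2&sid=abc123&category=shoes"))
```

If every link the application emits goes through something like this, the order or extras in the incoming request stop mattering.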
If parts of malformed links somehow propagate into the normal links the application generates, then again it's a problem with the application and you need to get to the root of it. robots.txt and rel adjustments won't fix it.
A duplicate problem may exist because the domain itself creates it.
External factors in this case won't change it. If you make a mistake and, say, now want to change one page's name to another, issue a 301 on requests for the old page. It will rectify the problem after a bit, because again the old link is no longer present in your site.
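The rename case could look something like the sketch below. It is framework-neutral Python: the RENAMED map, the respond function and the page names are invented for illustration, and in practice the same thing is often done in the web server configuration instead.

```python
# Hypothetical map of renamed pages: old path -> new path.
RENAMED = {"/old-page.html": "/new-page.html"}

def respond(path):
    """Return (status, headers) for a requested path."""
    target = RENAMED.get(path)
    if target is not None:
        # Permanent redirect: tells crawlers the old name is gone for good.
        return 301, {"Location": target}
    # Anything else is served normally in this sketch.
    return 200, {}

if __name__ == "__main__":
    print(respond("/old-page.html"))
```

The redirect only cleans things up if the site's own pages link to the new name from then on.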