simonlondon - 3:32 pm on Sep 2, 2013 (gmt 0)
There are no rules for this. Sometimes this is how it happens, sometimes not. I would guess it depends on other external factors (perhaps on links pointing to the page, etc.).
Otherwise, disallowing the site in robots.txt would have no effect and the site would continue to rank rather than being dropped from the index like a stone (a very recent experience of mine).
Further, if the above were the standard behaviour, it would be heaven for spammers - just create a page, let it get indexed and ranked, then disallow it in robots.txt, swap in spammy content, and watch it continue to rank for the old content.
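For anyone following along, the "disallow it in robots.txt" step above is just the standard Disallow directive - something like this (the /old-page.html path is only a made-up example). Note this only stops crawling; it doesn't by itself remove the URL from the index, which is exactly why the scenario above is worth worrying about:

User-agent: *
Disallow: /old-page.html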
I think this would be closer to what happens - blocking a previously indexed page via robots.txt may or may not result in that page remaining in the index, and it may or may not rank equally well after being blocked (or it may drop like a stone).
This is very interesting and I have never actually thought about it from this angle. I suppose you are quite right; once the page is disallowed, Google has no idea what's on it, and essentially the page could even be cloaking.
There must be a way for Google to include some kind of metric, such as the length of time from the cache/last-crawl date to the current date. As that interval grows, the page becomes gradually less able to rank. I don't know, because I haven't run any tests on this.
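Just to illustrate the kind of metric I mean (pure speculation on my part, nothing Google has ever confirmed), a staleness decay could look something like this in Python, where days_since_last_crawl and the 90-day half-life are made-up inputs:

def staleness_factor(days_since_last_crawl, half_life_days=90.0):
    # Hypothetical multiplier: 1.0 for a freshly crawled page,
    # halving for every half_life_days the page stays blocked/uncrawled.
    return 0.5 ** (days_since_last_crawl / half_life_days)

# Example: a page blocked for 180 days keeps ~25% of its original weight.
print(staleness_factor(180))  # ~0.25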
But thanks for the comment, because with this theory in mind I'm certainly not going to make the same claim again.