---- page is noindexed, but still shows in SERP with a Google notice
aakk9999 - 12:51 am on Jun 30, 2013 (gmt 0)
The only way the Googlebot is accessing these URLs is from internal pages as we have yet to add rel=nofollow to them...
I think you might have meant: The only way Googlebot is finding out about these URLs is from internal pages, as we have yet to add rel=nofollow to them...
But I am a bit confused, perhaps you could clarify. Are you saying that:
1) you can see Googlebot requesting a URL that is itself blocked in robots.txt, OR
2) Googlebot is requesting a URL that exists only as an internal link on a page you have blocked in robots.txt (the link to which you plan to add rel="nofollow"), while the linked target page itself is not blocked in robots.txt?
Because if it is 1), then Googlebot is not behaving.
But if it is 2), then there could be other ways Googlebot found out about this URL.
Have you tried using "Fetch as Googlebot" in WMT for the offending URL (the one that is only linked internally, which you think Googlebot should not be requesting, but whose requests you are seeing in your logs)? Does Googlebot fetch it?
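To pin down exactly which requests you are seeing in the logs, something like the sketch below might help. It assumes your server writes the common "combined" access log format, and the names (LOG_LINE, googlebot_requests) and the sample path are made up for illustration. Note that it only matches on the user-agent string, which can be spoofed, so a hit is not proof that the request really came from Google.

```python
import re

# Assumed "combined" log format:
# host ident user [time] "METHOD path PROTO" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def googlebot_requests(log_lines, path_prefix):
    """Return (time, path, status) for requests whose user-agent claims
    to be Googlebot and whose path starts with path_prefix."""
    hits = []
    for line in log_lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip lines that are not in the assumed format
        if "Googlebot" in m.group("agent") and m.group("path").startswith(path_prefix):
            hits.append((m.group("time"), m.group("path"), m.group("status")))
    return hits
```

You would feed it the lines of your access log (for example `googlebot_requests(open("access.log"), "/private/")`, where both the file name and the path are placeholders) and compare the returned paths against what robots.txt disallows.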
Have you tried putting this URL into the WMT section "Health --> Blocked URLs" (under "Specify the URLs and user-agents to test against") to see whether Google thinks crawling is allowed or not? What results do you get?
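As a rough local cross-check of the same question, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt content and the example.com URLs are placeholders, and the function name is made up. Python's parser does not necessarily interpret every rule the same way Google does (wildcards in particular), so treat the WMT result as the one that counts.

```python
from urllib.robotparser import RobotFileParser

def is_crawl_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text allows user_agent to fetch url,
    according to Python's standard-library parser (not Google's own)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# Placeholder rules for illustration only; paste in your real robots.txt.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    for url in ("http://example.com/private/page.html",
                "http://example.com/public/page.html"):
        print(url, is_crawl_allowed(SAMPLE_ROBOTS, "Googlebot", url))
```

If this says a URL is disallowed but you still see requests for it in your logs, that points towards case 1) above.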