I created a copy of my site so that I can make some major changes without causing any downtime. As soon as I'm done with it I will replace the current site with this copy, but that might not be for a few weeks. Is there a chance that this 'copy' of my site will be crawled? If so, I don't want that to happen...not until it is ready to replace the current site.
If this is a possible scenario, what should I put in robots.txt to prevent it from happening?
As long as you use that subdirectory, whatever you name it, only to house the new version of your site, it doesn't matter.
Yes, spiders can definitely get into "invisible" subdirectories - I accidentally had a subdirectory called "new" containing a redesign of a large site completely spidered by Google, because the Google Toolbar "told on me".
User-agent: *
Disallow: /

Should stop spiders that pay attention to the robots.txt file.
I would not recommend the above. Once Googlebot sees a Disallow: / for an entire site, I think it may be a while before you get a regular crawl again. I would instead create a new subdirectory and Disallow only that path. That way the rule has no effect on the existing site, or on the spidering of the new site once it goes live.
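As a sketch of that approach, assuming the staging copy lives in a hypothetical subdirectory named /new/ (substitute whatever directory name you actually use), the robots.txt at the site root would look like this:

```
# Block all compliant crawlers from the staging copy only;
# the rest of the live site remains crawlable.
User-agent: *
Disallow: /new/
```

Note that the trailing slash keeps the rule scoped to that directory, and robots.txt is only advisory - well-behaved spiders like Googlebot honor it, but it is not access control.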
P.S. mcavill's answer was correct based on your question, which was how to prevent all spiders from indexing your site.
In this case though, we only want to prevent them from indexing the new site.