I would like to test things before Google can index it. Is the best way to achieve this by putting the following into robots.txt?
Would the site get penalized by Google in any way by disallowing indexing in the beginning?
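For reference, a robots.txt that blocks all compliant crawlers site-wide looks like the fragment below. This is only a sketch of the usual "block everything while testing" pattern, since the OP's actual rules weren't shown:

```
User-agent: *
Disallow: /
```

Placed at the site root (e.g. `/robots.txt`), this tells every well-behaved crawler not to request any URL on the host.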
I can't think of any subject for a website that could legitimately command so many pages.
Law-abiding robots, notably most search engines, will honor the Disallow, and the only thing better than a blocked request is a request that isn't made in the first place.
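The check a law-abiding robot performs before fetching a URL can be sketched with Python's standard `urllib.robotparser`. The rules below are hypothetical (the OP's actual file wasn't shown):

```python
from urllib import robotparser

# Hypothetical robots.txt content: block every crawler from the whole site.
rules = [
    "User-agent: *",
    "Disallow: /",
]

rp = robotparser.RobotFileParser()
rp.parse(rules)

# A compliant crawler calls this before making the request;
# with "Disallow: /" for all agents, the answer is always False.
print(rp.can_fetch("Googlebot", "https://example.com/any/page"))  # False
```

A crawler that skips this check (or ignores its result) is exactly the kind of robot the Disallow cannot stop.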
I don't think you are suggesting that the OP should allow search engines to crawl.
A robotted page can still be indexed if it is linked to from other sites:
While Google won't crawl or index the content blocked by robots.txt, we might still find and index a disallowed URL if it is linked from other places on the web. As a result, the URL address and, potentially, other publicly available information such as anchor text in links to the page can still appear in Google search results. To properly prevent your URL from appearing in Google Search results, you should password-protect the files on your server or use the noindex meta tag or response header (or remove the page entirely).
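Per that guidance, the noindex directive can be delivered either in the page's HTML or as an HTTP response header. Both forms below follow Google's documented syntax; note that for noindex to be seen at all, the page must not be disallowed in robots.txt, since a blocked page is never fetched:

```
<meta name="robots" content="noindex">
```

or, for non-HTML resources, the equivalent response header:

```
X-Robots-Tag: noindex
```

Password protection is stronger still, since it prevents both crawling and any public access to the content.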