www.example.com/product is the current site, which is well indexed and highly ranked.
www.example.com/test/product .. if I put nofollow and noindex on every URL that starts with www.example.com/test/, will that harm my regular site at all, or is it fine? I need to make sure.
Thanks very much in advance.
[edited by: tedster at 7:44 pm (utc) on Sep. 19, 2008]
[edit reason] switch to example.com - it can never be owned [/edit]
To exclude volumes of content, and in particular a subdirectory, the Robots Exclusion Protocol is what you need. A robots.txt file can exclude all spiders from your test directory with the content below:
User-agent: *
Disallow: /test/
Naturally, you're also well advised to ensure there are no links to your test content.
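If you want to double-check the two rules above before relying on them, here is a minimal sketch using Python's standard urllib.robotparser module. The helper name is_crawlable and the sample URLs are only illustrative assumptions; real crawlers may interpret edge cases a little differently, so treat this as a sanity check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# The two rules from the robots.txt shown above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /test/
"""


def is_crawlable(url: str, user_agent: str = "*") -> bool:
    """Return True if the rules above let user_agent fetch url.

    is_crawlable is a hypothetical helper name used for illustration.
    """
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.example.com/product",
        "http://www.example.com/test/product",
    ):
        status = "allowed" if is_crawlable(url) else "blocked"
        print(url, "->", status)
```

The regular /product URL should come back as allowed and anything under /test/ as blocked, which is the separation you are after.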
Some people (myself included) prefer password-protecting development pages via HTTP authentication. Excluding spiders keeps the pages out of the index, but it doesn't stop them being publicly available.
If you're split testing or similar, robots exclusion is the way to go. You can also use a meta robots element if it's only a few pages, e.g. <meta name="robots" content="none"> (which is equivalent to "noindex, nofollow").
So knowing that, what do you advise, and how confident are you that the regular, real site won't get penalized for having a /test section of duplicate pages showing Yahoo PPC feeds, for example?
Thanks so much.
In addition, the vast majority of pay-per-click URLs (Yahoo's included) go through a redirect that is itself robots-excluded, so it is quite hard for search engines to even discover the URLs used there. They can still pick them up occasionally, but that won't affect your organic search results (as long as the test is conducted on excluded pages).