Forum Moderators: Robert Charlton & goodroi
For example, my site is www.example.com. We set up our testing grounds at http://development.example.com.
Now, when I run a site: command, I see some pages from the regular www subdomain AND some from the development subdomain.
I'm assuming this is an issue, but I can't redirect our development pages to our www pages as they are separate landing pages.
Any guidance here?
[edited by: tedster at 9:30 pm (utc) on May 30, 2006]
[edit reason] use example.com [/edit]
It is still in effect and is approaching 6 months. Filing reinclusion requests, emails to G, etc. have done absolutely no good. The duplicate content penalty appears to be automatic (for only 6 months hopefully).
So immediately, disallow the dev site in your robots.txt file and also put up noindex, nofollow metas. And pray that it is not too late. Unfortunately, it very well may be, based on my personal experience.
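For reference, a sketch of that robots.txt advice, assuming the dev site answers as its own hostname and therefore serves its own robots.txt at the root (paths are illustrative):

```
# Served at http://development.example.com/robots.txt
# Blocks all compliant crawlers from the entire dev host
User-agent: *
Disallow: /
```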
Send me a sticky mail once things sort out.
Unfortunately the above solution will not work. Avoid using the robots.txt file; instead, drop a Robots META Tag on the pages you don't want indexed, right after the opening <head>...
<head>
<meta name="robots" content="none">

or, the long version...

<head>
<meta name="robots" content="noindex, nofollow">

You can also put the /dev/ into a password protected environment. ;)
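A minimal sketch of that password-protection option, assuming an Apache server with an .htaccess file placed in the /dev/ directory (the realm name and AuthUserFile path here are placeholders; the .htpasswd file itself would be created with the htpasswd utility):

```
# .htaccess in the /dev/ directory -- anyone without a valid login,
# including crawlers, receives a 401 instead of the page content
AuthType Basic
AuthName "Development Area"
AuthUserFile /home/example/.htpasswd
Require valid-user
```

Since search engines can't authenticate, nothing behind this gets crawled or indexed at all, which sidesteps the duplicate-content question entirely.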
Why would that solution not work?
In short, when your robots.txt file includes a Disallow: instruction for a particular page and/or directory, Googlebot is blocked from fetching the page, but Google can still index the URI on its own (a URL-only listing) based on links pointing to it.
Removing that instruction and using the Robots META Tag instead allows Googlebot to crawl the page and read an instruction that is more specific, in this case noindex, nofollow, which keeps the page (and the URI-only listing) out of their index.