Msg#: 3207594 posted 4:21 pm on Jan 2, 2007 (gmt 0)
I want to remove one of our domains from Google because G has pages indexed (supplemental) that are almost the same for our new site on another domain. I fear that it may be causing a low ranking of the new site. We'll probably be using that domain name within the next year for a new product line.
We've had robots.txt set to disallow everything on that site since last spring or summer, but it's still in the G index.
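For reference, a blanket disallow like the one I'm describing is just a two-line robots.txt at the site root (this is a generic sketch, not our actual file) that tells all compliant crawlers to stay away from every path:

```
User-agent: *
Disallow: /
```

Note that this only blocks crawling; pages that were already indexed can stay in the index until something like the removal tool clears them out.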
The G removal tool says, "Enter the URL of your page. We will accept your request only if the page no longer exists on the web."
What does "no longer exists on the web" mean? That domain really does exist on the web, that is, you can go to it. Does the statement mean that it won't remove the site if the domain is still in the DNS?
Msg#: 3207594 posted 4:35 pm on Jan 4, 2007 (gmt 0)
There's another option in the url removal tool -- one that asks googlebot to re-spider your robots.txt and remove whatever the file disallows. I've used it twice since November and it was trouble-free for me.
Msg#: 3207594 posted 5:01 pm on Jan 4, 2007 (gmt 0)
Okay, I submitted the robots.txt to be re-spidered by G.
I was just at G Sitemaps and registered a sitemap for that URL with zero urls.
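For anyone wondering what a sitemap with zero urls looks like, it's just an empty urlset element per the sitemaps.org protocol (this is a generic sketch, not my actual file):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
</urlset>
```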
Let's see if these work.
My guess is that they won't remove pages that are already indexed, cached, or in the supplementals. Why? In the past I submitted a Sitemap that omitted old pages, and months later those pages still existed somewhere in the indexes.
Msg#: 3207594 posted 5:23 pm on Jan 4, 2007 (gmt 0)
The sitemap will not remove urls not found on it, as you supposed. But if you put up a new robots.txt AND used the part of Google's url removal tool that says "Remove pages, subdirectories or images using a robots.txt file," then you are on your way.
This option also allows you to create a dedicated robots.txt file at an address other than the standard root robots.txt -- note "Your robots.txt file need not be in the root directory".
I've never tried that, but I can imagine how someone might want it in some situations. However, in your situation you never want bots to spider your dev server, so you probably do want to use the standard robots.txt file to do the url removal.
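One thing worth doing before you submit the removal request is confirming that your robots.txt really does disallow the urls you want gone. A quick sanity check (my own sketch, not part of Google's tool) using Python's standard-library robots.txt parser, fed a hypothetical blanket-disallow file:

```python
# Sanity-check a robots.txt before submitting a removal request.
# urllib.robotparser is in the Python standard library.
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
# parse() accepts the file's lines, so you can test a draft locally
# before uploading it to the server. These two lines disallow everything.
rp.parse(["User-agent: *", "Disallow: /"])

# can_fetch() returns False when the named agent is blocked from the url.
# example.com and the page path here are placeholders.
print(rp.can_fetch("Googlebot", "http://example.com/any-page.html"))  # False
print(rp.can_fetch("*", "http://example.com/"))  # False
```

If can_fetch returns True for a url you expected to be blocked, fix the robots.txt first -- the removal tool will only drop what the file actually disallows.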