I want to remove one of our domains from Google because G has indexed pages (supplemental) that are almost identical to pages on our new site on another domain. I fear this duplication may be hurting the new site's ranking. We'll probably reuse that domain name within the next year for a new product line.
We've had robots.txt set to disallow crawling of everything on that site since last spring or summer, but it's still in the G index.
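For reference, a blanket-disallow robots.txt like the one described would look something like this (a minimal sketch; the actual file on the site is assumed, not quoted):

```
# robots.txt at the domain root -- tells all well-behaved bots
# not to crawl any path on this site
User-agent: *
Disallow: /
```

Note that this blocks crawling, not indexing, which is part of why URLs that are already indexed (or that have inbound links) can linger in the results.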
The G removal tool says, "Enter the URL of your page. We will accept your request only if the page no longer exists on the web."
What does "no longer exists on the web" mean? That domain really does exist on the web, that is, you can go to it. Does the statement mean that it won't remove the site if the domain is still in the DNS?
Neither robots.txt nor the removal tool is a reliable, safe way to deal with the problem, particularly if links to the pages in question exist elsewhere.
If you want to use the robots removal option, then you instead feed Google the URL of a valid robots.txt file for it to process.
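Before feeding the removal tool a robots.txt URL, it's worth sanity-checking that the file actually parses and disallows what you think it does. A minimal sketch using Python's standard-library urllib.robotparser (the rules shown are an assumption, not the poster's actual file):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical blanket-disallow rules, as they might appear in robots.txt
robots_txt = """\
User-agent: *
Disallow: /
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# Googlebot should be blocked from every URL on the site
print(rp.can_fetch("Googlebot", "http://example.com/old-page.html"))  # → False
```

RobotFileParser can also fetch a live file via set_url() and read(), so the same check works against the deployed robots.txt rather than a local string.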
In any case, these processes do not properly deal with results that are already tagged as Supplemental in Google's SERP.
I don't understand why G just doesn't obey the robots.txt and sitemaps files. It would certainly make their life easier and keep the index up to date if they did.
The pages have been restricted in robots.txt for almost a year, yet Google refuses to omit them.
I'll submit a sitemap to them with nothing in it and see what happens.
Okay, I submitted the robots.txt to be re-spidered by G.
I was just at G Sitemaps and registered a sitemap for that URL with zero URLs.
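For anyone trying the same experiment: a sitemap with zero URLs is just the container element with no url entries. Something like this should validate against the sitemap protocol (the file name and location are assumptions):

```
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
</urlset>
```

Whether Google treats an empty sitemap as a removal signal is exactly what this test will show; the protocol itself only defines what a sitemap lists, not what its omissions mean.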
Let's see if these work.
My guess is that they won't remove pages already indexed, cached, or in the supplementals. Why? In the past, I submitted a Sitemap that omitted old pages; months later they still existed somewhere in the indexes.
This option also allows you to create a dedicated robots.txt-format file at some address other than the standard root robots.txt -- note Google's wording: "Your robots.txt file need not be in the root directory".
I never tried that, but I can imagine how someone might want it in some situations. However, in your situation you never want bots to spider your dev server at all, so you probably do want to use the standard root robots.txt for the URL removal.
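To illustrate the non-root option mentioned above: the removal tool will reportedly accept a robots.txt-format file living anywhere on the site, so a one-off removal file could be kept separate from the live root rules. A sketch (the path and URLs are hypothetical):

```
# Hypothetical file at http://example.com/removals/removal-rules.txt,
# submitted to the removal tool instead of the root robots.txt
User-agent: Googlebot
Disallow: /old-catalog/
Disallow: /duplicate-page.html
```

But as noted, for a site you never want crawled at all, a blanket Disallow in the standard root robots.txt is the simpler route.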