This will not be instant. It will take time to get the pages removed, just like it took time to get them in there in the first place. If you have a properly formatted robots.txt file in the root of that subdirectory, the pages will disappear from the index over the next few months. However, you will still be able to see the pages if you type the exact URL into Google. They will just be text links with no description. This is how the removal normally works.
Hi Mark -
But in the past I've had good results with the URL exclusion tool. It takes a few days and the pages disappear, as long as I've excluded them in the robots.txt. Changing robots.txt and *waiting* would take months, whereas the URL exclusion tool acts quickly. In this case, though, it did not follow the instructions in robots.txt.
Robots.txt should always be in the root folder of the website, as per the standard. Robots.txt files in other directories will not be requested by bots/spiders.
Also, you haven't given an example of the exclusions you have in your robots.txt file, but I suspect that you haven't done them correctly.
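To illustrate the point about location, here is a minimal Python sketch of how a crawler derives the one place it looks for robots.txt: it keeps the scheme and host of the page URL and replaces the path with /robots.txt. The function name `robots_url` and the example.com hostnames are assumptions made for illustration; they are not taken from the site being discussed.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the root-level robots.txt URL for the host serving page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host (including any subdomain) and discard the
    # page's own path, query, and fragment: only /robots.txt is requested.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# A page deep inside a subdirectory still maps to the root robots.txt.
print(robots_url("http://www.example.com/subdir/page.html"))
# A subdomain is a separate host, so it gets its own root robots.txt.
print(robots_url("http://sub.example.com/page.html"))
```

Note that the subdirectory never appears in the result, while a different hostname yields a different robots.txt URL.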
They should be in the format of
I suspect you probably have something like
which is not valid, as all items have to start with /
Also, you seem to be confusing subdomain and subdirectory. They are not one and the same thing, and this could be part of your problem.
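The leading-slash rule can be checked mechanically. The following is a rough Python sketch, not an official validator: it flags any Allow or Disallow line whose path is non-empty and does not start with /. The function name `find_bad_rules` and the `/private/` paths in the sample are hypothetical and are not the poster's actual file.

```python
from typing import List


def find_bad_rules(robots_txt: str) -> List[str]:
    """Return Allow/Disallow lines whose path does not start with '/'."""
    bad = []
    for raw in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        value = value.strip()
        if field.strip().lower() in ("allow", "disallow"):
            # An empty value is legal (nothing is disallowed), so only
            # non-empty paths without a leading slash are flagged.
            if value and not value.startswith("/"):
                bad.append(line)
    return bad


# Hypothetical sample: the second Disallow is missing its leading slash.
sample = """\
User-agent: *
Disallow: /private/
Disallow: private/
"""
print(find_bad_rules(sample))
```

An empty list means every rule passed this one check. It does not confirm that the file is being served from the right place.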
Do you see the robots.txt file when you enter
[edited by: ThomasB at 10:09 pm (utc) on Sep. 20, 2005]
[edit reason] examplified [/edit]
Hi D -
We are trying to delete all content indexed in this subdir:
We placed the robots.txt in the subdomain here:
The robots.txt has only these two lines of text:
[edited by: ThomasB at 10:10 pm (utc) on Sep. 20, 2005]
[edit reason] examplified [/edit]
Your robots.txt is correct.
Maybe it just takes a little bit of time for it to take effect.
Give it a few days, and if your pages are still listed by Google, then contact them.
The problem is that with URL exclusion it should take only a few days: Google follows the new robots.txt instructions, you get a removal "complete" message, and the pages are gone. I've used it many times for subdirectories, but never on a subdomain as here.
I got the "complete" message and they are not gone, which means the robots.txt is not being followed correctly.
I'm sure you're aware that Google is not one big computer. I think the last I heard, it was 170,000 computers distributed all over the world. As a result, it takes time to 'roll out' updates to all of these machines.
This is the same cause behind the hundreds of posts we see when an update is occurring (or even rumored) that say "My site is in one minute and out the next -- I'm worried!" The reason is that with load-sharing and round-robin DNS, you never know just what server a Google domain name will resolve to; one minute you connect to one machine, the next minute to an entirely different one. And updating them takes time.
The fact that this is a subdomain hosted in a subdirectory should have nothing to do with it. After all, "www.example.com" is a subdomain of example.com, and there are many "www" sites that seem to work just fine... :)
I'd give this a few more days, and then see where you stand.
Thx Jim and everybody...
The listings are gone today so I'm happy.
I still think it was odd that it took two requests and about 2 weeks for the URLs to disappear, but I'm happy now.
Also, it appears some aspects of the "duplicate content" filter are now gone, though our Google traffic has not come back noticeably.