We have a subdirectory of the form subdirectory.oursite.com.
Google has indexed many dynamic URLs under subdirectory.oursite.com and we want to get all of these out of the index.
We put a Googlebot disallow in robots.txt and placed it in the subdirectory.
Then we submitted a removal request through Google's URL exclusion tool.
We got a "completed" message from Google, but the pages are still all there!
In the past I've had good results with the URL exclusion tool: it takes a few days and the pages disappear, as long as I've excluded them in robots.txt. Changing robots.txt and *waiting* would take months, whereas the URL exclusion tool acts quickly. In this case, though, it did not follow the instructions in robots.txt.
Also, you haven't given an example of the exclusion rules in your robots.txt file, but I suspect that you haven't written them correctly.
They should be in this format:
disallow: /directory/startofthingtoexclude
I suspect you probably have something like
disallow: startofthingtoexclude
which is not valid, as all paths have to start with a / (see the sketch below).
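To illustrate, here's a minimal sketch using Python's standard urllib.robotparser (the rule paths are the hypothetical examples above, not anyone's actual file). A Disallow path without a leading / never matches, because the URL paths being tested always begin with /:

    from urllib import robotparser

    # Two hypothetical rule sets, mirroring the valid and invalid variants above.
    valid = ["User-agent: Googlebot", "Disallow: /directory/startofthingtoexclude"]
    invalid = ["User-agent: Googlebot", "Disallow: startofthingtoexclude"]  # no leading /

    for rules in (valid, invalid):
        rp = robotparser.RobotFileParser()
        rp.parse(rules)
        url = "http://example.com/directory/startofthingtoexclude/page.html"
        blocked = not rp.can_fetch("Googlebot", url)
        print(rules[1], "->", "blocked" if blocked else "NOT blocked")

Run it and the first rule reports "blocked" while the second reports "NOT blocked" -- the slash-less rule silently does nothing.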
Also, you seem to be confusing subdomains and subdirectories; they are not one and the same thing, and this could be part of your problem.
Do you see the robots.txt file when you enter
[subdirectory.example.com...]
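One quick way to check is to fetch the file from the root of the subdomain and confirm it returns a 200 along with the rules you expect. A minimal sketch with Python's standard urllib (subdirectory.example.com is just the placeholder from this thread -- substitute your real host):

    import urllib.request

    # The file must be served from the root of the subdomain itself.
    url = "http://subdirectory.example.com/robots.txt"
    with urllib.request.urlopen(url) as resp:
        print(resp.status)  # expect 200
        print(resp.read().decode("utf-8", errors="replace"))  # expect your rules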
We are trying to delete all content indexed in this subdir:
events.oursite.com
We placed the robots.txt in the subdomain here:
[events.example.com...]
The robots.txt has only these two lines of text:
User-agent: Googlebot
Disallow: /
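As a sanity check, those two lines do block every path for Googlebot while leaving other agents untouched; here's a minimal sketch confirming that with Python's urllib.robotparser (events.example.com is the placeholder used above):

    from urllib import robotparser

    rp = robotparser.RobotFileParser()
    rp.parse(["User-agent: Googlebot", "Disallow: /"])

    # Googlebot is blocked from every path on the host...
    print(rp.can_fetch("Googlebot", "http://events.example.com/any/page.html"))  # False
    # ...while agents with no matching record fall through to "allowed".
    print(rp.can_fetch("OtherBot", "http://events.example.com/any/page.html"))   # True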
I got the "complete" message and they are not gone which means the robots.txt is not getting followed correctly.
This is the same cause behind the hundreds of posts we see when an update is occurring (or even rumored) saying "My site is in one minute and out the next -- I'm worried!" The reason is that with load-sharing and round-robin DNS, you never know which server a Google domain name will resolve to: one minute you connect to one machine, and the next minute to an entirely different one. Updating all of them takes time.
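You can see this for yourself: a load-balanced hostname resolves to several addresses, so consecutive requests may land on different machines. A minimal sketch with Python's socket module (www.google.com is just an example of a load-balanced host):

    import socket

    # One hostname, several addresses: consecutive connections may hit
    # different machines, each potentially serving a different index copy.
    hostname, aliases, addresses = socket.gethostbyname_ex("www.google.com")
    print(addresses)  # often more than one IP for a load-balanced host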
The fact that this is a subdomain hosted in a subdirectory should have nothing to do with it. After all, "www.example.com" is a subdomain of example.com, and there are many "www" sites that seem to work just fine... :)
I'd give this a few more days, and then see where you stand.
Jim
The listings are gone today so I'm happy.
I still think it's odd that it took two requests and about two weeks for the URLs to disappear, but I'm happy now.
Also, it appears some aspects of the "duplicate content" filter are now gone, though our Google traffic has not come back noticeably.