Forum Moderators: Robert Charlton & goodroi
This action leads to the following behaviour:
- After 5 months there are still no pages in the index. When we removed the website, it was said that this "will cause a temporary, 90 day removal". Some time later, Google increased this period to "180 days". One would expect that only websites removed after the increase would be affected. However, our website is affected too.
- In the past, websites which excluded Googlebot from crawling their pages were still in the index. The results just appeared as URL-only entries, but they could be found due to incoming links and their anchor text. However, now the site doesn’t exist in the index any more. Even searching for the domain name doesn't bring it up. Also, in the past PR was passed to pages from such websites, while now these pages have PR0.
- In the past there was no effect on the directory entry. However, now the domain has been removed from Google's directory.
The consequences of removing the entire website were not only different than expected; one could also use this behaviour to harm other websites (if you have access, e.g. if you want to hurt a client site). Just change the robots.txt and use the automatic URL removal system. Change the robots.txt back after one or two days. The website will be removed for (at least) half a year and it will be hard to find the reason. More time will be needed until the original situation (all pages indexed and with PR) is restored.
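Since a short-lived robots.txt change like this would be hard to trace afterwards, one defensive option is to check the file regularly and raise an alarm when it shuts Googlebot out. The following is only a minimal sketch using Python's standard `urllib.robotparser`; the function name `blocks_googlebot` is made up for illustration, and the parser follows the generic robots.txt rules, which may not match Googlebot's own interpretation in every detail.

```python
from urllib.robotparser import RobotFileParser


def blocks_googlebot(robots_txt: str, path: str = "/") -> bool:
    """Return True if the given robots.txt text disallows Googlebot for `path`.

    Hypothetical helper: it only evaluates the rules with the standard-library
    parser and says nothing about how Google actually treats the site.
    """
    parser = RobotFileParser()
    # parse() expects an iterable of lines, not one big string.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch("Googlebot", path)


if __name__ == "__main__":
    # A robots.txt that excludes Googlebot from the whole site.
    suspicious = "User-agent: Googlebot\nDisallow: /"
    if blocks_googlebot(suspicious):
        print("Warning: robots.txt blocks Googlebot from the site root")
```

Run from a scheduled job against a freshly fetched copy of the file, a check like this would at least leave a record of when the exclusion appeared, even if the file is changed back a day or two later.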
To avoid such problems, I would suggest that Google change their policy and reinclude a website if the robots.txt is changed back. Also, I would prefer that excluding Googlebot not lead to removal of the directory entry.
Is not something recommended by the robots.txt [google.com] standard [robotstxt.org]. We have some ancillary evidence that it may be confusing some search engines and causing indexing problems. Its usage is not recommended.
[robotstxt.org...]
I would be surprised if the url only entries did not pop back up in the index.
This was what I expected. In the past one could find URL-only entries of sites which banned Googlebot. (I would be thankful for such behaviour - I never wanted a "complete removal".) However, all information about this domain has been removed - even the directory entry.