Forum Moderators: Robert Charlton & goodroi
If I search for bits of this text, the site still comes up in the results. There is no cached copy, and clicking through shows a maintenance mode message.
The Maintenance 200 module allows a site to return a Status code of 200 rather than the default 503 (Service Unavailable) code.
"But wait," you ask, "why would I want that? The site is truly in a 503 state and should report that." The reason you'd want to return a 200 is so that your CDN or caching layer (e.g., Varnish) will cache the maintenance page and serve it to new requests rather than passing the request down to your origin server.
Admittedly, this is kind of a double-edged sword, since once the page is in cache you'll have to flush your cache to bring the site back up...
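To make the 503-versus-200 trade-off described above concrete, here is a minimal sketch in Python. It is not the module's code: the function name `maintenance_response`, the header values, and the one-hour retry hint are all assumptions made for illustration. The only facts taken from the quote are that 503 is the default and that a 200 lets a cache store the page.

```python
from typing import Dict, Tuple


def maintenance_response(
    cacheable: bool = False, retry_after_seconds: int = 3600
) -> Tuple[int, Dict[str, str], str]:
    """Build a (status, headers, body) triple for a maintenance page.

    Hypothetical helper for illustration only.
    """
    body = "Site under maintenance. Please check back soon."
    headers = {"Content-Type": "text/plain; charset=utf-8"}

    if cacheable:
        # 200 lets a CDN or Varnish cache the page, but crawlers will
        # see the maintenance text as if it were normal content.
        headers["Cache-Control"] = "public, max-age=60"
        return 200, headers, body

    # 503 signals a temporary outage; Retry-After hints when to return.
    headers["Retry-After"] = str(retry_after_seconds)
    headers["Cache-Control"] = "no-store"
    return 503, headers, body
```

Called with no arguments, the helper returns the 503 variant. Passing `cacheable=True` gives the 200 variant, which is the behavior that can leave a maintenance message attached to URLs in search results.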
My question is: how do I completely remove all traces of this from Google's index? Do I have to submit a removal request for each page in the webmaster console?
- The URL removal tool is not meant to be used for normal site maintenance like this. This is part of the reason why we have a limit there.
- The URL removal tool does not remove URLs from the index, it removes them from our search results. The difference is subtle, but it's a part of the reason why you don't see those submissions affect the indexed URL count.
...if you have the ability to use a 410 for content that's really removed, that's a good practice.
For large-scale site changes like this, I'd recommend:
- don't use the robots.txt
- use a 301 redirect for content that moved
- use a 410 (or 404 if you need to) for URLs that were removed
- make sure that the crawl rate setting is set to "let Google decide" (automatic), so that you don't limit crawling
- use the URL removal tool only for urgent or highly visible issues.
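The status-code part of the list above could be sketched roughly like this in Python. The paths, the three lookup tables, and the function name `resolve` are made up for illustration; a real site would do this in its server or CMS configuration rather than in a standalone function.

```python
from typing import Optional, Tuple

# Hypothetical example data: none of these paths come from the thread.
MOVED = {"/old-page": "/new-page"}      # content that moved
REMOVED = {"/discontinued-product"}     # content that is really gone
LIVE = {"/", "/new-page"}               # content that still exists


def resolve(path: str) -> Tuple[int, Optional[str]]:
    """Return (status code, redirect target or None) for a request path."""
    if path in MOVED:
        return 301, MOVED[path]   # permanent redirect to the new URL
    if path in REMOVED:
        return 410, None          # intentionally and permanently gone
    if path in LIVE:
        return 200, None
    return 404, None              # unknown URL, no further detail given
```

For example, `resolve("/old-page")` gives `(301, "/new-page")` and `resolve("/discontinued-product")` gives `(410, None)`. Anything not listed falls through to 404, which matches the "or 404 if you need to" fallback in the list.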
The 410 (Gone) status code SHOULD be used if the server knows, through some internally configurable mechanism, that an old resource is permanently unavailable and has no forwarding address. This (404) status code is commonly used when the server does not wish to reveal exactly why the request has been refused, or when no other response is applicable.
The 410 response is primarily intended to assist the task of web maintenance by notifying the recipient that the resource is intentionally unavailable and that the server owners desire that remote links to that resource be removed.