Forum Moderators: open
More than a month ago I removed three pages from a client's website because they no longer offer a specific service. As yet Google has not removed them from its index, and they still return when using site:***.****.com or a search term relevant to the removed pages.
Should I include these pages in my robots.txt?
*doh!* They are coming back as HTTP/1.1 200 OK. We have a custom 404 page running on the site, and this may be causing the problem. Is there a way of using a custom 404 page and still returning a 404 status with IIS?
I also read on Google that you can include pages you don't want indexed in your robots.txt even if they don't exist on the site.
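For reference, a Disallow rule for such pages would look like this (the paths here are hypothetical). Note that disallowing a URL only stops it from being crawled; it does not by itself remove a page that is already in the index.

```
User-agent: *
Disallow: /old-service.asp
Disallow: /old-service-pricing.asp
```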
Is there a way of using a custom 404 page and still returning a 404 status with IIS?
Yes. It either needs to be done at the server level in IIS, or you can insert the following ASP code above the <html> tag on the 404 page:

<%@ Language=VBScript %>
<%
Response.Status="404 Not Found"
%><html>

Insert the above in your 404.asp and you'll be good to go. Always double-check that your 301s, 302s and 404s are returning the proper server header status. Note that the above code will only work on an .asp page.
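The same idea, serving a custom error body while still sending a real 404 status, can be sketched outside IIS. This is a minimal Python illustration (handler and page names are made up), which also shows how to double-check the status code a client actually receives:

```python
# Sketch: serve a custom "not found" page with a proper 404 status,
# then verify the status line a client (or a search engine bot) sees.
import http.server
import threading
import urllib.error
import urllib.request

class Custom404Handler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        # Custom error-page content, but with the correct 404 status code
        # (the mistake described above is sending this body with 200 OK).
        body = b"<html><body>Sorry, that page is gone.</body></html>"
        self.send_response(404)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Silence per-request logging for this demo.
        pass

# Bind to an ephemeral local port and serve in the background.
server = http.server.HTTPServer(("127.0.0.1", 0), Custom404Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

url = "http://127.0.0.1:%d/removed-page" % server.server_address[1]
try:
    urllib.request.urlopen(url)
    status = 200
except urllib.error.HTTPError as e:
    status = e.code  # the header status a bot would record

server.shutdown()
print(status)  # 404, even though a custom page body was returned
```

The check at the bottom is the important habit: request the URL and look at the status line, not the page body.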
I had one removed in less than 12 hours in November, but then I explicitly asked for it:
[webmasterworld.com...]
>> robots.txt
No, don't do that - Googlebot needs to be able to spider the page; otherwise it will never see the tag that you've put there:
<meta name="robots" content="noindex">
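For completeness, the tag goes in the <head> of each page you want dropped from the index, and the page must stay crawlable (not blocked in robots.txt) so the bot can fetch it and read the tag. A minimal page might look like this (the title is just an example):

```
<html>
<head>
<meta name="robots" content="noindex">
<title>Service no longer offered</title>
</head>
<body>This service is no longer available.</body>
</html>
```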