
Custom 404 page cached in search engines


5:02 pm on Nov 18, 2005 (gmt 0)

10+ Year Member

We just re-launched our portal with completely new file names. Our custom 404 page is served for any request for an old file. Now I notice that MSN, for example, is caching our 404 page! If this continues, we will end up with thousands of pages in the search engine indexes, and half of them will be identical 404 pages...

Please help me prevent that; we must have done something wrong when setting it up. Users are being redirected to the 404 page. Is that the correct way to do it?

How does a typical crawler react to a 404 page? And how does it react to a 500 page? Would these two responses normally cause the file to be removed from the index?

Very grateful for any help you can give. Have a nice day!


5:07 pm on Nov 18, 2005 (gmt 0)

WebmasterWorld Senior Member encyclo is a WebmasterWorld Top Contributor of All Time 10+ Year Member

Is the server really returning a 404 Not Found status, or is it actually returning a 302 Found or some other status? If the server is returning a 302 or a 200, then the spider may well cache the page.

Try the server header check tool [webmasterworld.com] for a non-existent page on your site to see what is happening.
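For anyone without access to a header check tool, the same check can be sketched in a few lines of Python. This is only an illustration under assumptions: `fetch_status` is a hypothetical helper name, and the URL in the usage comment is a placeholder, not the poster's site. The sketch uses `http.client`, which does not follow redirects, so a 302 shows up as a 302 instead of being hidden behind the final page.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, timeout=10.0):
    """Request `url` once, without following redirects.

    Returns (status_code, reason, location_header_or_None).
    """
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        # A plain GET, as a crawler would send; the User-Agent is made up.
        conn.request("GET", path, headers={"User-Agent": "header-check-sketch"})
        resp = conn.getresponse()
        return resp.status, resp.reason, resp.getheader("Location")
    finally:
        conn.close()


# Usage (placeholder URL, pick a path that does not exist on your site):
#   status, reason, location = fetch_status("http://www.example.com/no-such-page.html")
#   print(status, reason, location)
```

If the first value is 301 or 302 and the third value points at the error page, the missing file is being redirected instead of answered with a 404.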


5:51 pm on Nov 18, 2005 (gmt 0)

10+ Year Member

Many thanks for such a quick reply!
This is the response from the header check:

HTTP/1.1 404 Not Found
Content-Length: 6833
Content-Type: text/html
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
Date: Fri, 18 Nov 2005 17:45:23 GMT
Connection: keep-alive

So everything looks ok to me. Now, what is the next step in trying to solve this problem?
Thanks again, I am very grateful.


7:13 pm on Nov 20, 2005 (gmt 0)

10+ Year Member


I just hope that someone else here has an idea about why search engines cache our custom 404 page. Ciao!


7:20 pm on Nov 20, 2005 (gmt 0)

WebmasterWorld Senior Member pageoneresults is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

Did you do the server header check on the 404 page? Or, on a non-existent page? The non-existent page is probably returning a 404 and the custom 404 page is probably returning a 200. This is usually the case 8 out of 10 times.
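The distinction drawn above can be written down as a small rule of thumb. The following is a rough sketch, not a definitive classifier: `diagnose` and its labels are made-up names, and the input is simply the list of status codes seen, in order, when a non-existent URL is requested (redirect hops first, final response last).

```python
REDIRECTS = (301, 302, 303, 307, 308)


def diagnose(statuses):
    """Classify what a crawler sees when it requests a missing URL.

    `statuses` is the ordered list of HTTP status codes for that request,
    e.g. [404] or [302, 200].
    """
    if not statuses:
        raise ValueError("need at least one status code")
    first, final = statuses[0], statuses[-1]
    if first in (404, 410):
        # The missing URL itself answers "not found": nothing to index.
        return "hard-404"
    if first in REDIRECTS:
        if final == 200:
            # Redirected to an error page that answers 200: the error
            # page looks like ordinary content and may be cached.
            return "redirect-to-200"
        return "redirect"
    if first == 200:
        # Error text delivered with a success status.
        return "soft-404"
    return "other"
```

Under these assumptions, `diagnose([404])` gives `"hard-404"`, while `diagnose([302, 200])` gives `"redirect-to-200"`, which is the pattern that tends to end up in a search engine cache.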


7:48 pm on Nov 20, 2005 (gmt 0)

10+ Year Member

pageoneresults, you are absolutely right! Thanks a lot. So all we need to do is to block robots from indexing the page, just like one does with normal pages?

<META NAME="msnbot" CONTENT="noarchive"> for MSN


<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"> Another way to prevent indexing

Plus a robots.txt file of course.

Many thanks!


10:13 am on Nov 27, 2005 (gmt 0)

10+ Year Member

I am back... It seems the search engines ignore the META tags mentioned in my previous post. Instead of seeing a decreasing number of pages in Google, an allinurl query brings back several hundred new pages each day.

If the custom 404 page brings back a 200 in the header check tool, how can we make sure it gives a 404? Users trying to view one of our old pages are redirected to the custom 404 page. Are there any other ways to serve that page than using redirects?

I cannot understand what we are doing wrong, and I am not knowledgeable about the technical details of the portal. Do any of you have any ideas? Thanks for any advice.
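As an illustration of serving the error page without a redirect, here is a minimal sketch in Python. It is not the IIS 6.0 / ASP.NET setup from this thread, only a demonstration of the principle: the custom error HTML is sent as the body of the response to the original URL, together with a 404 status line. `KNOWN_PAGES`, `NOT_FOUND_BODY`, `Handler`, and `make_server` are all hypothetical names.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical site content: only "/" exists.
KNOWN_PAGES = {"/": b"<html><body>Home</body></html>"}

# Hypothetical custom error page, kept in memory for the sketch.
NOT_FOUND_BODY = b"<html><body>Sorry, that page has moved or no longer exists.</body></html>"


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = KNOWN_PAGES.get(self.path)
        status = 200
        if body is None:
            # No redirect: answer the requested URL itself with 404,
            # and put the custom error page in the response body.
            status, body = 404, NOT_FOUND_BODY
        self.send_response(status)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the sketch quiet


def make_server(port=0):
    """Build a server on localhost; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), Handler)


# Usage: make_server(8000).serve_forever()
```

With this arrangement a visitor still sees the friendly error page, but a crawler requesting an old file name gets a 404 for that exact URL and no `Location` header to follow.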

