Google WMT reports custom 404 page as a soft-404

The site is on a virtual server: all flat files, static HTML.

For years I've had a simple 662-byte custom 404 page working fine, with no funny business whatsoever: no meta refreshes, just a search box and a few words.

Type a duff URL for our site, and up pops the custom /404.html page every time.

Today, in our Google Webmaster Tools console, I noticed my first ever "Soft 404" (meaning a page that returns a 200 server response instead of a genuine 404 "page not found" response). Just the one.

The page Google is showing in crawl errors as a "soft 404" is my custom /404.html page, thus:

Crawl errors: Soft-404
www.mysite.tld/404.html 404-like content May 11, 2011

The 404.html page is of course NOT linked to from anywhere on my site, and it has always carried a meta name="robots" content="noindex, noarchive, nofollow" tag to keep spiders from indexing it.

In my root .htaccess file there's always been the directive:

ErrorDocument 404 /404.html
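
For anyone wanting to see the two cases side by side, here is a toy sketch in Python. It is not Apache; it is a hypothetical local server (the names SketchHandler, status_of and demo are made up for illustration) that mimics what a directive like the one above normally does with a local error document: a missing URL gets the error page body with a 404 status, while a direct request for /404.html finds a real file and gets a 200.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Stand-in body for the real /404.html (the actual page content is not shown here).
ERROR_PAGE = b"<html><body>Page not found</body></html>"


class SketchHandler(BaseHTTPRequestHandler):
    """Toy model of a static site using 'ErrorDocument 404 /404.html'."""

    def do_GET(self):
        if self.path == "/404.html":
            # The file exists on disk, so a direct request succeeds.
            self._send(200)
        else:
            # In this sketch every other path is missing:
            # same body as the error page, but with a 404 status.
            self._send(404)

    def _send(self, status):
        self.send_response(status)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(ERROR_PAGE)))
        self.end_headers()
        self.wfile.write(ERROR_PAGE)

    def log_message(self, *args):
        pass  # keep the sketch quiet


def status_of(host, port, path):
    """Return the HTTP status code for a GET request to path."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        response.read()
        return response.status
    finally:
        conn.close()


def demo():
    """Start the toy server and return (missing-page status, /404.html status)."""
    server = HTTPServer(("127.0.0.1", 0), SketchHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    try:
        missing = status_of(host, port, "/no-such-page.html")
        direct = status_of(host, port, "/404.html")
        return missing, direct
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

To check a live site, status_of can be pointed at the real hostname and port 80 instead of the toy server.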

Additionally, I've always disallowed all bots from /404.html via robots.txt:

User-agent: *
Disallow: /404.html
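
A quick way to double-check that these two lines really do block Googlebot is Python's standard urllib.robotparser module. This is only a sketch: it tests just the two lines quoted above (the full robots.txt may hold other rules), and the /index.html URL is a made-up example of an ordinary page.

```python
from urllib import robotparser

# Only the two lines quoted in the post; the full robots.txt may contain more.
ROBOTS_TXT = """\
User-agent: *
Disallow: /404.html
"""


def is_allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The error page itself should be off limits to every bot.
    print(is_allowed("Googlebot", "http://www.mysite.tld/404.html"))
    # A hypothetical ordinary page should still be fetchable.
    print(is_allowed("Googlebot", "http://www.mysite.tld/index.html"))
```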

So Googlebot should never have crawled that page DIRECTLY, but it did. Here are the relevant log entries:

66.249.72.74 - - [11/May/2011:23:42:59 -0400] "GET /robots.txt HTTP/1.1" 200 1221 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.72.74 - - [11/May/2011:23:42:59 -0400] "GET /404.html HTTP/1.1" 200 662 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
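
To hunt for other direct hits like this one, a short Python sketch can scan the access log. It assumes the log uses the combined format seen in the two entries above; the function names parse_line and direct_error_page_hits are made up for illustration.

```python
import re

# Combined log format, as seen in the two entries quoted above.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def parse_line(line):
    """Parse one log line into a dict, or return None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


def direct_error_page_hits(lines, error_page="/404.html"):
    """Return entries where the error page itself was requested and served with 200."""
    hits = []
    for line in lines:
        entry = parse_line(line)
        if entry and entry["path"] == error_page and entry["status"] == 200:
            hits.append(entry)
    return hits


SAMPLE = [
    '66.249.72.74 - - [11/May/2011:23:42:59 -0400] "GET /robots.txt HTTP/1.1" 200 1221 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.72.74 - - [11/May/2011:23:42:59 -0400] "GET /404.html HTTP/1.1" 200 662 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

if __name__ == "__main__":
    for hit in direct_error_page_hits(SAMPLE):
        print(hit["ip"], hit["time"], hit["status"], hit["size"])
```

On a real log, the lines would come from an open file object rather than the SAMPLE list.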

It seems Google now expects the true URL of a custom 404 page to return a 404 response itself.

What lunacy is this?

Angonasec

g1smd