

Google SEO News and Discussion Forum

Webmaster Tools shows home page 404, but it is available

 7:43 am on Jun 21, 2008 (gmt 0)

As the title suggests, in my GWT account, under 'Errors for URLs in Sitemaps', my home page is shown as 404 Not Found.

However, the site is all working fine and using the header check shows the following:

HTTP/1.1 200 OK
Date: Sat, 21 Jun 2008 07:43:18 GMT
Server: Apache/1.3.41
Last-Modified: Sun, 04 May 2008 22:12:25 GMT
ETag: "f0785b-d74-481e34c9"
Content-Length: 3444
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: text/html
X-Pad: avoid browser bug

Anyone know why this could be happening?


Receptional Andy

 10:32 am on Jun 21, 2008 (gmt 0)

Seems a bit strange. Has your site or hosting undergone any maintenance recently?


 10:33 am on Jun 21, 2008 (gmt 0)

I know they had an issue with PHP pages not returning 404 responses when they should; however, that has now been fixed. Other than that, nothing has happened on my site for a long time.

Receptional Andy

 10:51 am on Jun 21, 2008 (gmt 0)

When you first log in to Webmaster Tools, what date do you have in the section which reads "Googlebot last successfully accessed your home page on..."? Is that date before or after the sitemap error?


 11:15 am on Jun 21, 2008 (gmt 0)

Last accessed home page June 15

Error for 404 shows June 18?

Receptional Andy

 11:19 am on Jun 21, 2008 (gmt 0)

It seems to me that the two most likely options are a problem with your site when Google visited, or a typo or similar in the sitemap. It would be worthwhile checking the server logs to see what googlebot's actual activity was (and the specific request that caused a 404), and to see whether this affected any other visitors.


 11:39 am on Jun 21, 2008 (gmt 0)

Thanks Andy - will have a look into this.

Just a quick question: is this problem likely to affect my SERPs? I am worried that if Google thinks the page doesn't exist (which it definitely does), it will drop the pages.

Receptional Andy

 11:41 am on Jun 21, 2008 (gmt 0)

Google is pretty 'loose' with its interpretation of server errors - it will keep trying for some significant time to guard against temporary errors. However, if it repeatedly gets a 404, it will drop the page eventually, so I would try to get to the bottom of this problem ASAP.


 11:47 am on Jun 21, 2008 (gmt 0)

Could it even be a case of Google having a temporary 'glitch'?

Receptional Andy

 12:05 pm on Jun 21, 2008 (gmt 0)

Google having a temporary 'glitch'?

We're talking IT here, so anything's possible ;)

However it is highly unlikely in this instance - a 404 is a response from your server. Your server has supplied this information to Google, otherwise the error would be different.

Looking at the exact request by Google and your server's response in logs should show where the problem lies, in any case.
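
To see that distinction first-hand, here is a rough Python sketch using only the standard library. The helper name `probe` and its return labels are made up for illustration; it simply shows that a 404 only exists when a server actually answers, whereas an outage at the network level surfaces as a connection error with no status code at all.

```python
import http.client


def probe(host, path="/", port=80, timeout=10):
    """Send a HEAD request and report how the attempt ended.

    Returns ("http", status_code) when a server answered, even with a 404,
    or ("unreachable", error_name) when no connection could be made.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", path)
        # Reaching this point means something on the other end spoke HTTP.
        return ("http", conn.getresponse().status)
    except OSError as exc:
        # Refused connections, DNS failures and timeouts all end up here.
        return ("unreachable", type(exc).__name__)
    finally:
        conn.close()
```

Calling `probe("www.example.com")` against a working site should give an `("http", ...)` pair; a box that is switched off or firewalled should give `("unreachable", ...)` instead. This only reflects what your own machine sees at the moment you run it, not what Googlebot saw on the day.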


 12:14 pm on Jun 21, 2008 (gmt 0)

The old /index.asp /index.html /index.php bug recurs? Check your log files (access log) and filter for requests by Google on the 18th. Inspect exactly what was requested in order to show a 404.
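
If grepping by hand is a pain, something along these lines would do the filtering. It is only a sketch: it assumes the access log is in Apache's "combined" format (with the user-agent field), and the function name, the regex and the sample lines are all invented for illustration, not taken from anyone's real logs.

```python
import re

# Apache "combined" format:
# host ident user [time] "METHOD path proto" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def googlebot_requests(lines, day, agent_token="Googlebot"):
    """Return (path, status) for requests on `day` whose user-agent contains
    `agent_token`. `day` uses Apache's date style, e.g. "18/Jun/2008"."""
    hits = []
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip lines that are not in the expected format
        agent = m.group("agent") or ""
        if agent_token in agent and m.group("time").startswith(day):
            hits.append((m.group("path"), int(m.group("status"))))
    return hits


# Made-up sample lines, purely to show the shape of the input.
SAMPLE_LOG = [
    '192.0.2.10 - - [18/Jun/2008:04:12:09 +0000] "GET /index.html HTTP/1.1" 404 209 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.10 - - [18/Jun/2008:09:30:41 +0000] "GET / HTTP/1.1" 200 3444 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '198.51.100.7 - - [18/Jun/2008:10:02:00 +0000] "GET /missing HTTP/1.1" 404 209 "-" "Mozilla/4.0 (compatible; MSIE 7.0)"',
    '192.0.2.10 - - [15/Jun/2008:02:00:00 +0000] "GET / HTTP/1.1" 200 3444 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]
```

In practice you would pass an open log file instead of `SAMPLE_LOG`, e.g. `googlebot_requests(open(path_to_log), "18/Jun/2008")`. Bear in mind the timestamps are in whatever timezone the server logs in, and that matching on the user-agent string alone will also pick up anything merely claiming to be Googlebot.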


 1:36 pm on Jun 21, 2008 (gmt 0)

I contacted my hosts, who told me that on June 18 there was a server outage and, because of this, my site was down for a few hours.

Could this be what caused the problem? If so, should the 404 response eventually sort itself out on Google Webmaster Tools?


 9:39 pm on Jun 21, 2008 (gmt 0)

It may take a week or two for the error message to sort itself out, as the data is not updated all that often in WMT.

Receptional Andy

 10:00 pm on Jun 21, 2008 (gmt 0)

If your site is unavailable, Google will be unable to connect, and will get an error (I think they call it 'network unreachable').

a server outage

IMO this seems like an inadequate response from a hosting company. There are very few 'server outages' that will cause a server to issue a 404 response.

Again: check your server logs, which will give you a clearer picture of what happened to your homepage.


 10:24 pm on Jun 21, 2008 (gmt 0)

Lee, you're not alone. I also have the same problem and interestingly enough, it also happened on the 18th. The homepage is a static html page served on a dedicated server and the 404 error does not show up in the server log files for anybody, much less Googlebot. I have failover service so if there was an outage I would know about it immediately.

The first homepage 404 error occurred on the 13th. Now it shows the 18th. Indexed URLs dropped from roughly 24000 to a little over 4000. In my case, it hasn't affected search listings or rankings (yet); however, something is going on that needs attention.

[edited by: Key_Master at 10:25 pm (utc) on June 21, 2008]


 11:31 pm on Jul 19, 2008 (gmt 0)

However it is highly unlikely in this instance - a 404 is a response from your server. Your server has supplied this information to Google, otherwise the error would be different.

Maybe the error is a Google bug. Have a look here: [webmasterworld.com...]


 2:39 am on Jul 20, 2008 (gmt 0)

404 error does not show up in the server log files for anybody, much less Googlebot.

That sounds odd to me - the web is chock full of bad requests. Are you sure your server is actually logging 404 errors?


 3:11 am on Jul 20, 2008 (gmt 0)

Bad requests for pages that do not exist are one thing, 404 errors for valid, static pages (like the home page) are another thing. Anyhow, there are dozens of similar reports in Google Groups. Google is aware of the problem and presumably working on it.


 4:35 am on Jul 20, 2008 (gmt 0)

Just to add yet another data point, GWT puked all over one of my big sites, too. Lots of "Not Found" errors reported in GWT, no 404s logged on the server except for those caused by script-exploit 'bots looking for rpc-this-and-that.dll, and the server has been solid for months.

This is apparently just a break in the 'data feed' into GWT itself, because none of the pages --404 or otherwise-- have budged in the SERPs, and no other ill effects have been noted. This is true for both the main domain (HTML), and a mobile-device-specific site on a subdomain (XHTML+XML/Mobile), so two different crawlers (Googlebot, and Googlebot-Mobile) would have to be broken if it were a true crawler issue.

My GWT report updated just a few hours ago, and it looks like ~80% of these bogus errors were cleared.

I'm basically ignoring the "Not Found" report until they fix it... But it is rather annoying.



 4:50 am on Jul 20, 2008 (gmt 0)

Hey Jim, check your logs. I'm curious to see if Googlebot-Mobile is following 301 redirects on your site.

I think it's Googlebot-Mobile that is broken. It doesn't seem to be following 301 redirects. Maybe it's mistaking 301s for 404s. I think standard Googlebot is working fine; however, if it doesn't recrawl pages that Googlebot-Mobile thinks are 404s, they will drop from the index.


 5:20 am on Jul 20, 2008 (gmt 0)

Hmmm... Well, there are no 301 redirects implemented on the mobile site, since it's on a single subdomain and was designed only two years ago -- after I'd already made most of my URL-architecture mistakes. :)

And I don't see any 301 redirects encountered by Googlebot-Mobile on the main (non-mobile) domain either, since "wrong" links to the site are very few and far between (The main site was 301-canonicalized at birth, so it's rare to ever encounter a non-canonical link).

So... I have no data, except that both sites are showing bogus "Not Found" errors in GWT, and neither seems to have fed any Googlebot a 301 (or a 404, or a 500) in the past month. It's been all 200 and 304 responses here... Frankly, I think maybe a fail-to-connect-to-database error in the GWT 'evaluator' must be handled like a crawled 404 or something... ;)



 5:39 am on Jul 20, 2008 (gmt 0)

It's not confined to GWT- I can confirm it's dropping some URLs from the index. The 404 errors seem to go away when the URLs are reindexed by standard Googlebot.

High level pages are much less likely to be dropped from the index due to this 404 issue but deeper level pages with less page rank are being scrubbed from the index.

The main site was 301-canonicalized at birth, so it's rare to ever encounter a non-canonical link

Lucky you :) I have dmoz listings pointing to the www.example.com and the example.com domains so no such luck for me.


 2:17 pm on Jul 20, 2008 (gmt 0)

I'll keep an eye on the logs for this behavior, but with limited 301s I'm not sure I'll be able to collect any useful info on what you're seeing.

> dmoz listings

Hmm... Plural... I'd say Lucky you! ;)



 3:38 pm on Jul 20, 2008 (gmt 0)

Since you have limited 301 redirects in your logs, can you connect the 304s to the same files that show 404 errors in GWT?

Also, does anybody else see 404 errors for valid pages also showing up in the Site Diagnostic feature in Google Adsense?
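
For anyone wanting to do that cross-check without eyeballing it, a small sketch of the idea follows. It assumes you already have a list of the URLs that GWT reports as "Not Found" and a list of (path, status) pairs pulled from your own access log for Googlebot; the function name and both inputs are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def statuses_for_reported(reported_urls, served):
    """Map each URL reported as 'Not Found' to the sorted list of status
    codes the server actually logged for that path.

    `served` is an iterable of (path, status) pairs taken from the log.
    Only the path is compared; query strings and hostnames are ignored.
    """
    seen = defaultdict(set)
    for path, status in served:
        seen[path].add(status)

    report = {}
    for url in reported_urls:
        path = urlsplit(url).path or "/"
        # An empty list means the log holds no request at all for this path.
        report[url] = sorted(seen.get(path, ()))
    return report
```

A reported URL that comes back with only 200s and 304s in the log, and no 404, would fit the "bogus error" pattern described in this thread; an empty list just means that path was never requested in the period you looked at.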


 4:46 pm on Jul 20, 2008 (gmt 0)

I have a URL that shows as a 404 Error in the Crawl Errors report in WMT. The 404 URL is a simple typo on some other site, so it is pointing to the wrong URL for the required page of content.

The site then had a 301 redirect added a few days after that error message appeared in WMT. This was to catch any traffic following the duff link and redirect it to the correct page.

A month later, WMT continues to show that URL as a 404 Error in the Crawl Errors report, even after the page containing the duff link has been crawled again.

Longer discussion is in: [webmasterworld.com...]


 5:51 pm on Jul 21, 2008 (gmt 0)

I have a page that's showing up with a 4xx error as of this weekend - won't even tell me what the error code is so I can diagnose it.

I had the 404 homepage error last week when this thread was started but it has since gone away. Oops, this thread was started last MONTH...

[edited by: BradleyT at 5:52 pm (utc) on July 21, 2008]


 6:18 pm on Jul 21, 2008 (gmt 0)

Happens to me all the time. If the pages do work, just ignore it. The bots will come around again and fix that...

