Forum Moderators: Robert Charlton & goodroi

Sitemap crawl reports 404 for pages with 200 header

watercrazed

3:45 pm on Mar 13, 2006 (gmt 0)

10+ Year Member



I realize I may just be obsessing here, but after 2 weeks in supplemental hell I feel the need to follow up on any possible issue.

I recently did quite a bit of work on my sitemap. I submitted it and got heavily crawled on the 10th and 11th. I just checked the sitemap reports in my Google account, and they show 10 to 15 pages with 404 errors. When I click a link listed in the report, it takes me to a good page. I also checked the header status with a well-known tool, and it returns a status of 200.

Any ideas about what is going on here?

Vadim

4:36 am on Mar 14, 2006 (gmt 0)

10+ Year Member



It may be related to instability of your host (server) or of the connection between your host and Google. For example, if the server is overloaded, it may either not respond at all or take too long to respond.

I have my own micro-crawler that I use to check the links on my site. Sometimes it reports errors such as "cannot get content" or "cannot get content type". When I check manually or crawl again, I don't see these errors.

I suspect that a regular browser tries several times to connect, and that many hosting providers know this and do not care if the content is not available on the first try. Googlebot, however, probably does not have time to retry.
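
As a rough illustration of that retry hypothesis, here is a minimal Python sketch of a checker that records what each attempt returns, so an intermittent failure shows up instead of being hidden by a later success. It is not the micro-crawler mentioned above and says nothing about how Googlebot actually behaves; the names fetch_status and check_with_retries, and the attempts and delay parameters, are made up for this example.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code for a HEAD request, or None on a network failure."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 503.
        return err.code
    except (urllib.error.URLError, OSError):
        # No usable answer at all: timeout, refused connection, DNS failure.
        return None


def check_with_retries(url, attempts=3, delay=1.0, fetch=fetch_status):
    """Try a URL up to `attempts` times; return every status seen, in order.

    Stops early on the first 200. `fetch` is injectable (an assumption made
    for this sketch) so the retry logic can be exercised without a network.
    """
    seen = []
    for i in range(attempts):
        status = fetch(url)
        seen.append(status)
        if status == 200:
            break
        if i < attempts - 1:
            time.sleep(delay)
    return seen
```

A result such as [None, 200] would mean the first attempt got no answer and the second succeeded, which is the kind of flakiness a single-shot crawler would report as an error while a manual check in a browser looks fine.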

I also sometimes see errors in my sitemap report. However, it is usually only one link, to a very rarely fetched page (forbidden in robots.txt). I suspect that my host does not cache it properly because it is seldom requested and is in robots.txt.

I have noticed that after Google reports the error in the sitemap, the page is OK for the next several weeks and then appears in the errors again.

I actually have a very reliable host, one that is mentioned by Matt Cutts in [mattcutts.com...] So other hosts may be even less optimized.

Vadim.