|Updated all my soft 404s in WMT to 404s. Got a warning message.|
Did I do something wrong?
I found about 400 or so soft 404s in my WMT, and I had my developer write code that would turn all the soft 404s (which were simply returning a 200 code and redirecting to a 'catch all' page) into real 404s, to tell Google that these pages are gone now and not to come back and crawl them again.
I got a message in my WMT this morning that reads:
Google detected a significant increase in the number of URLs that return a 404 (Page Not Found) error. Investigating these errors and fixing them where appropriate ensures that Google can successfully crawl your site's pages.
Is this what normally happens, or have I made some critical error? We followed the simple instructions provided in a link through WMT about how to handle soft 404s correctly. I feel we did it correctly; I just wanted a second opinion from someone who's done this recently.
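For context, the change the developer made boils down to answering those removed URLs with a real 404 status instead of a 200 plus a catch-all page. A minimal sketch of the difference as a tiny WSGI handler (the paths and app structure are illustrative, not the site's actual code):

```python
# Sketch: serve a real 404 for removed pages instead of a "soft 404"
# (200 OK + catch-all content). Paths here are hypothetical.
REMOVED_PATHS = {"/old-product-1", "/old-product-2"}  # made-up removed URLs

def app(environ, start_response):
    """Tiny WSGI app: real 404 for removed pages, 200 for everything else."""
    path = environ.get("PATH_INFO", "/")
    if path in REMOVED_PATHS:
        # Before the fix this branch returned 200 and a catch-all page,
        # which is exactly what WMT flags as a soft 404.
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Page not found"]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Normal page content"]
```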
You're fine. As with so many crawl "errors", g### is simply telling you that it found something that you may or may not have intended.
But do go back to your developer and have them replace the former 200 rewrite (or was it a 30x redirect?) with a 410. It sounds as if they just deleted a line of code instead.
Using 410 instead of 404 sends the message that you know the pages are missing, they're supposed to be missing, you removed them on purpose, and in a few years Google will stop asking for them.
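Swapping to 410 is a one-word change in the status line. A hedged sketch in the same WSGI style (URLs are made up):

```python
# Sketch: 410 Gone for pages that were removed on purpose, 404 otherwise.
GONE_PATHS = {"/discontinued-widget"}  # hypothetical deliberately deleted URLs

def app(environ, start_response):
    path = environ.get("PATH_INFO", "/")
    if path in GONE_PATHS:
        # 410 tells the crawler: this existed, it was removed intentionally,
        # and it is not coming back.
        start_response("410 Gone", [("Content-Type", "text/plain")])
        return [b"This page has been permanently removed."]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"Page not found"]
```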
Google claims 404s can't hurt your site
That's weird. My 404s went from 700-something to 9,000-something in a few hours because of bad programming, and I received no messages. After a few months I decided to turn the 404s into 410s... It worked previously, so I hope it will work now too to remove those annoying errors from WMT.
From what I understand so far, a 404 is a sensitive thing. If at some point something linked to the URL, even for a few seconds (my case), then you shouldn't let the 404 sit there for long. If the page is gone, return 410; if the URL is incorrect, 301 it; and if it's a technical glitch, just fix it. But don't let 404s linger.
A plain 404 confuses the bot: all it can tell is that the page isn't there, not what happened to it. Is it gone for good? Is it coming back? Is it going to redirect to something else?
So if the bot has ever reached a 404 page by following a link, you should take action.
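The triage above can be sketched as a small dispatcher. All the URLs and redirect targets here are hypothetical, just to show the three branches:

```python
# Sketch of the triage rules: removed on purpose -> 410,
# wrong/changed URL with a known home -> 301, everything else -> 404.
GONE = {"/deleted-page"}             # pages intentionally removed
MOVED = {"/old-url": "/new-url"}     # bad URLs mapped to their correct target

def triage(path):
    """Return (status_line, redirect_target) for a requested path."""
    if path in GONE:
        return ("410 Gone", None)
    if path in MOVED:
        return ("301 Moved Permanently", MOVED[path])
    return ("404 Not Found", None)
```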
The "error message" appears when a large number of pages suddenly change status in a short time. As the change is what you intended, you can either ignore it or change the 404 status to 410 for those URLs.
I often use a simple PHP script to detect when bots request old pages that now return 404 and update a database entry for the URL such that those URLs then return 410 Gone for all future requests.
When Google finds a URL that returns 404, they test it once or twice more in the next 24 to 48 hours then do not revisit it for weeks or months. You have to be very quick to change the status from 404 to 410 (or whatever it should be) in order for Google to see it.
I only ever return 410 for URLs that used to exist, i.e. URLs that have returned content with 200 OK status at some time in the past. If the requested URL has never existed, I leave it returning 404 status forever.
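The script I described boils down to: record a retired URL along with whether it ever served real content, then answer 410 for URLs that once existed and 404 for URLs that never did. A rough Python stand-in (sqlite in place of the real database; table and column names are made up):

```python
import sqlite3

# In-memory stand-in for the real database of retired URLs.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE retired (path TEXT PRIMARY KEY, existed INTEGER)")

def mark_retired(path, existed_before):
    """Record a URL that now errors; existed_before=True if it once served 200 OK."""
    db.execute("INSERT OR REPLACE INTO retired VALUES (?, ?)",
               (path, int(existed_before)))
    db.commit()

def status_for(path):
    """410 Gone for URLs that used to exist, 404 Not Found for ones that never did."""
    row = db.execute("SELECT existed FROM retired WHERE path = ?",
                     (path,)).fetchone()
    if row and row[0]:
        return "410 Gone"
    return "404 Not Found"
```

The point of the lookup is the policy in my last paragraph: never-existed URLs stay 404 forever, while former real pages get flipped to 410 quickly enough for the bot's one or two follow-up requests to see it.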
You will occasionally get that message in GWT when your server is down for an hour or more, depending on how much traffic is "normal" for your site. It seems to be an automated message that is triggered when some number of pages return 404 over some period of time, compared with the norm for your site.
I've received this message and, in my case, there was no loss in traffic when the server was brought back online (4 hours).
Thanks for the responses, everyone. I just got another of the same messages; the count keeps going up.
I won't worry about it right now; Google will likely find many more. I'm confident they're handled correctly at this point. Although I do suppose the more correct response code would be 410, I think we'll leave it at 404 for now, as I am averse to change after change after change.
On a side note, our crawl rate increased to about 4x the usual daily rate. Not sure if it's in direct correlation with this, but I'm assuming so for now, unless I see a lot of reports of others experiencing a crawl increase due to an upcoming update.
I've worked on a site with 1,500 valid URLs returning 200 OK status and real content; more than 7,000 URLs for 1,000 deleted products that all return 410 Gone; about 60,000 ex-duplicate-content URLs and old URLs-with-parameters that all now 301 redirect to the matching new content URLs; and about 700 never-existed URLs (reported in WMT) that return 404 Not Found.
Once the technical stuff was fixed, it took Google six months to figure it all out and they seem quite happy with it all now.