
Google SEO News and Discussion Forum

    
Purposely returning 200 for Page Not Found instead of 404
dstiles




msg:3872474
 5:15 pm on Mar 17, 2009 (gmt 0)

The intention here is to help customers who may have mis-spelt the filename or hit an out-of-date page.

Instead of a 404, the site returns a 200 (or, in the case of a recent test, a 301, since the page was in cache).

Google WMT is complaining about this. Will this affect the SERP rating in any way? I don't see why it should, as it's for the customer's benefit, not google's. I can falsify the return for google if necessary, but I'm curious as to whether I need to.

 

tedster




msg:3872627
 7:34 pm on Mar 17, 2009 (gmt 0)

This practice, over time, can destroy your site's rankings. You are essentially telling Google to index the same content for an infinite number of URLs.

You can return a custom error page that is friendly to your visitors and still send a 404 in the server's http header.
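As a rough illustration of that idea, here is a minimal Python sketch using the standard library's WSGI helpers. The page list, the markup and the `fetch` helper are all hypothetical and only there to make the example self-contained; a real site would do the equivalent in its own server or scripting setup.

```python
from wsgiref.util import setup_testing_defaults

# Hypothetical pages that really exist on the site.
KNOWN_PAGES = {"/": "<h1>Home</h1>", "/products.html": "<h1>Products</h1>"}

# Hypothetical friendly error page shown to visitors.
FRIENDLY_404 = (
    "<h1>Sorry, we could not find that page</h1>"
    '<p>Try the <a href="/">home page</a> or the site menu.</p>'
)


def app(environ, start_response):
    """Minimal WSGI app: real pages get 200, anything else a friendly 404."""
    path = environ.get("PATH_INFO", "/")
    if path in KNOWN_PAGES:
        status, body = "200 OK", KNOWN_PAGES[path]
    else:
        # The body helps the human visitor; the status line tells
        # crawlers that the URL does not exist.
        status, body = "404 Not Found", FRIENDLY_404
    payload = body.encode("utf-8")
    start_response(status, [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(payload))),
    ])
    return [payload]


def fetch(path):
    """Call the app in-process and return (status, body) for a GET of path."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response)).decode("utf-8")
    return captured["status"], body
```

Calling `fetch("/prodcuts.html")` on this sketch gives the friendly page as the body together with a `404 Not Found` status, which is the combination described above.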

dstiles




msg:3872812
 11:23 pm on Mar 17, 2009 (gmt 0)

Stupid of me! I'd forgotten the canonical aspect. Thanks.

rainborick




msg:3872829
 11:59 pm on Mar 17, 2009 (gmt 0)

I've found that Google's custom 404 widget is actually pretty good at suggesting the proper URL in the case of file name misspellings.

jdMorgan




msg:3872863
 1:31 am on Mar 18, 2009 (gmt 0)

Sticking with strict server response code semantics, you can return either a 404-Not Found or a 301-Moved Permanently redirect to the correct URL in response to a mis-spelled type-in URL.
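A small Python sketch of that decision, assuming the site keeps a table of known misspellings. The paths and the `respond` function are hypothetical names used only for illustration.

```python
# Hypothetical data: pages that exist, and known misspellings
# mapped to the URL the visitor most likely wanted.
VALID_PATHS = {"/", "/contact.html", "/widgets.html"}
KNOWN_TYPOS = {
    "/contcat.html": "/contact.html",
    "/widget.html": "/widgets.html",
}


def respond(path):
    """Return (status_code, redirect_target_or_None) for a requested path."""
    if path in VALID_PATHS:
        return 200, None
    if path in KNOWN_TYPOS:
        # The correct URL is known: redirect permanently to it.
        return 301, KNOWN_TYPOS[path]
    # Nothing sensible to redirect to: say so honestly.
    return 404, None
```

Only requests with a known correct target are redirected; everything else gets the 404.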

(To avoid confusion, I believe that the "301" mentioned in the initial post was actually a 304-Not Modified, based on the "already in cache" qualification in that sentence.)

Jim

dstiles




msg:3872880
 2:32 am on Mar 18, 2009 (gmt 0)

You can only return the correct page if you can pre-guess all the spelling combinations. :)

In fact it IS a 301. I posted in a bit of a hurry: it should NOT be a 200 despite what google says it's seeing.

The 404 handler does its best to track down the requested file (replacing extensions mainly) and if that fails it goes for the home page. This technique was always good enough before but now google seems to be second-guessing what a 404 should be (presumably sending stupid URLs to see what comes back).
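For readers who want to picture this kind of handler, here is a hedged Python sketch of an extension-swapping lookup. It is not the actual handler on the site; the file list and extension list are made up, and the fallback here returns a 404 rather than going to the home page, in line with the advice given earlier in the thread.

```python
import posixpath

# Hypothetical list of files that exist on the site.
EXISTING = {"/index.html", "/about.html", "/prices.php"}

# Hypothetical extensions worth trying when a request misses.
ALTERNATE_EXTENSIONS = (".html", ".htm", ".php", ".asp")


def resolve_missing(path):
    """Try the same file name with other extensions.

    Returns (301, corrected_path) when a match exists,
    otherwise (404, None).
    """
    stem, _ext = posixpath.splitext(path)
    for ext in ALTERNATE_EXTENSIONS:
        candidate = stem + ext
        if candidate != path and candidate in EXISTING:
            return 301, candidate
    return 404, None
```

With this data, a request for `/about.htm` is redirected to `/about.html`, while a request for a name that matches nothing gets a 404.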

Thinking back to my earlier "stupid me" response, there is only a canonical issue if google's robots request invalid URLs. On this site I KNOW there aren't any removed pages in their index 'cause we haven't removed any, so they should not be concerned about such issues.

Shaddows




msg:3873048
 9:20 am on Mar 18, 2009 (gmt 0)

Well, if I were your competitor, I would be pointing as many incorrect links at your site as I could. So many duplicate pages (literally as many as I could get 'discovered') would make you look spammy, and kill all those 'hidden' ranking factors like Trust and Authority.

JS_Harris




msg:3873174
 12:13 pm on Mar 18, 2009 (gmt 0)

I'd like you as a competitor: while you scheme and plot against me, I ignore you and work on great content (hint hint).

When I notice an issue I fix it and move on; your efforts would then be wasted. :-)

Shaddows




msg:3873181
 12:25 pm on Mar 18, 2009 (gmt 0)

When I notice an issue I fix it and move on

Yep, but the problem is that any invalid URL returns an identical page with a 200 header status.

That is the problem needing fixing. Fix it and my suggestion is moot. Which was actually the point I was making. In fact, the OP in his previous post says:
there is only a canonical issue if google's robots request invalid URLs. On this site I KNOW there aren't any removed pages in their index 'cause we haven't removed any, so they should not be concerned about such issues

I was trying to point out that 404s are supposed to be served when a resource does not exist. Instead, every conceivable URL is returning a duplicate content 200.

I fully agree that the problem should be fixed and the OP move on. I was pointing out the problem should he just carry on as is.

dstiles




msg:3873729
 9:14 pm on Mar 18, 2009 (gmt 0)

Contrary to what I said in the thread's title, it's 301 not 200, as I noted here yesterday, so any dead links are redirected permanently to the home page.

g1smd




msg:3873795
 10:22 pm on Mar 18, 2009 (gmt 0)

Does it do that for "only links that used to work but no longer do", or for "all URLs that do not exist, even if they have never existed"?

If the latter, then that may well cause you problems. Search engines sometimes request random made-up URLs to test your 404 response. If those redirect, that may well confuse them.
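You can run the same kind of check against your own site. The Python sketch below is only an illustration of the idea, not a description of how any search engine actually does it; the function names are hypothetical. It requests one made-up URL without following redirects and reports what came back.

```python
import http.client
import uuid


def random_probe_path():
    """Build a path that almost certainly does not exist on any site."""
    return "/" + uuid.uuid4().hex + ".html"


def classify(status):
    """Interpret the status code returned for a URL that should not exist."""
    if status in (404, 410):
        return "ok"
    if 300 <= status < 400:
        return "redirects missing URLs"
    if 200 <= status < 300:
        return "soft 404"
    return "other"


def probe(host, port=80, timeout=10):
    """Request one made-up URL; http.client does not follow redirects."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", random_probe_path())
        return classify(conn.getresponse().status)
    finally:
        conn.close()
```

Running `probe("www.example.com")` (host name is a placeholder) against a site that redirects every unknown URL to the home page would report "redirects missing URLs" rather than "ok".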

suzukik




msg:3875406
 7:11 pm on Mar 20, 2009 (gmt 0)

Instead of a 404 the site returns 200

They are called "soft 404s". Google do not recommend using them because:
they can be a confusing experience for users and search engines.

[googlewebmastercentral.blogspot.com...]

dstiles




msg:3875579
 11:22 pm on Mar 20, 2009 (gmt 0)

g1smd - as far as I know there is no actual problem except that WMT notes that what google thinks should be a 404 returns a 200. I assume they test this using impossible URLs that they expect to return 404. Which frankly is no concern of theirs since the site is designed to retain customers not to play catch with google (yeah, I know!).

The fact that they are interpreting 301 as 200 is the interesting part. If they didn't send duff URLs in the first place it wouldn't do that anyway: it's for humans not dumb machines.

suzukik - as noted, it was actually a 301 not a 200, although the 200 obviously followed on from the successful 301.

Not sure why google thinks it's confusing to punters, since the punter has (generally) mis-typed a page name and gets the site s/he wants with a menu from which they can choose the correct page. Helpful rather than confusing, since otherwise they'd probably get the basic "That didn't work, what did you do wrong?" type of message, which IS confusing (which bit did I get wrong) and, to me, also VERY annoying.

What I read from your google URL is that google thinks it's confusing to their robot. So stop sending duff URLs.

It is entirely probable that the webmaster who sets up a 301 redirect to the home page for a duff page request is the kind of person who will ensure removed pages are treated with appropriate redirects, hence helping the visitor. People who do not do that almost certainly don't have a clue about setting up 301's in the first place so they just issue 404s.

g1smd




msg:3875582
 11:31 pm on Mar 20, 2009 (gmt 0)

If the URL is incorrect, a 404 response should be returned in the HTTP header. That's in the HTTP specs.

You can show whatever content you want for that request, but the HTTP Status Code should be 404.

That is, there should not be a 3xx-numbered redirect returned for such a request.

If you play fast and loose with the HTTP specs, don't be surprised if user agents that do follow the specs get confused by your site and take whatever damage-control measures they feel like taking to protect their own systems, such as avoiding sites that appear to be bot traps or to have infinite duplicate content.

fishfinger




msg:3875805
 10:17 am on Mar 21, 2009 (gmt 0)

If they didn't send duff URLs in the first place it wouldn't do that anyway: it's for humans not dumb machines.

Humans get urls wrong too. And as g1smd says, Google will run forms and request urls to see what happens. Google will only index a certain amount of content from a site and seems happy to fill up your quota with non-existent pages with dupe/no content at the expense of proper content.

I've always fixed this as part of a site overhaul/optimisation so I can't say 100% that in itself it affects rankings (i.e. cleaned up non-existent pages in isolation and watched rankings improve), but my gut feeling is that it doesn't help the site's overall profile and if you don't have enough IBLs/PR to get all your pages indexed then you are definitely missing out on traffic.

bumpski




msg:3875817
 10:49 am on Mar 21, 2009 (gmt 0)

Return a 404 in the response header and show your site map page as the content. In fact make sure you return a 404; redirecting to your sitemap page and returning a 200 is a big mistake.

It's the perfect use of a well designed site map page. And your visitor sees one more of your site's well designed pages (and ads).

Then your user can "text search" your site map page if need be to find the page they truly want.

dstiles




msg:3876102
 9:44 pm on Mar 21, 2009 (gmt 0)

Ok, guys. Thanks for the feedback.

I can't say SERPS has ever been a problem with this technique.

The design of the 404 is ancient and was set up following advice elsewhere, back when google was a twinkle in the Creator's Eye. It's well overdue for a redesign. Trouble is, some of the sites are equally ancient and their owners ain't paying maintenance for 'em. They're a tight-wad lot, customers. :(

johnnie




msg:3876175
 1:07 am on Mar 22, 2009 (gmt 0)

The canonical issue is easily resolved using the rel=canonical tag. You could also return a 404 and offer suggestions (or return search results based on the URL) on your 404 page.
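One possible way to build such suggestions, sketched in Python with the standard `difflib` module. The list of site paths and the function names are hypothetical, and the similarity cutoff of 0.6 is simply difflib's default, not a tuned value.

```python
import difflib
import html

# Hypothetical list of real paths on the site.
SITE_PATHS = ["/index.html", "/contact.html", "/products.html", "/about.html"]


def suggest(requested, limit=3):
    """Return up to `limit` existing paths that look like the requested one."""
    return difflib.get_close_matches(requested, SITE_PATHS, n=limit, cutoff=0.6)


def not_found_response(requested):
    """Build (status_code, html_body) for a missing page, with suggestions."""
    links = "".join(
        '<li><a href="{0}">{0}</a></li>'.format(html.escape(path, quote=True))
        for path in suggest(requested)
    )
    body = "<h1>Page not found</h1>"
    if links:
        body += "<p>Did you mean:</p><ul>" + links + "</ul>"
    # The status stays 404 even though the page tries to be helpful.
    return 404, body
```

A request for `/contcat.html` would get a 404 whose body links to `/contact.html`; a request that resembles nothing on the site gets the plain "Page not found" body.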

dstiles




msg:3876629
 10:58 pm on Mar 22, 2009 (gmt 0)

The whole issue is easy even without the canonical tag. It's time that's the issue here, since a) I manage a lot of sites and b) I have no idea off-hand which have this (potential) problem and c) there seem to be far fewer hours in each day than there used to be.

fishfinger




msg:3877660
 3:03 pm on Mar 24, 2009 (gmt 0)

The canonical issue is easily resolved using the rel=canonical tag

Google don't *guarantee* that using this new tag will sort out such issues. I wouldn't rely on that myself.

g1smd




msg:3878676
 6:51 pm on Mar 25, 2009 (gmt 0)

For many problems the canonical tag isn't the easiest method to implement, nor is it the most effective.
