rel=canonical and 404 header response

Am I forcing Google into an infinite 404 loop?

     
4:05 pm on Apr 27, 2012 (gmt 0)

New User

joined:Apr 27, 2012
posts: 2
votes: 0


Hope someone can shed some light on this...

A while ago, I noticed our pages that should have been returning a 404 header response were actually 302ing to a custom 404 page first. I've since fixed this with PHP to work out whether or not the requested URL should return a page - if it shouldn't, I return a 404 response and include the custom 404 text with PHP.

Since doing this, reports of soft 404s in GWT have retreated to 0 but, naturally, 404 errors have skyrocketed.

This doesn't bother me, because the pages shouldn't have existed in the first place - and as long as we're not LINKING to any of those internally, Google should eventually give up on them and play nice.

However, I'm now concerned because Google is reporting that the pages returning a 404 header response are in fact linking to themselves,
e.g. mywebsite.com/page_that_doesnt_exist.html returns a 404 but, according to GWT, is being linked to from mywebsite.com/page_that_doesnt_exist.html

The only link I can see on the resulting page is the rel=canonical tag.

So my question is - Is Google ever going to give up on this page? Or does generating the 404 response AFTER pointing Google at the 'appropriate' canonical URL force Google to attempt to index the page again and treat the canonical as an internal link?

Hope that makes sense to someone!
6:12 pm on Apr 27, 2012 (gmt 0)

lucy24 (Senior Member, US)

joined:Apr 9, 2011
posts:13210
votes: 347


"So my question is - Is Google ever going to give up on this page?"

No. Google never forgets a URL. They will slow down after a while though.

And yes, I've got 404s that are linked either from nowhere at all-- at least, nowhere that g### will admit to-- or only from pages that are themselves 404s. Or 410. Sigh.
11:08 pm on Apr 27, 2012 (gmt 0)

g1smd (Senior Member)

joined:July 3, 2002
posts:18903
votes: 0


I wouldn't include the rel="canonical" tag on the 404 page.
12:23 am on Apr 28, 2012 (gmt 0)

phranque (Administrator)

joined:Aug 10, 2004
posts:10553
votes: 13


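also make sure your custom error page is specified as a local path starting with a slash and not as an absolute url that includes the hostname. with the absolute form, Apache sends the client a redirect to the error page, so the requested url never returns the 404 status itself.
i.e. THIS: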
ErrorDocument 404 /path/to/custom-404.php

NOT:
ErrorDocument 404 http://example.com/path/to/custom-404.php
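
One way to check which behaviour a server actually has is to request a URL that should not exist and look at the first response without following redirects. The following is only a rough Python sketch: `fetch_status` and `classify` are made-up names, and the host and path are whatever you choose to test. A redirect status on the first hop would suggest the absolute-URL form of ErrorDocument (or some other redirect) is in play.

```python
import http.client
from typing import Optional, Tuple


def fetch_status(host: str, path: str) -> Tuple[int, Optional[str]]:
    """Request `path` from `host` and return (status, Location header).

    http.client does not follow redirects, so this is the first
    response the server sends for the requested URL.
    """
    conn = http.client.HTTPConnection(host, timeout=10)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def classify(status: int, location: Optional[str] = None) -> str:
    """Describe what the first response for a missing URL means."""
    if status in (404, 410):
        return "hard not-found: the error status is sent for the requested URL"
    if status in (301, 302, 303, 307, 308):
        return f"redirect to {location}: the requested URL itself never returns 404"
    if status == 200:
        return "200 for a missing page: likely to be reported as a soft 404"
    return f"unexpected status {status}"
```

Usage would be along the lines of `classify(*fetch_status("example.com", "/page_that_doesnt_exist.html"))`, run against your own hostname.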
3:03 pm on Apr 28, 2012 (gmt 0)

New User

joined:Apr 27, 2012
posts: 2
votes: 0


That's interesting, Lucy. Not just me, then.

g1 - it's not a separate 404 page. As I mentioned, that method had Google complaining of soft 404s, so unrecognised URLs now return a 404 response and -include- my custom 404 text. There is no redirect whatsoever. Because of this, if I removed the canonical, it wouldn't appear on recognised URLs either :/
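
Since the same script serves both recognised and unrecognised URLs, one possible way to follow g1smd's suggestion is to emit the canonical tag conditionally, only when the response status is 200. What follows is a minimal sketch in Python rather than the PHP the site uses; `render_head`, `respond`, the `known_pages` lookup and the `http://` scheme are all assumptions made for illustration, not the poster's actual code.

```python
from html import escape
from typing import Dict, Optional, Tuple


def render_head(title: str, status: int, canonical_url: Optional[str]) -> str:
    """Build <head> markup, adding rel=canonical only for 200 responses."""
    parts = [f"<title>{escape(title)}</title>"]
    if status == 200 and canonical_url:
        href = escape(canonical_url, quote=True)
        parts.append(f'<link rel="canonical" href="{href}">')
    return "\n".join(parts)


def respond(path: str, known_pages: Dict[str, str]) -> Tuple[int, str]:
    """Return (status, head_html) for a requested path.

    known_pages maps recognised paths to page titles (hypothetical lookup).
    """
    if path in known_pages:
        # Recognised URL: 200 plus a canonical pointing at itself.
        canonical = "http://mywebsite.com" + path  # scheme is an assumption
        return 200, render_head(known_pages[path], 200, canonical)
    # Unrecognised URL: 404 status, custom text, no canonical tag.
    return 404, render_head("Page not found", 404, None)
```

With this shape, recognised URLs keep their canonical tag while the 404 responses carry no link back to themselves.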