
HTML Forum

rel=canonical and 404 header response
Am I forcing Google into an infinite 404 loop?

 4:05 pm on Apr 27, 2012 (gmt 0)

Hope someone can shed some light on this...

A while ago, I noticed our pages that should have been returning a 404 header response were actually 302ing to a custom 404 page first. I've since fixed this with PHP to work out whether or not the requested URL should return a page - if it shouldn't, I return a 404 response and include the custom 404 text with PHP.
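The fix described above can be sketched roughly as follows. The poster's site uses PHP; this is a language-neutral illustration in Python, and the names `KNOWN_PAGES`, `CUSTOM_404_BODY`, and `respond` are hypothetical, not taken from the original code. The point is that the 404 status is sent on the requested URL itself, with the custom error text included in the same response and no redirect.

```python
from http import HTTPStatus

# Hypothetical set of paths the site actually serves (an assumption for this sketch).
KNOWN_PAGES = {"/", "/about.html", "/contact.html"}

# Hypothetical custom error text, included inline rather than redirected to.
CUSTOM_404_BODY = "<h1>Page not found</h1>"


def respond(path, known_pages=KNOWN_PAGES):
    """Return (status_code, headers, body) for a requested path."""
    headers = {"Content-Type": "text/html; charset=utf-8"}
    if path in known_pages:
        return HTTPStatus.OK.value, headers, f"<h1>Content for {path}</h1>"
    # Unknown URL: answer 404 directly on the requested URL.
    # No Location header is set, so there is no 302 hop to a separate error page.
    return HTTPStatus.NOT_FOUND.value, headers, CUSTOM_404_BODY


if __name__ == "__main__":
    status, headers, body = respond("/page_that_doesnt_exist.html")
    print(status, "Location" in headers)
```

With the earlier setup, the same request would have produced a 302 followed by a 200 on the error page, which is the pattern GWT reports as a soft 404.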

Since doing this, reports of soft 404s in GWT have retreated to 0 but, naturally, 404 errors have skyrocketed.

This doesn't bother me, because the pages shouldn't have existed in the first place - and as long as we're not LINKING to any of those internally, Google should eventually give up on them and play nice.

However, I'm now concerned because Google is reporting that the pages returning a 404 header response are in fact linking to themselves,
e.g. mywebsite.com/page_that_doesnt_exist.html returns a 404 but - according to GWT - is being linked to from mywebsite.com/page_that_doesnt_exist.html

The only link I can see on the resulting page is the rel=canonical.

So my question is - Is Google ever going to give up on this page? Or is the fact that I'm generating the 404 response AFTER directing Google to the 'appropriate' canonical URL, forcing Google to attempt to index the page again and treating it as an internal link?

Hope that makes sense to someone!



 6:12 pm on Apr 27, 2012 (gmt 0)

So my question is - Is Google ever going to give up on this page?

No. Google never forgets a URL. They will slow down after a while, though.

And yes, I've got 404s that are linked either from nowhere at all-- at least, nowhere that g### will admit to-- or only from pages that are themselves 404s. Or 410. Sigh.


 11:08 pm on Apr 27, 2012 (gmt 0)

I wouldn't include the rel="canonical" tag on the 404 page.


 12:23 am on Apr 28, 2012 (gmt 0)

Also make sure your custom error page is specified as a relative URL and not an absolute URL that specifies the hostname.
i.e. THIS:
ErrorDocument 404 /path/to/custom-404.php

NOT THIS:
ErrorDocument 404 http://example.com/path/to/custom-404.php
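The reason this matters: when ErrorDocument is given a full URL, Apache sends the client a redirect to that URL instead of serving the error document with the original status, so the crawler never sees the 404. A small sketch of a check for this, assuming one directive per line of plain text; the function name `error_document_redirects` is made up for illustration and is not part of Apache or any library.

```python
import re

# Matches a leading URL scheme such as "http://" or "https://".
_SCHEME = re.compile(r"^[a-z][a-z0-9+.-]*://", re.IGNORECASE)


def error_document_redirects(directive):
    """Return True if an ErrorDocument directive points at a full URL.

    A full URL makes the server redirect the client, which hides the
    original error status. A local path such as /path/to/custom-404.php
    is served with the error status kept.
    """
    parts = directive.split(None, 2)
    if len(parts) != 3 or parts[0].lower() != "errordocument":
        raise ValueError(f"not an ErrorDocument directive: {directive!r}")
    target = parts[2].strip()
    return _SCHEME.match(target) is not None


if __name__ == "__main__":
    for line in (
        "ErrorDocument 404 /path/to/custom-404.php",
        "ErrorDocument 404 http://example.com/path/to/custom-404.php",
    ):
        print(line, "->", error_document_redirects(line))
```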


 3:03 pm on Apr 28, 2012 (gmt 0)

That's interesting Lucy, not just me then.

g1 - it's not a 404 page - as I mentioned, that method had Google complaining of soft 404s. Therefore unrecognised URLs now return a 404 response and -include- my custom 404 text. There is no redirect whatsoever. Because of this, if I removed the canonical, it wouldn't appear on recognised URLs either :/
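One way to reconcile the two posts above, since the same template serves both recognised and unrecognised URLs, is to emit the canonical tag conditionally on the response status. This is only a sketch in Python (the site itself uses PHP); `render_head` and its parameters are hypothetical, and nothing in the thread confirms that this clears the self-link report in GWT.

```python
from html import escape


def render_head(status, canonical_url=None):
    """Build the <head> tags for a page from a shared template.

    The canonical link is emitted only for 200 responses, so a 404
    response carries the custom error text but no rel=canonical
    pointing back at the missing URL.
    """
    tags = ['<meta charset="utf-8">']
    if status == 200 and canonical_url:
        href = escape(canonical_url, quote=True)
        tags.append(f'<link rel="canonical" href="{href}">')
    return "\n".join(tags)


if __name__ == "__main__":
    print(render_head(200, "http://mywebsite.com/about.html"))
    print(render_head(404, "http://mywebsite.com/page_that_doesnt_exist.html"))
```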


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved