

HTML Forum

    
rel=canonical and 404 header response
Am I forcing Google into an infinite 404 loop?
asheridan



 
Msg#: 4446584 posted 4:05 pm on Apr 27, 2012 (gmt 0)

Hope someone can shed some light on this...

A while ago, I noticed that our pages that should have been returning a 404 header response were actually 302ing to a custom 404 page first. I've since fixed this in PHP: the script works out whether or not the requested URL should return a page and, if it shouldn't, it returns a 404 response and includes the custom 404 text directly.
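To make the fix described above concrete, here is a minimal sketch of the same decision logic. The poster's site uses PHP and its code is not shown in the thread, so this Python version is only an illustration: `KNOWN_PATHS`, `CUSTOM_404_TEXT` and `respond` are hypothetical names, and a real site would look the URL up in its own routing or database.

```python
from http import HTTPStatus

# Hypothetical set of paths the site actually serves (assumption for this sketch).
KNOWN_PATHS = {"/", "/about.html", "/contact.html"}

# Hypothetical custom 404 text, included in the body of the 404 response itself.
CUSTOM_404_TEXT = "Sorry, that page does not exist."


def respond(path, known_paths=KNOWN_PATHS):
    """Return (status_code, headers, body) for a requested path."""
    if path in known_paths:
        return HTTPStatus.OK.value, {}, f"Content for {path}"
    # Unknown URL: answer with 404 on the requested URL itself.
    # No Location header is set, so there is no 302 hop to a separate error page.
    return HTTPStatus.NOT_FOUND.value, {}, CUSTOM_404_TEXT


if __name__ == "__main__":
    print(respond("/about.html"))
    print(respond("/page_that_doesnt_exist.html"))
```

The point of the sketch is that the 404 status and the custom text travel in the same response, which is what separates a real 404 from the earlier redirect to an error page.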

Since doing this, reports of soft 404s in Google Webmaster Tools (GWT) have retreated to 0 but, naturally, 404 errors have skyrocketed.

This doesn't bother me, because the pages shouldn't have existed in the first place - and as long as we're not LINKING to any of those internally, Google should eventually give up on them and play nice.

However, I'm now concerned because Google is reporting that the pages returning a 404 header response are in fact linking to themselves, e.g. mywebsite.com/page_that_doesnt_exist.html returns a 404 but - according to GWT - is being linked to from mywebsite.com/page_that_doesnt_exist.html

The only link I can see on the resulting page is the rel=canonical.

So my question is - Is Google ever going to give up on this page? Or, because I'm generating the 404 response AFTER pointing Google to the 'appropriate' canonical URL, am I forcing Google to attempt to index the page again and to treat the canonical as an internal link?

Hope that makes sense to someone!

 

lucy24

WebmasterWorld Senior Member, Top Contributor of All Time, Top Contributor of the Month



 
Msg#: 4446584 posted 6:12 pm on Apr 27, 2012 (gmt 0)

So my question is - Is Google ever going to give up on this page?

No. Google never forgets a URL. They will slow down after a while, though.

And yes, I've got 404s that are linked either from nowhere at all-- at least, nowhere that g### will admit to-- or only from pages that are themselves 404s. Or 410. Sigh.

g1smd

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4446584 posted 11:08 pm on Apr 27, 2012 (gmt 0)

I wouldn't include the rel="canonical" tag on the 404 page.

phranque

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributor of the Month



 
Msg#: 4446584 posted 12:23 am on Apr 28, 2012 (gmt 0)

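Also make sure your custom error page is specified as a relative (local) URL path and not as an absolute URL that specifies the hostname. With a full URL, Apache redirects the client to the error document instead of returning the 404 status for the requested URL.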
i.e. THIS:
ErrorDocument 404 /path/to/custom-404.php

NOT:
ErrorDocument 404 http://example.com/path/to/custom-404.php
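As a rough way to check a config line against the advice above, the following sketch flags an ErrorDocument directive whose target is a full URL. The helper names `error_document_target` and `causes_redirect` are made up for this example, and the check only looks at the http/https scheme; it is not a parser for Apache configuration in general.

```python
from urllib.parse import urlsplit


def error_document_target(directive):
    """Extract the target from a line like 'ErrorDocument 404 /path'."""
    parts = directive.split(None, 2)
    # Apache directive names are case-insensitive.
    if len(parts) != 3 or parts[0].lower() != "errordocument":
        raise ValueError(f"not an ErrorDocument directive: {directive!r}")
    return parts[2].strip()


def causes_redirect(directive):
    """True if the target is a full http(s) URL rather than a local path."""
    target = error_document_target(directive)
    return urlsplit(target).scheme in ("http", "https")


if __name__ == "__main__":
    for line in (
        "ErrorDocument 404 /path/to/custom-404.php",
        "ErrorDocument 404 http://example.com/path/to/custom-404.php",
    ):
        print(line, "->", "redirect" if causes_redirect(line) else "local path")
```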

asheridan



 
Msg#: 4446584 posted 3:03 pm on Apr 28, 2012 (gmt 0)

That's interesting Lucy, not just me then.

g1 - it's not a 404 page - as I mentioned, that method had Google complaining of soft 404s. Therefore unrecognised URLs now return a 404 response and -include- my custom 404 text. There is no redirect whatsoever. Because the same template serves both cases, if I removed the canonical, it wouldn't appear on recognised URLs either :/
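One possible way to reconcile g1smd's suggestion with a shared template is to emit the canonical link only when the response status is 200. The sketch below assumes the template can see the status that was decided for the request. `render_head` is a hypothetical helper, written in Python purely for illustration, since the site's actual PHP template is not shown in the thread.

```python
from html import escape


def render_head(status, canonical_url, title):
    """Build the <head> contents, adding rel=canonical only for 200 responses."""
    lines = [f"<title>{escape(title)}</title>"]
    if status == 200:
        # Recognised URL: keep the canonical link as before.
        href = escape(canonical_url, quote=True)
        lines.append(f'<link rel="canonical" href="{href}">')
    # For a 404 (or any other non-200 status) no canonical link is emitted,
    # so the error response carries no self-referencing link.
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_head(200, "http://mywebsite.com/about.html", "About"))
    print(render_head(404, "http://mywebsite.com/page_that_doesnt_exist.html", "Not found"))
```

Whether this makes the self-links disappear from the GWT reports is not confirmed anywhere in the thread; the sketch only shows that the canonical can be dropped from 404 responses without removing it from recognised URLs.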

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved