Forum Moderators: Robert Charlton & goodroi

302 vs. 404 response codes?

         

Andiamo

10:17 am on Jul 22, 2015 (gmt 0)

10+ Year Member



We have a client that has a lot of pages of empty content on the site, in the hopes that one day those pages will be filled with actual content. Obviously, this means a soft 404 issue, since all of those pages are returning 200 codes.

Should we 302 redirect those pages to the homepage, or should we implement 404 errors on those pages? The redirects wouldn't be permanent, so I wouldn't want to 301 them.

Robert Charlton

6:06 pm on Jul 22, 2015 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



I'm not clear on what setup is returning 200 responses for "pages of empty content". Do you mean that the URLs exist, but the pages don't? In that case, definitely return 404s... but why do the URLs exist in the first place?

If you have actual page templates showing, but they simply don't have any content, I think the band-aid of choice would be the meta robots noindex tag, allowing the default follow.

While this will at least keep Google from displaying these pages in the visible index, it may not keep your site visitors from navigating to them. If there really are a "lot" of such pages, it's ultimately going to be a bad user experience.

Any way to remove these pages physically, and to add them over time as content becomes available?
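
For the noindex band-aid mentioned above, here is a minimal sketch of how a template might emit the tag only while a page is empty. The function name and the $hasContent flag are hypothetical; your own code would have to decide what counts as "has content". It is written for PHP 7 or later.

```php
<?php
// Hypothetical helper: returns the robots meta tag for a page template.
// $hasContent is an assumed flag that your own code would set.
function robotsMetaTag(bool $hasContent): string
{
    // "follow" is the default, so only noindex needs to be stated.
    return $hasContent ? '' : '<meta name="robots" content="noindex">';
}
```

In the template this would be echoed inside the head element, for example echo robotsMetaTag($hasContent);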

[edited by: Robert_Charlton at 6:14 pm (utc) on Jul 22, 2015]

lucy24

6:13 pm on Jul 22, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



200 is not a "soft 404"; it's an empty page. A soft 404, in googlespeak, is when an URL that ought to return a 404 ("ain't no such page") instead redirects to some other page, such as the / root. You can easily tell when google gets suspicious, because your logs will show requests for garbage URLs such as "tjrkyhvbuynthdgr.html". (This appears to be programmatically determined; it doesn't mean they're picking on you.)

In this situation it doesn't really matter whether the redirect is a 301 or a 302. And if there has never been any indexable content at the originally requested URL, then it really doesn't matter.

How many is "a lot"? The phrase "in the hopes that some day" is ominously vague. We're not talking about transitory "out of stock, check back later" pages, are we? Honestly it sounds as if you need to sit down and think about what would be best for your human visitors. What's good for google is not always what's good for a human, but it's a reasonable starting point. Nobody likes landing on a nonexistent page. Why do the URLs even exist if they have never had content?

Robert Charlton

6:17 pm on Jul 22, 2015 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



PS... I was editing my response as lucy24 was posting. I think we both are similarly confused by the original post.

Andiamo

9:30 pm on Jul 23, 2015 (gmt 0)

10+ Year Member



Lucy24 - you're absolutely correct, soft 404s are not what I described. I had a client with a horrible filtering system on their ecommerce site, where they had tons of empty pages of content on the site b/c filters produced pages with no products on them. And GWT saw those as a 404 error. So my view of the 404s had temporarily shifted.

Honestly, I think the client was employing old-school SEO spam tactics recommended by previous agencies. But now that those pages exist, the question is what to do with them. Their content plan does include filling out those pages with relevant targeted content, but that's way in the future. So in the meantime, I think we'll change them to 404s.

Ralph_Slate

3:24 pm on Jul 28, 2015 (gmt 0)

10+ Year Member Top Contributors Of The Month



Google will still report a soft 404 in Webmaster Tools for a page that is marked as noindex. I am in that same situation - my site is encyclopedic, and there are instances where I have pages that will someday contain data about a particular topic, and really do need to exist for consistency on the site, but are largely empty. WMT reports the pages as soft 404s.

TheMadScientist

4:27 pm on Jul 28, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Obviously, this means a soft 404 issue, since all of those pages are returning 200 codes.

You're correct.

A soft 404, in googlespeak, is when an URL that ought to return a 404 ("ain't no such page") instead redirects to some other page, such as the / root.

You're correct.

[googlewebmastercentral.blogspot.com...]
[googlewebmastercentral.blogspot.com...]



Now that we've sorted out that a soft 404 is *either* a server returning a 200 response code for a specific URL when it should return a 404, *or* a server redirecting the visitor [with a 301 or 302 or 303 or 307] to the homepage or another page which returns a 200, instead of returning a 404 for the requested URL, what should be done about it in the following cases?

We have a client that has a lot of pages of empty content on the site, in the hopes that one day those pages will be filled with actual content.

my site is encyclopedic, and there are instances where I have pages that will someday contain data about a particular topic, and really do need to exist for consistency on the site, but are largely empty.

You can leave the pages exactly as they are as far as look and feel and location go, but send a 404 status code until they have useful content on them. Not only will you stop getting notices about soft 404s, you'll also save some crawl cycles for the pages that actually need to be crawled now. And because you're only changing the response header rather than the page itself, a visitor who happens to land on one won't know the difference: visitors never see the actual response code, except on an actual error page.

<?php header('HTTP/1.1 404 Not Found'); ?>

At the top of the page(s) in question is all it takes and it's done.

Added: Assumed a scripting language is used to create the pages and gave a bit of PHP as an example, obviously it may need adjusting to fit a specific situation, but go with the point, which is: it's fairly simple to have what you want and stop the soft 404 errors ;)
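
To make the "404 until there's content" idea a bit more concrete, here is a rough sketch that picks the status from whether the page body is empty. The function names and the rule for "empty" (nothing left after stripping tags and whitespace) are assumptions for illustration, not something from the posts above. It uses http_response_code(), available since PHP 5.4, and type declarations from PHP 7.

```php
<?php
// Hypothetical helper: choose a status code from the page's main content.
// $body is assumed to be the content area only, without the template chrome.
function statusForContent(string $body): int
{
    // Strip markup and whitespace so an empty placeholder counts as "no content".
    $text = trim(strip_tags($body));
    return $text === '' ? 404 : 200;
}

// Send the status before any output, then let the template render as usual.
function sendStatusFor(string $body): int
{
    $status = statusForContent($body);
    if (!headers_sent()) {
        http_response_code($status);
    }
    return $status;
}
```

Calling sendStatusFor($body) at the top of the page, before anything is echoed, leaves the visible page unchanged while the response code switches to 200 on its own once content is added.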

Ralph_Slate

11:02 pm on Jul 29, 2015 (gmt 0)

10+ Year Member Top Contributors Of The Month



You can leave the pages exactly as they are as far as look and feel and location go, but send a 404 status code until they have useful content on them. Not only will you stop getting notices about soft 404s, you'll also save some crawl cycles for the pages that actually need to be crawled now. And because you're only changing the response header rather than the page itself, a visitor who happens to land on one won't know the difference: visitors never see the actual response code, except on an actual error page.


Won't that impact me when the content actually does get added? Google presumably will only crawl 404 pages every so often, and may even stop crawling them altogether someday.

TheMadScientist

12:29 am on Jul 30, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



1.) You don't want Google to crawl the pages now. It affects the resources (crawls) they give you for pages that actually have content.

2.) Getting Google to actually stop crawling a page is next to impossible, even if you want them to -- anything short of an .htaccess block is almost as futile as trying to stop water from being wet, and that includes using a robots.txt Disallow: and hoping they don't "find a reason to visit anyway".

[webmasterworld.com...]
[webmasterworld.com...]
[webmasterworld.com...]
[webmasterworld.com...]

3.) If somehow you find some way to get them to stop, all you need to do to get them to start again is add a link to the page(s), add them to an XML Sitemap, fetch as Googlebot from the GSC, or take some other "action" that says "something changed on this URL" and they'll be by shortly.

Notes: You should *not* link to the pages without content and you should *not* have the pages without content in an XML Sitemap until there's content on them. Whether they're a soft-404 or a true-404 makes no difference to the preceding.
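
As a sketch of the Sitemap note above: if the Sitemap is generated by a script, the empty pages can simply be filtered out when it is built. The buildSitemap() name and the 'url' / 'has_content' keys are hypothetical, and the example.com URLs are placeholders.

```php
<?php
// Build a minimal XML Sitemap that lists only pages with real content.
// Each entry in $pages is assumed to look like
// ['url' => 'https://example.com/widgets', 'has_content' => true].
function buildSitemap(array $pages): string
{
    $xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $xml .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
    foreach ($pages as $page) {
        if (empty($page['has_content'])) {
            continue; // leave empty placeholder pages out entirely
        }
        // Escape the URL so characters such as & are valid inside XML.
        $loc  = htmlspecialchars($page['url'], ENT_XML1 | ENT_QUOTES, 'UTF-8');
        $xml .= '  <url><loc>' . $loc . '</loc></url>' . "\n";
    }
    return $xml . '</urlset>' . "\n";
}
```

Once a page gets content and its flag flips, it shows up in the next generated Sitemap, which is one of the "something changed" signals described in point 3.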