Forum Moderators: Robert Charlton & goodroi
Google is identifying pages on the site that are of little use to the visitor, for example, maybe a product out of stock, or an option selected for a product that does not exist.
Well, it's not really. It's making up URLs using known queries and putting them together in new ways to see what comes up.
Ideally there wouldn't be any links to alaska/articles until there's something worth linking to.
Possibly they noticed that assorted queries all lead to the identical page content.
So the product page is valid and still gives results but since that specific item is deleted from the database no data is being retrieved.
"All the stuff around the edges is the same on every page. In the middle where I expected to find a bunch of unique content, there are only a few words-- and one of them is 'no'."
Yah. Your average robot should be able to manage that.
Here's the best example I can come up with without using specifics:
We sell "examples" at example.com. We have pages for "examples" in every state: example.com/alabama, example.com/alaska... etc. We write articles about "examples" for most states and list them here: example.com/alabama/articles. So let's say we haven't written any articles for "examples" in Alaska yet. The page is still there: example.com/alaska/articles, but instead of listing the available articles (there are none) it says "no articles for Alaska yet". Google is telling us in WMT that example.com/alaska/articles is a "soft 404". And they're right: these are pages that are not adding value and should not be there.
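To make the Alaska case concrete, here is a minimal sketch of one way such a page could answer with a real 404 instead of a 200 wrapped around "no articles yet". The `ARTICLES` dictionary and the `articles_page` function are made up for illustration; nothing here reflects how the actual site is built.

```python
from http import HTTPStatus

# Hypothetical data store: state slug -> list of article titles.
ARTICLES = {
    "alabama": ["Choosing examples in Alabama"],
    "alaska": [],  # nothing written yet
}


def articles_page(state, articles=ARTICLES):
    """Return (status, body) for a request to /<state>/articles."""
    titles = articles.get(state)
    if not titles:
        # Unknown state, or a known state with no articles:
        # report it with a real 404 rather than a 200 "empty" page.
        return HTTPStatus.NOT_FOUND, f"No articles for {state.title()} yet"
    return HTTPStatus.OK, "\n".join(titles)
```

With this sketch, `articles_page("alaska")` yields a 404 status while `articles_page("alabama")` yields a 200 with the listing. Whether a 404 or a noindex is the better answer depends on the site.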
I don't usually give Google much credit but they are nailing this! For whatever it's worth, I'm impressed that they have been able to figure it out.
Google considering those pages 404 is like claiming that a bakery doesn't sell cupcakes because they are out at this time.
There's more to it, though.
There's a big difference between a complete product page with an "out of stock" label in the corner where you place your order, and a page with no product-specific content.
Uhm... Have you entirely missed several years of discussion of the "soft 404" concept?
After all, those "soft 404s" in your example are not really 404, just very thin content pages.
Does your site return a 404 under other circumstances?
Do you-- or can you-- add a meta "noindex" to pages of this kind?
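For what it's worth, the noindex idea could look something like the sketch below: the template adds a robots meta tag only when there is nothing to list. The function name `render_articles_page` and the page layout are assumptions for illustration, not anyone's real template.

```python
from html import escape


def render_articles_page(state, titles):
    """Build the HTML for a hypothetical /<state>/articles listing page."""
    # Only thin pages (no articles to list) get the noindex directive.
    robots = "" if titles else '<meta name="robots" content="noindex">'
    if titles:
        items = "".join(f"<li>{escape(t)}</li>" for t in titles)
        main = f"<ul>{items}</ul>"
    else:
        main = f"<p>No articles for {escape(state)} yet</p>"
    return (
        "<!doctype html><html><head>"
        f"<title>Articles: {escape(state)}</title>{robots}"
        f"</head><body>{main}</body></html>"
    )
```

Once articles exist for a state, the same template stops emitting the tag, so the page becomes indexable without any further change.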
If your ErrorDocument specifies a fully qualified URL and causes a "404 response" to generate a 302/200 status chain, that's a "soft 404".
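A rough way to picture the 302/200 chain: request a URL you know does not exist, write down each status code the server sends in order, and look at the sequence. The helper below is only a sketch of that check; `classify_missing_url` and its labels are invented for illustration, and the list of codes is assumed to have been collected by hand or by whatever HTTP client you prefer.

```python
def classify_missing_url(statuses):
    """Classify the status chain seen when requesting a URL known not to exist.

    statuses: status codes in the order received, e.g. [302, 200].
    """
    if not statuses:
        raise ValueError("need at least one status code")
    if statuses[0] in (404, 410):
        # The server said "not found" / "gone" on the first response.
        return "real 404"
    if statuses[-1] == 200:
        # The missing URL ultimately produced a normal page.
        return "soft 404"
    return "other"
```

So a chain of `[302, 200]` comes out as a "soft 404", while a plain `[404]` comes out as a "real 404".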
410 Gone (or 404 Not Found) means the url is gone (or not found), not the product being sold on that url.
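Put another way, the status code answers "does this URL exist?", and stock level is something the page itself reports. A small sketch of that separation follows; the function and its three flags are hypothetical and exist only to illustrate the distinction.

```python
from http import HTTPStatus


def status_for_product_url(url_known, url_retired, in_stock):
    """Return (status, note) for a hypothetical product URL.

    The status depends only on the URL; stock level only changes the note.
    """
    if url_retired:
        # The URL itself was deliberately removed.
        return HTTPStatus.GONE, "url gone"
    if not url_known:
        # The URL never existed.
        return HTTPStatus.NOT_FOUND, "url not found"
    # The URL is valid, so it stays a 200 whether or not the item is in stock.
    return HTTPStatus.OK, "in stock" if in_stock else "out of stock"
```

An out-of-stock product on a live URL still returns 200 here, which matches the bakery that is merely out of cupcakes today.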
Besides, why is a site search pointing to an empty page?
The query is part of the URL.
why is a site search pointing to an empty page?
So you could have a site where as far as the server is concerned, no request ever meets anything but a perfect 200.
These statements are irrelevant to the discussion.
Before your post nobody had mentioned site search or query strings in this thread.
You are describing a CMS that was written by someone who didn't read the HTTP specification. That's not the problem being discussed here.
instead of listing the available articles (there are none) it says "no articles for Alaska yet"