|Google seeing a 302 redirect as a Soft 404|
| 2:20 am on Nov 11, 2012 (gmt 0)|
I was looking in Google Webmaster Tools today and there was a message in our crawl errors saying that we had a Soft 404.
When I watch the live HTTP headers, we're actually doing a 302 to one of our main product category lists, since the page that they tried to access was moved a long time ago.. years even.
Anyone else seen this... could this be one of our panda issues?
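For anyone who wants to check the raw status code without a browser plugin, here is a minimal sketch in Python. It relies on the fact that `http.client` does not follow redirects, so the first response is what you see. The function names `fetch_status` and `describe` are made up for illustration, and the URL you pass in is your own.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, timeout=10):
    """Send one HEAD request and return (status, Location header or None)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        # http.client never follows redirects, so this is the raw first response.
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def describe(status, location):
    """Turn a status code and Location header into a one-line summary."""
    if status in (301, 302, 303, 307) and location:
        return f"{status} redirect to {location}"
    return f"{status} (no redirect)"
```

Calling `describe(*fetch_status(url))` on the old URL should show whether the server is really answering with a 302 and where it points.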
| 7:11 am on Nov 11, 2012 (gmt 0)|
Redirects in general might be a problem if Google has changed how they handle them, or at least how they weight them, because many affiliates use them. In general an article or a story won't be reached via redirect but affiliate links will be, and Google's gone to great lengths on removing affiliates from top positions lately, perhaps even with Panda?
I don't know for sure but I do see a lot of "not included" results in GWT on sites that employ 301 or 302 redirects. Google doesn't specify those urls but I'd be willing to bet they are the "jump" urls.
I'm off to do some testing, you're not the first to wonder about a change in redirect reporting.
| 8:03 am on Nov 11, 2012 (gmt 0)|
We really only have 301s when products are superseded by a different part. We often have people posting links on, say, Facebook that get changed to a new part number by manufacturers, so we 301 them... it's not for gaming Google, it's so that anyone visiting those links still gets a valid product.
| 9:12 am on Nov 11, 2012 (gmt 0)|
|page that they tried to access was moved a long time ago.. years even. |
Moved-- or deleted?
If the page was moved years ago, but it's the same page, it should have been a permanent, one-on-one 301 to the new page all along. If by "moved" you really mean REmoved, taken away, gone, that's a 410. Google doesn't approve of mass redirects to a single page.
You can achieve exactly the same result by making a custom 404 or 410 page for humans that gives all the same information and has all the same links as a "real" page. But on paper you're returning a 410. And that makes the search engines happy.
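A rough sketch of that idea, in Python rather than the ASP used on the site in question: the handler below sends a 410 status line together with a helpful HTML body. The path set, the page text and the names `GONE_PATHS`, `response_for` and `GoneHandler` are all invented for illustration, and a real site would of course serve its normal pages for live URLs instead of a 404.

```python
from http import HTTPStatus
from http.server import BaseHTTPRequestHandler

# Hypothetical set of URLs that were removed for good.
GONE_PATHS = {"/products/old-widget"}

HELPFUL_BODY = (
    "<html><body><h1>That product is no longer available</h1>"
    '<p>Try our <a href="/products/">product categories</a> or the site search.</p>'
    "</body></html>"
)


def response_for(path):
    """Return (status, body) for a request path in this toy example."""
    if path in GONE_PATHS:
        # Humans get a useful page; crawlers get an honest 410.
        return HTTPStatus.GONE, HELPFUL_BODY
    # Everything else is treated as unknown here to keep the sketch short.
    return HTTPStatus.NOT_FOUND, HELPFUL_BODY


class GoneHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = response_for(self.path)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)
```

The visitor sees the same links either way; only the status line differs.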
| 1:36 pm on Nov 11, 2012 (gmt 0)|
My guess is a 301 would not make a difference here, since the difference between a 302 and 301 is mostly semantic these days, especially since time is a definite factor in redirects with Google and this one has been in place for quite a while.
I'm not 100% sure, but my suspicion, since you say the products you have change over time, is that you redirected the page from where it was to where it is, and for as long as the redirect was in place without a 'soft 404' notice, the content of the new page was similar enough to the original to be considered equivalent. It has now changed enough to be considered a 'different page', so it's treated as a soft 404.
See if I can explain what I'm thinking a bit better:
You originally redirected the page to an equivalent.
Google doesn't like to 'throw anything away', so they compare the original you redirected to the page you're redirecting to for similarity.
Over time the products on the page you're redirecting to have changed.
Recently, one of those changes hit a 'threshold' so the page on the receiving end of the redirect has become enough different to be considered a 'different page' rather than an 'essentially the same' page.
Now you get a 'soft 404' error notice.
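To make the 'threshold' idea concrete, here is a toy sketch in Python. It is only an illustration of the hypothesis above, not a description of anything Google is known to do: the use of `difflib`, the function name `looks_equivalent` and the 0.5 cut-off are all assumptions.

```python
from difflib import SequenceMatcher

# Assumed cut-off; nobody outside Google knows whether such a number exists.
SIMILARITY_THRESHOLD = 0.5


def looks_equivalent(original_text, destination_text, threshold=SIMILARITY_THRESHOLD):
    """Return True if the two page texts are at least `threshold` similar."""
    ratio = SequenceMatcher(None, original_text, destination_text).ratio()
    return ratio >= threshold
```

Under this toy model the redirect itself never changes; the result flips only because the destination text drifts away from what the original page contained.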
A 410 is definitely not the right answer IMO since you still have the page and it is an equivalent even though Google's 'not getting it', because you lose all the history, age, link weight, etc. of the original page if you 410 it.
A 301 probably won't make any difference, because it's not an 'essentially the same' page according to an algo any more, so any comparison of the original to the new page will have the same effect, regardless of status code.
What is the right answer? I'll have to think about that for a bit and I'll post again if I think of something other than 'ignore it', because it's wrong on their end, not yours.
| 2:13 pm on Nov 11, 2012 (gmt 0)|
a good custom error page will have search and navigation to the rest of the site, possibly relevant to the originally requested URL, but not much content unique to that page.
your category page, especially if it acknowledges the missing content that was requested, will look similar to a good custom error page.
the only difference is the error code supplied.
a 302 to a custom error page is a signal of an improperly configured server.
therefore your response to that request probably looks more like a "soft 404" response than a "temporary move" of your content, especially if more than one URL provides a similar response.
| 2:15 pm on Nov 11, 2012 (gmt 0)|
After thinking for a bit ... If it's the situation I described where the page receiving the redirect has changed over time, I think the best answer is to ignore it, because you're not 'doing it wrong' the way you have it.
In fact, you'd be 'more wrong' to change it to either a 404 or 410 than you have it now.
The biggest reasons I'd leave it as is (ignore the notice) are:
1.) By changing it to a 404 or 410 you definitely lose any inbound link weight, age, history, etc. to the original page you had, but the way you have it, now (or in the future) you may retain a bit.
2.) Google doesn't always 'get it right' the way they have things, and if they notice the pages on the receiving end of redirects are generating soft 404s over time when the receiving page is updated the way the original would have been, they will likely eventually make a change on their end, and by 'fixing' yours to a solution other than the one you have you would miss out on any future adjustments they make in the handling on their end.
Note for future readers since it seems some like to post the technical difference in status codes according to strict protocol but ignore the actual handling of the status codes by search engines, even though this is a Google SEO Forum, not a 'HTTP/1.1 Protocol Forum', and search engine handling of status codes differs from the letter of the code handling 'by the book' a bit:
A 301 and 302 are mainly a semantic difference these days, so which you use is not a 'huge gotta change it' deal if it happens to be a 302. Yes, a 301 is 'more correct', but a 302 will work 'essentially the same'.
A 404 and 410 are purely a semantic difference and Google will not 'like your site more' because you use a 410 or 'like your site less' because you use a 404.
The only difference I know of in handling by Google is they will stop requesting a 410 sooner than they will stop requesting a 404, so if you 'freak out' at 404 notices in your WMT account, use a 410 and they'll go away sooner ... Other than that, it really makes no difference in the overall scheme of rankings whether a page that's 'not there' generates a 404 or 410.
[edited by: TheMadScientist at 2:35 pm (utc) on Nov 11, 2012]
| 2:34 pm on Nov 11, 2012 (gmt 0)|
|mainly/purely a semantic difference |
there is a technical difference and if you get it right then google will recognize this and treat it properly.
if you're doing it wrong, then google will sometimes recognize this, especially if you are big enough to matter, and eventually will sort it out, but there will typically be a noticeable lag.
it is true that these types of technical problems are so common that google doesn't penalize sites per se for exhibiting them, but there can be side effects that affect your ranking.
| 2:39 pm on Nov 11, 2012 (gmt 0)|
A 301 v 302 and 404 v 410 are not critical mistakes these days phranque ... Really, they're not ... Yes, one is 'more accurate' than the other in each case, but the statement lucy24 makes about making search engines 'happy' by using a 410 rather than a 404 is inaccurate again ... Your site's rankings will not change because you adjust your 404s to 410s ... You won't move from 10 to 9 or 107 to 106 even because you change the status code for a page that's not there any more from a 404 to 410.
Google will stop requesting a page with a 410 sooner and will not request it as soon again when it has a 410, but over time a 404 on the same page will behave in exactly the same way, and again, your rankings will not change due to the use of one or the other. Your WMT notices will drop off sooner with a 410, but that's the only difference ... lucy24 posts a bunch, and you're very quick to jump to her defense, but posting constantly !== constantly correct.
| 2:49 pm on Nov 11, 2012 (gmt 0)|
i'm not defending anyone and i said nothing about "like" or "happy" in my post.
i'm telling you what the protocol says and that is what google follows as a matter of first resort.
every day i see urls indexed that 302 to other urls.
"some day" google may replace those urls in the index.
if those were 301s from the start they would not be indexed in the first place.
every day i see urls indexed that 404.
"some day" google may remove those urls from the index.
on the other hand, i don't recall ever seeing a url indexed that responded with a 410.
| 2:58 pm on Nov 11, 2012 (gmt 0)|
Whether the URLs that are 302 rather than 301 are replaced or not, the visitor ends up in the right place and, IMO, the ranking will not change with the replacement of the URL in the SERPs. (They will be replaced).
404s which were indexed will continue to be indexed for a period of time, because it's 'not found' which means it could be present again, but they will be removed.
410s are not indexed at all and you lose all inbound link weight, age and other associated information with the page, so in this case, the suggestion to use a 410, IMO, is completely incorrect ...
[edited by: Robert_Charlton at 7:11 pm (utc) on Nov 11, 2012]
| 3:09 pm on Nov 11, 2012 (gmt 0)|
|if those were 301s from the start they would not be indexed in the first place. |
IMO, it takes Google up to 3 weeks to fully trust a 301 and redirected URLs (even with a 301) are not changed 'immediately'. It usually takes a day or two even after crawling, sometimes up to a full week in cases I've seen.
[edited by: Robert_Charlton at 7:15 pm (utc) on Nov 11, 2012]
| 8:53 pm on Nov 11, 2012 (gmt 0)|
Exactly my point. Some of the sites or people that linked to it did so years ago. So to give potential real traffic the information they were looking for, we redirect it to the new page... but say one of our sales guys rebuilds a link in our categories with that old category name, then the page would be valid again. It's not likely, but has happened before. If we did a 410 to be rid of it now and the page or category came back again what would google make of that.
But we're really doing it for the end user and not for Google, but I don't like seeing a page being considered a "soft 404" When it's really just redirecting the viewer to a page where they can find what they're looking for... hopefully.
We didn't intentionally make it a 302... in ASP it's simply Response.Redirect. There's no status code given; it's just transferring them. Maybe I'll just set it to buffer the response and give it another status.
Response.Buffer = True
' (whatever conditional statement here)
Response.Status = "404 Not Found"
Google can then get their code they want, but I'll likely lose the link juice if there is any.
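For comparison with the snippet above: Response.Redirect in classic ASP sends a 302 unless told otherwise, so getting any other redirect code means setting the status line and the Location header yourself. A minimal sketch of the same idea as a Python WSGI app follows; the `MOVED` mapping, the paths and the name `app` are hypothetical.

```python
# Hypothetical mapping from retired URLs to their replacements.
MOVED = {"/parts/old-123": "/parts/new-456"}


def app(environ, start_response):
    """Tiny WSGI app that chooses the redirect status explicitly."""
    path = environ.get("PATH_INFO", "")
    target = MOVED.get(path)
    if target is not None:
        # The status line is picked on purpose rather than left to a default.
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"Not found"]
```

Swapping the status string for "302 Found" or "303 See Other" is all it takes to test how a different code is handled.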
| 9:07 pm on Nov 11, 2012 (gmt 0)|
Yeah, like I said before, I personally wouldn't change it to a 404 for the reasons you're stating about the link weight and visitors ... I might actually change the redirect status to a 303 See Other and see how Google handles that if I was going to make any type of change.
I think you're right in your answer being for visitors and Google has had it correct until now, so obviously, the 302 is being recognized and 'counted' otherwise you'd be asking why the wrong page is showing in the results or something along those lines, which means in this situation a 301 will definitely not correct the issue, because the 'issue' is between the content of the original page and the content of the destination, not whether or not the redirect is being handled correctly.
And, I really can't see 'just throwing away' the page's links, history, etc. like a 404 or 410 will both do, especially when you have an 'essentially the same' product visitors will likely want to find, which may again change at a future date, so it sounds to me like the index where visitors can find whatever 'essentially the same product' you have at a given time is the right place to send them. (Ignoring search engine handling for half a second, and doing what they say, since they say to build your site for visitors rather than them, of course).
You having it right for visitors and the other reasons you state, plus Google having it right previously is why my initial response is to not change anything and let them deal with the soft 404 however they want ... Also, by changing it, you will not know if they change their handling back to whatever it was previously when they had it right and you were not receiving the 'soft 404' notice, so by changing it you could well be 'fixing' something that they may 'refix' on their end, and once you change it there's no way you'll know if they reverted or made another change in their handling of the issue that would cause the 'soft 404' notice/treatment to be removed.
Every time I think this one through and ask myself:
"What's the right thing for visitors?"
"Redirect to the index page."
Every other answer involves search engines and building a site for them, which is what they say not to do, so if search engines were totally out of the picture and I just had to worry about visitors, what I would do is what you're already doing ... I certainly wouldn't 404 or 410 the page if I'm thinking only about visitors (or even visitors and search engines), there's no way, so other than maybe trying the 303 redirect as a test I don't see myself changing anything in this specific situation.
| 9:53 pm on Nov 11, 2012 (gmt 0)|
One more addition:
The only reason I would try a 303 See Other is it's not an often used status code, so there's really no info on how G will handle it and it's something I might test in this situation, but my gut feeling is it's going to be handled essentially the same as a 301 or 302, because they kind of have to handle redirects as redirects to prevent 'gaming' or 'hijacking' like the 302 handling years ago resulted in, but there's also really no way of knowing for sure, except to test it in some different situations and this is one situation I'd be inclined to test and see if it makes any difference in handling or not.
Again, my gut feeling is:
It won't make a bit of difference.
BTW Bewenched ... From the posts of yours I've read, to me, it sounds like you're pretty well on top of things and do things the right way for the right reasons, so I doubt one soft 404 is going to tank your site ... If you had 100(s) of them, then it might be something you need to address, but again, to me, it sounds like you know what you're doing and do a good job of doing things for the right reasons and I doubt one soft 404 (or even a couple) will make a significant difference in your rankings, because it sounds like you do a solid job to me.
| 10:29 pm on Nov 11, 2012 (gmt 0)|
|But we're really doing it for the end user and not for Google, but I don't like seeing a page being considered a "soft 404" When it's really just redirecting the viewer to a page where they can find what they're looking for... hopefully. |
when someone requests a url and you redirect them "to a page where they can find what they're looking for" that's a "soft 404" and apparently google has finally recognized it as such.
if these valuable inbound links are sending relevant traffic then it is probably worth creating some relevant and valuable content to serve those url requests.
| 10:53 pm on Nov 11, 2012 (gmt 0)|
Phranque is correct. There is a big difference in how Google treats a 302 and 301 redirects. Yes, the "user" gets to the same place, but it's different from Google's perspective, and how they affect rankings. Hell, a meta refresh would also get the user to the same destination but it doesn't mean it's a smart thing to do from an SEO perspective.
A 302 redirect says essentially, "Index the 'content' located at the page's new, temporary location but associate it with the old URL." The old URL is not deindexed. And the new URL can ALSO get indexed if other inbound links directly to it can be found. Now you have duplicate pages accessible under the old and new URL. And credit for links pointing to the old URL is NOT transferred over to the new URL. Say the old URL had 100 inbound links and the new URL also acquires 100 inbound links. As a result of using a 302 redirect, Google is going to see this as two URLs (each with duplicate content) with 100 links each.
A 301 on the other hand tells Google that the page at the old URL has moved permanently, so they will transfer credit for the old URL's backlinks over to the new URL. If the old URL has 100 backlinks and the new URL has acquired 100 direct backlinks of its own, the new URL will be given credit for 200 links, not 100 (as it would for the 302 scenario above).
And yes it takes several weeks typically for 301s to be processed. You have to wait on the engines to recrawl EVERY inbound link to the redirected URL, discover the 301 for each individual inbound link, transfer credit 1-by-1 for each link as they recrawl them individually. Depending on how often the sites that link to you get crawled, this could take days, weeks, or maybe even a month or two.
If you're using 302 redirects instead of 301s when pages are moved or deleted, you are creating duplicate content and split link equity. The results of doing so are essentially the same as URL canonicalization issues. Not a great idea IMO.
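The arithmetic in this post can be written out as a small Python function. It encodes only the model described here (which other posters in this thread dispute), not anything Google has documented; the name `credited_links` is made up.

```python
def credited_links(old_links, new_links, status):
    """Links credited to (old URL, new URL) under the model in this post."""
    if status == 301:
        # Permanent: the old URL's links are assumed to pass to the new URL.
        return 0, old_links + new_links
    if status == 302:
        # Temporary: each URL is assumed to keep only its own links.
        return old_links, new_links
    raise ValueError("only 301 and 302 are modelled here")
```

With 100 links on each side, `credited_links(100, 100, 301)` gives `(0, 200)` and `credited_links(100, 100, 302)` gives `(100, 100)`.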
| 11:39 pm on Nov 11, 2012 (gmt 0)|
|If we did a 410 to be rid of it now and the page or category came back again what would google make of that. |
Google respiders every URL they have ever seen, forever. They revisit URLs previously returning "410" less often than those returning "404". In a conversation with Pierre Far, he said that "Google revisits them (both 404 and 410) because a large number do eventually come back to life."
| 12:03 am on Nov 12, 2012 (gmt 0)|
|There is a big difference in how Google treats a 302 and 301 redirects. |
You're both stuck 5 or more years ago...
There is not any longer a big difference.
Check out tedster's comment in this thread: [webmasterworld.com...]
Catch up with the search engines before you continue giving misinformation, please, for the sake of those who read here ... You're both spewing inaccuracies that have No Bearing whatsoever on the specific topic of this thread.
|"Index the 'content' located at the page's new, temporary location but associate it with the old URL." The old URL is not deindexed. |
What is this, 2007?
Even the information given in the OP of this thread contradicts and proves your point incorrect.
The redirect has been in place for years. It has not been an issue.
(It means the quote I cited above is just plain refuted, even in the very thread it's posted in ... It's unbelievable to me someone would try to say a 302 is handled in a greatly different way than a 301 in a thread where a 302 has been in place and a non-issue for years. How do we know it's incorrect to say the originating URL will still be credited and indexed? How about tedster's info in the linked thread and the info in the OP of this one. If, of course, you haven't seen it yourself in the results, which I have.)
All you have to do is read the OP to know a 302 is not the issue, because it's been in place for years and Google didn't just revert their handling of redirects to circa 2007.
You're both Totally Incorrect.
If you two would actually read the threads you post as if you're an authority on the subject in, you would see how wrong you are. The 302 in this thread is being handled just fine. The first thread phranque and I got into it in, he said the only way the issue with the results happened was the robots.txt, even though one of the two sites in question didn't have a robots.txt, so it Could Not Possibly be the robots.txt.
Your posts are helping no one, because they're incorrect and completely ignore the information already given about the specific situation(s), so what they do is send people on a wild goose chase armed with nothing but misinformation.
| 2:29 am on Nov 12, 2012 (gmt 0)|
i can show you a fortune 500 company that 302 redirects http://www.example.com/ to http://www.example.com/us/en and google properly (according to protocol) indexes http://www.example.com/ and i can also show you another fortune 500 company that 301 redirects http://www.example.com/ to http://www.home.example.com/example/home.jspx?cc=US&lc=eng and google correctly (according to what's best for the search results) indexes http://www.example.com/.
if you aren't a billion dollar company i would suggest doing it properly (according to protocol) and hoping google does what you intended, instead of depending on their good graces to correctly recognize the intention of your incorrect signals.
| 2:38 am on Nov 12, 2012 (gmt 0)|
I agree with TheMadScientist. Google has moved on, and they try very much to accommodate common technical missteps. The way I think they see this area is that ranking well should be based on the quality of content offered to the visitor, rather than technical precision. If they can find a way to sort it out, they do.
Does this mean you should make technical errors with no concern? Not really - there is still a chance you are making things difficult for Google to rank a site appropriately.
Correct a soft 404 if you can see that it really is a technical error - it could be the right step to take. But if it's not a source of trouble for you and there's only one occurrence, it is probably fine to leave it alone. What I've seen as a common "soft 404" report from Google is that many redirects from truly different URLs point to the same content that returns a 200 OK in some way, even if that content doesn't actually say "not found".
Redirecting lots of URLs to the same category page can trigger the soft 404 report. And if you check your analytics you might well find that the traffic involved isn't doing you much good.
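One way to spot that pattern in your own redirect list is to group the source URLs by target. The Python sketch below assumes you can export the redirects as (source, target) pairs; the name `crowded_targets` and the default of three sources are arbitrary choices for illustration, not a known Google limit.

```python
from collections import defaultdict


def crowded_targets(redirects, min_sources=3):
    """Map each target URL to its source URLs when it has at least `min_sources`."""
    by_target = defaultdict(set)
    for source, target in redirects:
        by_target[target].add(source)
    return {
        target: sorted(sources)
        for target, sources in by_target.items()
        if len(sources) >= min_sources
    }
```

Any target that shows up in the result is a candidate for the kind of many-to-one redirect described above and is worth checking against your analytics.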
| 2:46 am on Nov 12, 2012 (gmt 0)|
|Every time I think this one through and ask myself: |
"What's the right thing for visitors?"
"Redirect to the index page."
Really? This visitor-- speaking strictly as a user now-- absolutely HATES getting redirected to the front page in response to a 404. I don't even know if I simply misspelled the page name, so I have to start all over again.
Incidentally when I said "makes search engines happy" I really should have said "makes google happy" assuming for the sake of discussion that there is a difference. I keep a cursory record of redirects-- here meaning anything other than flat 200 or 403-- and I see that some groups of long-410'd pages are still visited regularly by the bingbot, long after google has lost interest. This behavior may be worth investigating. In a different forum, probably.
| 3:16 am on Nov 12, 2012 (gmt 0)|
so i guess we all agree:
- if you do it wrong and google "can find a way to sort it out, they do", or not.
- if you are willing to take "a chance you are making things difficult for Google to rank a site appropriately" then you can ignore protocol.
| 3:54 am on Nov 12, 2012 (gmt 0)|
|absolutely HATES getting redirected to the front page in response to a 404. |
The index page in this context (the context of this thread) is the index page of the section where the specific item the visitor is looking for can be found. Read the entire thread for the full context.
And, Google is not more or less happy over a 410 than a 404, they just treat them with a slight difference. Neither one nor the other will make any difference in rankings, but a 410 will remove a URL from the SERPs nearly immediately where a 404 will take a bit longer, and recrawling of a 410 will tail off sooner than recrawling of a 404.
|so i guess we all agree: |
- if you do it wrong and google "can find a way to sort it out, they do", or not.
- if you are willing to take "a chance you are making things difficult for Google to rank a site appropriately" then you can ignore protocol.
You're preaching 'exact protocol' WRT a search engine that does not follow exact protocol, which makes following exact protocol unnecessary (and you're not even correct about which status code should be used according to the exact protocol definition in this specific situation - see below), so I have to say, "No, we don't all agree..."
If Google followed strict protocol, then following strict protocol would be important, but since they don't it doesn't make a hill-of-beans difference if your protocol is 100% correct in each and every situation, and the specific protocol used (302 redirect status code) in this situation is not and has not been the issue, possibly because it's technically the correct status to use.
The redirect is undefined by the server other than as 'Found' and the redirect may or may not be removed in the future (read the posts in this thread to see where that's stated), which means the actual specific redirect code that should be used is NOT a 301:
|301 Paragraph 1: |
The requested resource has been assigned a new permanent URI and any future references to this resource SHOULD use one of the returned URIs.
Since the specific redirect discussed in this thread is not known to be a permanent redirect, meaning it May be altered on occasion (read the thread to see where that's stated), we now have 302 Found, 303 See Other, and 307 Temporary.
303's main intended use is totally different, so even though I said I'd give it a shot in this situation, as far as technically correct goes, it's out.
Which leaves 302 Found & 307 Temporary.
The following are the definitions of each:
|302 Paragraph 1: |
The requested resource resides temporarily under a different URI. Since the redirection might be altered on occasion, the client SHOULD continue to use the Request-URI for future requests. This response is only cacheable if indicated by a Cache-Control or Expires header field.
307 Paragraph 1:
The requested resource resides temporarily under a different URI. Since the redirection MAY be altered on occasion, the client SHOULD continue to use the Request-URI for future requests. This response is only cacheable if indicated by a Cache-Control or Expires header field.
302 Paragraph 2:
The temporary URI SHOULD be given by the Location field in the response. Unless the request method was HEAD, the entity of the response SHOULD contain a short hypertext note with a hyperlink to the new URI(s).
307 Paragraph 2:
The temporary URI SHOULD be given by the Location field in the response. Unless the request method was HEAD, the entity of the response SHOULD contain a short hypertext note with a hyperlink to the new URI(s) , since many pre-HTTP/1.1 user agents do not understand the 307 status. Therefore, the note SHOULD contain the information necessary for a user to repeat the original request on the new URI.
302 Paragraph 3:
If the 302 status code is received in response to a request other than GET or HEAD, the user agent MUST NOT automatically redirect the request unless it can be confirmed by the user, since this might change the conditions under which the request was issued.
307 Paragraph 3:
If the 307 status code is received in response to a request other than GET or HEAD, the user agent MUST NOT automatically redirect the request unless it can be confirmed by the user, since this might change the conditions under which the request was issued.
So, with the exception of HTTP/1.0 User Agents 'not getting' what a 307 redirect means, a 302 and 307 are technically described as the same *bleeping* thing, which means using just a bit of reasoning we can very easily determine:
302 is the correct status code in this situation according to Exact protocol definitions, because we know from reading this thread the redirect Might be altered (removed) on occasion ... Go figure.
My guess is the use of a 301 is actually technically incorrect, much as it would be in this situation...
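The reasoning in this post boils down to a short decision, sketched here in Python. It follows only the definitions quoted above; the function name `redirect_status` and the `legacy_clients` flag are made up, and it says nothing about how any search engine treats the codes.

```python
def redirect_status(permanent, legacy_clients=True):
    """Pick a redirect status code from the definitions quoted above."""
    if permanent:
        # New permanent URI: future references should use the new address.
        return 301
    if legacy_clients:
        # Temporary, and pre-HTTP/1.1 user agents may not understand 307.
        return 302
    # Temporary, with only HTTP/1.1-aware clients expected.
    return 307
```

For a redirect that might be removed later and a general audience, `redirect_status(permanent=False)` returns 302.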
[edited by: tedster at 4:35 am (utc) on Nov 13, 2012]
| 4:36 am on Nov 12, 2012 (gmt 0)|
|we don't know if it's a permanent redirect, since it's been stated in this thread the redirect May be altered (removed) on occasion |
Under what circumstances would a human webmaster know beyond the shadow of a doubt that a redirect will never under any circumstances be undone?
|Why on Earth do people Insist on arguing with me when they're wrong?! |
I think it's called the Troll Effect. The more loudly and frequently someone declares I Am Right And You Are Wrong, the more people rush to take issue. This phenomenon is not restricted to internet forum behavior.
| 4:42 am on Nov 12, 2012 (gmt 0)|
Cool URIs Don't Change [w3.org]
| 4:47 am on Nov 12, 2012 (gmt 0)|
It's much more likely I'd be wrong if I couldn't back up my position with sources. Unfortunately for those who like to disagree with me, I can, which means whether I 'shout loudly' or 'type quietly' really doesn't matter, because the sources of information I provide tell the truth about my position to those who would like to actually know the right answer, much more than quips and indirect name calling ever will.
BTW: If you didn't read the whole thread, the guy who just got the Life Time Achievement Award for his work here, who also has probably forgotten more about SEO than I, or most people, will ever know said he agrees with my position ... I'm fairly certain I'm correct.
| 5:16 am on Nov 12, 2012 (gmt 0)|
I think people have stated their positions and now we're just re-hashing things. So this thread is closed.