I am working with a PHP shopping cart and using a mod that rewrites the URL. The issue is when a product is removed and the URL is called, the code generates a soft 404 page stating the product is not found, yet the HTTP status code is 200.
As a fix I added header('HTTP/1.1 404 Not Found'); late in the page generation, where the soft 404 text is generated (I assume this is after the original 200 code was sent).
When I added this code and did a 'Fetch as Google', Google kicked back 'Not Found', which is what I want for products no longer in the store.
My question: Is sending a HTTP header status late in page generation after a 200 was sent OK to do?
Short answer: If Google reports a 404, then the header is being sent successfully.
Slightly longer answer: Most of the time, the header your server sends is the header the user receives. What you have here is the exceptional case where they are different.
The server's task is to find the PHP page responsible for filling the request. Once the page is found, the server is ready to report a 200 by default, and its own work is done. It is now up to the PHP page to look at the details of the request and decide which response is appropriate. If the PHP is correctly written, as yours apparently is, this second response is the one the user sees.
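To make that division of labour concrete, here is a rough sketch of a PHP page deciding its own status. The $catalog array, the product_id parameter and the statusForProduct() helper are made up for illustration; a real cart would look the product up in its database and will not look like this.

```php
<?php
// Hypothetical catalog: product ID => name. A real cart would query its database.
$catalog = [101 => 'Blue widget', 102 => 'Red widget'];

// Hypothetical helper: choose the status code for a product request.
function statusForProduct(array $catalog, int $productId): int
{
    return array_key_exists($productId, $catalog) ? 200 : 404;
}

// 'product_id' is an assumed parameter name, not taken from any particular cart.
$productId = (int) ($_GET['product_id'] ?? 0);
$status = statusForProduct($catalog, $productId);

// Replace the default 200 before any page output is produced.
http_response_code($status);

echo $status === 404 ? 'Product not found' : $catalog[$productId];
```

The point is only the order of events: the script decides on the status first, and the text of the page follows.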
Now, can you train yourself not to say "soft 404"? This phrase is widely used by Google to mean a specific thing: a request that should receive a 404, but instead receives some other response such as a 301 redirect to the home page.
Is sending a HTTP header status late in page generation after a 200 was sent OK to do?
No, but this doesn't seem to be what you're doing, so you are OK. What's not OK is to try to send out a non-200 header after part of the page has already been sent out, not simply built. Then it's too late. A response header is, by definition, sent out together with the page. As long as you've got some kind of output buffering in place, you're fine.
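A minimal sketch of what output buffering buys you, assuming the buffer is started by the script itself (it can also be switched on in the server configuration). The renderProductPage() function and its $productExists flag are hypothetical stand-ins for the cart's own page generation.

```php
<?php
// Hypothetical page generator: builds the whole page inside an output buffer,
// so the status can still be changed partway through generation.
function renderProductPage(bool $productExists): string
{
    ob_start();                      // from here on, echo writes to the buffer only

    echo '<html><body>';             // page generation has begun, nothing is sent yet

    if (!$productExists) {
        // headers_sent() reports whether it is already too late to change the status.
        if (!headers_sent()) {
            http_response_code(404);
        }
        echo '<h1>Product not found</h1>';
    } else {
        echo '<h1>Product page</h1>';
    }

    echo '</body></html>';

    return ob_get_clean();           // hand back the buffered page and close the buffer
}

// Headers and body leave the server together at this point.
echo renderProductPage(false);
```

Without the buffer, the first echo could push the headers out with a 200 attached, and the later attempt to set a 404 would come too late.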
Disclaimer: I only know about three words of PHP. The 404 header thing happens to be one of them.