I've spent the last few hours beating my head against the wall.
Background: A couple of weeks ago I created around 90 new pages in one fell swoop. I didn't really. I made four php pages with a matching set of RewriteRules:
... ^paintings/(spare[cr]at)s/(\w+)\.html /paintings/$1s/$1links.php?page=$2 ...
Paradoxically I did this to reduce indexing. It makes sense in context, honest.
By and by I realized that if I request "paintings/sparecats/any-old-garbage.html" I get a page. A garbage page, but a page. Obviously this won't do.
Lengthy detour here to php dot net as well as to That Other Forum-- the one that writes your code for you-- to read pre-existing answers to the same question, including one that was so brilliantly worded I could have written it myself, except for the part where it also gave a factually correct answer. Turns out it isn't enough to return a 404. I also, separately, need to display the content of the 404 page. Check. All is copacetic... except that the thing flatly refuses to give me a 404. Not with "HTTP/1.1", not with $_SERVER['SERVER_PROTOCOL'], not with "Status:". I'm in php 5.3.something, so 'http_response_code' won't do. Error logs remain stubbornly empty, both in MAMP and on live site. (Test site, duh, just in case I do something disastrous.) Page displays-- or fails to display-- as desired, while logs fill up with 200s.
Code is perfectly happy to redirect via a "Location:" header, so I know I haven't made any structural blunders. But I don't want to redirect. G### has already got into the habit of requesting nonexistent files, and I do not want to encourage them.
After many hours of this, I tried a different tack: Firefox with Live Headers.
It shows a 404. Every time. Exactly as intended. But logs still show nothing but 200s.
What gives? Is the person at the other end receiving a 404, or aren't they? A human person will definitely see the 404 page at the original URL. But what will a robot get?
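One way to see what a non-browser client gets is to request the page from a script and read the status line directly. The following is only a rough sketch, not a definitive test: the function names status_code_from_line() and fetch_status() and the example.com URL are made up for illustration, and it assumes allow_url_fopen is enabled so that get_headers() can reach the test site.

```php
<?php
// Pull the numeric code out of a raw status line such as "HTTP/1.1 404 Not Found".
// Returns null if the line does not look like an HTTP status line.
function status_code_from_line($line)
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})\b#', $line, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Request $url and return the status code of the first response, or null on failure.
// get_headers() follows redirects, but element 0 is always the first status line received.
function fetch_status($url)
{
    $headers = @get_headers($url);
    if ($headers === false || !isset($headers[0])) {
        return null;
    }
    return status_code_from_line($headers[0]);
}

// Usage (hypothetical URL, substitute the real test site):
// var_dump(fetch_status('http://example.com/paintings/sparecats/any-old-garbage.html'));
```

If this reports the same code that Live Headers shows, then the client really is being sent that status, whatever ends up in the access log.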
The current version-- still on the test site-- wraps up like this. There is an earlier ob_start() so I don't have to put everything inside "echo..." statements.
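if ($done == 0)
{
    // No matching page: throw away the buffered output and send a 404 instead
    ob_end_clean();
    if (function_exists('http_response_code'))
    {
        http_response_code(404);   // php 5.4 and later
    }
    else
    {
        header($_SERVER['SERVER_PROTOCOL'] . " 404 Not Found");   // fallback for php 5.3
    }
    // The status header sends no body by itself, so pull in the error page by hand
    include ($_SERVER['DOCUMENT_ROOT'] . "/boilerplate/missing.html");
}
else
{
    // Page was found: send the buffered output as-is
    ob_end_flush();
}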
Is this right? It gives the desired results, and the page source comes out in the right order. But "it works" isn't necessarily the same as "it's correct".
In other news, I figured out that the reason normal SSIs stop working the moment there is any kind of php involvement is that ... drumroll ... it never occurred to me to add .php to the AddOutputFilter list. Oops. Ahem. All better now. For a while there I thought I'd have to maintain two parallel sets of footers, depending on whether the file passed through php along the way or not.
I think it only took me about two months to work this out.