
PHP Server Side Scripting Forum

    
404 header but no error page
Jump script that returns 404 but no visible page for visitor
Alex_TJ

5+ Year Member



 
Msg#: 4481909 posted 11:43 pm on Aug 4, 2012 (gmt 0)

I have a jump script for some PDF files, which checks if the destination includes our domain name. If it doesn't, it returns a 404.
I've found a couple of good threads here, especially this one [webmasterworld.com...] but I still can't get it to return our custom 404 page rather than just timing out with a blank page.
The code I'm using is

$w = isset($_GET['w']) ? $_GET['w'] : null;
if (strpos($w, "http://www.example.com") === false)
{ header("HTTP/1.0 404 Not Found");
include($_SERVER['DOCUMENT_ROOT'].'/404.htm');
die;
}

where w comes from jump.php?w=http://www.example.com/destination.htm
Does anyone see any problems in the code? Any ideas?
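
A side note on the check itself: strpos() only tests whether the string appears anywhere in w, so a value such as http://other.example/?x=http://www.example.com would pass it. Below is a minimal sketch of a stricter test, assuming the intent is "the destination's host must be our own". The function name is_own_destination() and its arguments are illustrative and not part of the posted script.

```php
<?php
// Hypothetical helper (not from the original post): accept a destination
// only when its host component equals our own host name.
function is_own_destination($url, $ownHost)
{
    if (!is_string($url) || $url === '') {
        return false; // missing or empty ?w= parameter
    }
    // parse_url() gives null when there is no host, false on malformed input.
    $host = parse_url($url, PHP_URL_HOST);
    return is_string($host) && strcasecmp($host, $ownHost) === 0;
}
```

With this check a relative value like /folder/destination.pdf is rejected too, so whether that is desirable depends on how the jump links are written.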

 

phranque

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributor of the Month



 
Msg#: 4481909 posted 12:47 am on Aug 5, 2012 (gmt 0)

no clues in the access or error logs?
have you tried running the script from the command line?

Alex_TJ

5+ Year Member



 
Msg#: 4481909 posted 1:01 am on Aug 5, 2012 (gmt 0)

I'm not quite good enough on the technical side to understand how or why to run it from the command line!
From some searches it seems a few other people have this problem as well but there's never a decent solution that seems to work - the best I've come up with so far is the code above.

rainborick

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 4481909 posted 4:09 am on Aug 5, 2012 (gmt 0)

Try:

$w = isset($_GET['w']) ? $_GET['w'] : null;
if (strpos($w, "http://www.example.com") === false)
{
header("Location: /404.htm", TRUE, 404);
echo file_get_contents($_SERVER['DOCUMENT_ROOT'] . '/404.htm');
die;
}

g1smd

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4481909 posted 6:21 am on Aug 5, 2012 (gmt 0)

You don't want the server to be fetching the file via HTTP from itself.

Using INCLUDE was the correct thing to do. I assume the PATH details for the include were incorrect.

Alex_TJ

5+ Year Member



 
Msg#: 4481909 posted 2:49 pm on Aug 5, 2012 (gmt 0)

rain thanks for the suggestion, but I'm afraid it gives the same result, with a blank page.

g1smd the error page is at www.example.com/404.htm so I think the path's ok.

phranque, here's an example from the log. It seems to show a 200, though I got a 404 through Live HTTP Headers. I've been tweaking this so much that I hope this entry isn't from one of my experiments:

173.000.000.000 - - [05/Aug/2012:00:39:08 +0200] "GET /jump.php?w=/folder/destination.pdf HTTP/1.0" 200 - www.example.com "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; OfficeLiveConnector.1.3; OfficeLivePatch.0.0)" "93.000.000.000"

Does anything look off? The same happens in FF BTW.

rainborick

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 4481909 posted 5:07 pm on Aug 5, 2012 (gmt 0)

Try commenting out the call to header() so that just the 404 error page should be echoed and you can see if /404.htm is being found. If that still fails, then I'd suspect the URI (ie. path name + file name) is incorrect, or the file isn't on the server and you may have to hard-code the path to /404.htm instead of relying on $_SERVER['DOCUMENT_ROOT'].
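
The path check described above can also be done in code rather than by trial and error. This is a rough sketch only: find_error_page() is a made-up helper name, and it assumes the error page is a plain file under the document root.

```php
<?php
// Hypothetical helper: build the filesystem path to the error page and
// confirm PHP can actually read it before relying on include().
function find_error_page($docRoot, $relativePath)
{
    $path = rtrim($docRoot, '/') . '/' . ltrim($relativePath, '/');
    return is_readable($path) ? $path : null;
}

// Usage inside the jump script (assumes the page lives at /404.htm):
// $page = find_error_page($_SERVER['DOCUMENT_ROOT'], '/404.htm');
// if ($page === null) { error_log('404 page not readable'); }
```

A null result points at the path or file permissions; a non-null result means the blank page has some other cause.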

Alex_TJ

5+ Year Member



 
Msg#: 4481909 posted 5:30 pm on Aug 5, 2012 (gmt 0)

rain - tried that out just now, and the 404 page shows, so at least that's all good.
There's something about or around 'header("HTTP/1.0 404 Not Found")' that's throwing the spanner in the works.

rainborick

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 4481909 posted 1:55 am on Aug 6, 2012 (gmt 0)

OK, if you're going to persist with the call to header('HTTP/1.0...'), try:

header("HTTP/1.0 404 Not Found", TRUE);

Adding the 'replace' parameter set to TRUE may/should replace any previous instance of that header that would ordinarily be sent by PHP.

On the whole, though, I'd suggest you use header("Location:") as I posted. It's working code I took from one of my sites. It sends the 404 response code and shows the error page to the user. If it doesn't work, check the server error_log to see if there are any errors or warnings from PHP.

phranque

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributor of the Month



 
Msg#: 4481909 posted 3:02 am on Aug 6, 2012 (gmt 0)

the Location: header is irrelevant in a 404 response.

http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.30
Location
The Location response-header field is used to redirect the recipient to a location other than the Request-URI for completion of the request or identification of a new resource. For 201 (Created) responses, the Location is that of the new resource which was created by the request. For 3xx responses, the location SHOULD indicate the server's preferred URI for automatic redirection to the resource.


nothing about 4xx responses there...

http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.5
404 Not Found
The server has not found anything matching the Request-URI.


you don't want to provide a location in response to a 404 - the location of the custom error page is irrelevant to the user agent or the visitor.

lucy24

WebmasterWorld Senior Member, Top Contributor of All Time, Top Contributor of the Month



 
Msg#: 4481909 posted 4:26 am on Aug 6, 2012 (gmt 0)

Alex_TJ wrote:
here's an example from the log, which seems to be 200

I realize you had to edit the example to get rid of real IPs and domain names. But is it really 200 followed by a null filesize? That's your initial "timing out with a blank page".

Note that the part about returning your custom 404 page is a red herring. Unless you've made a separate and unrelated booboo, a true 404 response (or any other error) will automatically result in showing the appropriate error page. The logs will then say 404 followed by a second number reflecting the actual filesize of the 404 page. So don't even think about that custom page. Just focus on getting your 404 response.

rlange



 
Msg#: 4481909 posted 3:13 pm on Aug 6, 2012 (gmt 0)

Alex_TJ wrote:
rain - tried that out just now, and the 404 page shows, so at least that's all good.
There's something about or around 'header("HTTP/1.0 404 Not Found")' that's throwing the spanner in the works.

That almost sounds like the "headers already sent" error. Is there any previous code setting headers and maybe using flush() afterwards?

--
Ryan
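
One way to test the "headers already sent" theory is PHP's built-in headers_sent(), which can also report where output began. A small sketch follows; the wrapper name header_state() is only for illustration.

```php
<?php
// Hypothetical diagnostic: report whether output has already started,
// and if so where, using PHP's built-in headers_sent().
function header_state()
{
    $file = '';
    $line = 0;
    $sent = headers_sent($file, $line); // fills $file/$line once output has begun
    return array('sent' => $sent, 'file' => $file, 'line' => $line);
}

// Usage just before the header() call in the jump script:
// $state = header_state();
// if ($state['sent']) {
//     error_log("Output already started at {$state['file']}:{$state['line']}");
// }
```

If 'sent' is true at that point, the reported file and line are the first place to look, for example stray whitespace or an echo before the opening PHP tag.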

swa66

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4481909 posted 4:35 pm on Aug 6, 2012 (gmt 0)

[php.net...] lists two special-case methods for sending a 404 return code, but fails to give a practical, useful example.

[php.net...] seems to be much cleaner, but ... requires PHP >= 5.4.0 .

Well trying the examples on a test server with simple scripts:

<?php
header("HTTP/1.0 404 Not Found");
exit();
?>

Yields a 404 in the logs but a blank page in the browser, and the error directive in Apache does not kick in.


<?php
http_response_code(404);
exit();
?>

Works on PHP 5.4.0 and greater only, so it's not easy for me to test as I'm not on 5.4.x.


<?php
header("Status: 404 Not Found");
exit();
?>

Gives a 200 response, and a blank page.


<?php
header("HTTP/1.0: 404 Not Found");
exit();
?>

Gives a 404 in the logs but a blank page, and the Apache error page does not kick in.


<?php
header($_SERVER["SERVER_PROTOCOL"]." 404 Not Found");
exit();
?>

This one is to make sure the HTTP/1.1 protocol isn't going to cause trouble. It gives a 404 in the logs but a blank page, and the Apache error page does not kick in.


<?php
header($_SERVER["SERVER_PROTOCOL"]." 404 Not Found");
header("Status: 404 Not Found");
exit();
?>


Same as above - but it's clear my server needs the first.

To make sure all 3 methods are tried and to output a friendly message:


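<?php
// http_response_code() exists only in PHP >= 5.4.0; otherwise fall back to raw headers.
if (!function_exists('http_response_code')) {
    // Raw status line, using whatever protocol version the request came in on.
    header($_SERVER["SERVER_PROTOCOL"]." 404 Not Found");
    // CGI-style Status header, for setups that expect that form instead.
    header("Status: 404 Not Found");
} else {
    http_response_code(404);
}
// Send the error page body along with the 404 status.
include ($_SERVER['DOCUMENT_ROOT']."/404.html");
?>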


Caveat: I've not tried it on PHP 5.4.0 and that's an important test.

Maybe a bit of sanity filtering on the use of $_SERVER would be good, especially as running it as-is on the command line now makes it go bonkers.

I'd love to see if anybody knows how to trigger the apache ErrorDocument action from within php.

Alex_TJ

5+ Year Member



 
Msg#: 4481909 posted 6:33 pm on Aug 6, 2012 (gmt 0)

rlange - nothing else except a few echos on that page, straight into html after.
swa - great post, thanks very much for trying that all out. As Lucy mentioned, the error directive should kick in, but doesn't.
If I'm following your post correctly - nothing so far is going to work until my host upgrades to 5.4?
Alex

100 posts! :)

g1smd

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4481909 posted 7:16 pm on Aug 6, 2012 (gmt 0)

The 404 page won't "kick in" when you send a 404 header from a PHP script.
You have to INCLUDE() the error page HTML content for people to see.

Without the HEADER directive sending the 404 response, the status as seen in the response would be 200 OK because the request has already been fulfilled by an internal file.

You need to use the Live HTTP Headers for Firefox extension to check the responses, not rely on the server logs. They only report Apache generated status codes, not those from within any PHP scripts.
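
The same check can be scripted instead of done in a browser extension. The sketch below assumes PHP's get_headers() is allowed to make outbound requests on the machine running it. The helper name status_from_line() and the URL in the usage comment are placeholders.

```php
<?php
// Hypothetical helper: pull the numeric status code out of an HTTP status
// line such as "HTTP/1.1 404 Not Found"; returns null if it does not match.
function status_from_line($statusLine)
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})\b#', $statusLine, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Usage (needs network access; the URL is a placeholder):
// $headers = get_headers('http://www.example.com/jump.php?w=/folder/destination.pdf');
// if ($headers !== false) { echo status_from_line($headers[0]); }
```

The first element returned by get_headers() is the status line as the client receives it, which is the value under discussion here.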

Alex_TJ

5+ Year Member



 
Msg#: 4481909 posted 7:47 pm on Aug 6, 2012 (gmt 0)

g1smd you genius - I'm on the road for a few days but I'll definitely try it out and report back. It seems so obvious, but!

lucy24

WebmasterWorld Senior Member, Top Contributor of All Time, Top Contributor of the Month



 
Msg#: 4481909 posted 10:49 pm on Aug 6, 2012 (gmt 0)

g1smd wrote:
The 404 page won't "kick in" when you send a 404 header from a PHP script.
You have to INCLUDE() the error page HTML content for people to see.

Without the HEADER directive sending the 404 response, the status as seen in the response would be 200 OK because the request has already been fulfilled by an internal file.

Is that really what you want to do? (I mean "you" generically.) Show the user the content of the 404 page, as if they'd come by and requested "www.example.com/boilerplate/404.html" and duly received it, with attendant 200 response?

I have to ask this because it seems so contrary to what I've normally been told.

g1smd

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4481909 posted 1:43 am on Aug 7, 2012 (gmt 0)

You put a file on your server and requesting it returns HTTP "200 OK" status and whatever HTML content the file "contains" (for .html files) or whatever HTML content the file "generates" (for .php files).

Even if the file contains no content and generates no content the returned status will still be "200 OK", because a file was found in the server filesystem to fulfill the HTTP request. (You'd only get a server-generated 404 status if a file was NOT found in the filesystem.)

When the file contains PHP scripts to generate content based on the requested URL and there is no content in there for the current requested URL, the script must return a 404 status by using the HEADER directive.

The script should also send some sort of error message to the user to inform them as to what is happening. The usual way to do this is to INCLUDE() the contents of the 404 error message file.

The HEADER directive is vital here. Without it, the user would have a HTTP "200 OK" response returned along with an HTML page saying "404 Not Found". The information in the HTTP header is what searchengines need to see. Returning "200 OK" would be a major disaster.
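
The two steps described above (send the 404 status, then send the error page body) could be sketched roughly as follows. Both function names are invented for illustration, and the fallback message is an assumption for the case where the error page file cannot be read.

```php
<?php
// Hypothetical helper: decide the status and body for a "not found" reply.
// Falls back to a short built-in message when the error page is unreadable.
function not_found_response($errorPagePath)
{
    $body = (is_file($errorPagePath) && is_readable($errorPagePath))
        ? file_get_contents($errorPagePath)
        : '<h1>404 Not Found</h1>';
    return array('status' => 404, 'body' => $body);
}

// Hypothetical emitter: send the status first, then the body.
function emit_response(array $response)
{
    if (function_exists('http_response_code')) {
        http_response_code($response['status']); // PHP >= 5.4.0
    } else {
        header('HTTP/1.0 ' . $response['status'] . ' Not Found');
    }
    echo $response['body'];
}

// Usage in the jump script (path is the one used in this thread):
// emit_response(not_found_response($_SERVER['DOCUMENT_ROOT'] . '/404.htm'));
// exit;
```

Note that file_get_contents() returns the file as-is, so this variant suits a static .htm page; an error page containing PHP would need include() instead.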

Alex_TJ

5+ Year Member



 
Msg#: 4481909 posted 6:43 pm on Aug 15, 2012 (gmt 0)

I'm back home and tried out g1smd's idea, but I'm still getting the same blank page problem.
Here's the absolute most basic code, which still gives a blank page, but at least sends the right header code:
<?php
$w = isset($_GET['w']) ? $_GET['w'] : null;
if (strpos($w, "http://www.example.com") === false)
{ header("HTTP/1.0 404 Not Found");
echo "oops";
die;
}
?>

Any other ideas? Thanks for any help!

swa66

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4481909 posted 7:47 am on Aug 16, 2012 (gmt 0)

Did you try to use
header("Status: 404 Not Found");
?
See [php.net...]: which one is right depends on how PHP is integrated in the web server. I'd suggest using BOTH.

Alex_TJ

5+ Year Member



 
Msg#: 4481909 posted 6:57 pm on Aug 16, 2012 (gmt 0)

swa: tried that and the same blank page I'm afraid. Surely there must be some way to include or echo something in php after calling a 404 header?

swa66

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4481909 posted 7:25 am on Aug 17, 2012 (gmt 0)

Problem is that what you have in post 4485169 works for me (cut&pasted, no change whatsoever):

- As long as I do not add something like ?w=http://www.example.com as a parameter to the URL, it does create a 404 result with "oops" as content.
- With that parameter it gives a 200 result that is a blank page.

It all depends if what you have in the referenced post is your real test case.
If it is not: try it for yourself (without other things in the php).
If it is, then the answer isn't going to come on its own without digging deeper, and it's unlikely others will find it for you.

You'll have to make sure the php is evaluated, dig further in access and error logs, run it command line, look at versions, ...

Alex_TJ

5+ Year Member



 
Msg#: 4481909 posted 4:56 pm on Aug 17, 2012 (gmt 0)

swa, appreciated.
There's only the normal html after this on the page, with a meta refresh as it's a jump page. I don't think that would cause any problems:
<META HTTP-EQUIV=Refresh CONTENT="1; URL=<? echo $w; ?>">

Even stripped down to the code in msg:4485169 it's not working out. It's possible there's something wrong with my htaccess, as even ordinary made-up URLs that should return a 404 stop doing so soon after I start fiddling with this, though normally there's no problem there.
Command lines and php aren't my strong point so looks like I should get professional help on this one. Thanks guys!
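
Regarding the meta refresh line quoted above: it prints $w straight into an HTML attribute, and the short <? tag only works when short_open_tag is enabled. A hedged sketch of an escaped version follows; meta_refresh_tag() is a hypothetical helper, not code from the original page.

```php
<?php
// Hypothetical helper: build the meta refresh tag with the destination
// escaped for use inside an HTML attribute.
function meta_refresh_tag($url, $delaySeconds = 1)
{
    $safeUrl = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
    return '<meta http-equiv="refresh" content="' . (int) $delaySeconds
        . '; url=' . $safeUrl . '">';
}

// Usage in the page head:
// echo meta_refresh_tag($w);
```

This does not explain the blank 404 page, but it keeps a crafted w value from breaking out of the attribute.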
