Forum Moderators: phranque

Servers Display Incorrect Error Message for a 403 Situation

Server Generates Wrong 403 Error Message

         

GoldenEagles

3:49 am on Aug 29, 2007 (gmt 0)

10+ Year Member



Dear Forum Members,

The servers used by my webhosting company display the wrong error message under a 403 situation, and I cannot find the right words to help the tech support people understand that there is a problem at all. I am hoping that there is someone here who can give me a clue as to how this could occur, so I can pass it on to them, and get the problem solved.

This is the situation.

I am using the .htaccess file to block some nuisance IPs. No problem with that; the IPs get blocked.

However, the server does not give the blocked IP the standard Apache default 403 error message, which I believe is something like "Forbidden: You do not have permission to access this page".

Instead, the server gives the blocked IP this error message, which I will abbreviate: DIRECTORY HAS NO INDEX FILE, etc. I can block my own IP and that is the error message I get, no matter what page or directory I request.

If someone could please tell me which Apache configuration file would have to be altered in order to generate this (erroneous) error message in place of a default 403 Forbidden error message, I would appreciate it. I could pass that on to the tech support people to point them in the right direction so the problem can be fixed.

(By the way, there are no custom error pages defined in my .htaccess file.)

Thank you for your help,
Sincerely,
GoldenEagles

jdMorgan

12:29 pm on Aug 29, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It sounds like your 403 ErrorDocument is defined by the host, is defined as an index page, and is missing. Further, it sounds like you've got Directory Indexes enabled, but no directory at that URL.

This is not a standard Apache message, and so I suspect they may be using a script to generate ErrorDocuments.

I should note that the text of error messages matters only to humans; it is the server status code that matters to search engine robots and browser software.

Use the "Live HTTP Headers" extension to Firefox/Mozilla browsers to check the status codes returned by your server for all "error" conditions, including 400, 403, 404, 410, 500, and 302 and 301 redirects.

If your host is 'clueless', they may have just bought some standard packaged server setup from someone; I'd be looking to host with a major hosting principal instead: someone who owns their own server farm and has people on staff who know how the servers are configured. It certainly should not be up to you to explain your hosting setup to your hosting provider!

If your server headers don't check out properly, and if you cannot get any guidance from your host, consider installing your own ErrorDocuments so as to 'take over' from their error handling setup (if possible). I'd suggest defining just one ErrorDocument and testing it before committing to do all of them; it may not be possible on this host to serve your own error pages, and there's no use wasting a lot of time if this is the case.

In .htaccess:


Options -Indexes
ErrorDocument 403 /local-URL-path-to-403-error-document.html
#
# Flag requests for the 403 page itself and for robots.txt so they are always allowed
SetEnvIf Request_URI "^/local-URL-path-to-403-error-document\.html$" Allow_it
SetEnvIf Request_URI "/robots\.txt$" Allow_it
#
# Deny the test IP, but let the flagged requests through
Order Deny,Allow
Deny from 403.test.IP.here
Allow from env=Allow_it
#

Note that in both cases, the URL is a local URL-path; it must NOT contain the http://example.com protocol and domain prefix. The purpose of the SetEnvIf code is to be sure that ALL clients are allowed to fetch the 403 error document; if this is not implemented, then a 403 will cause another 403, and another, leading to an even worse situation than the one you report. We also need to allow all clients to fetch robots.txt, so we can ban them without doubt or remorse if they violate it.

Note that in .htaccess, you can have only one "Order" directive, unless those directives are made mutually-exclusive by enclosing them in containers such as <Files> or <Limit> that guarantee only one Order directive will be applied to a given HTTP request. If not, then the last one in the file that applies to the conditions of the current request will be applied.
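
As a rough sketch of that container rule (not taken from any real configuration): the address 192.0.2.10 is only a placeholder, and the syntax assumes the same Order/Deny/Allow access-control directives used in the example above. The second Order directive sits inside a <Files> container, so it takes part only in requests for robots.txt and leaves every other request to the first one.

```apache
# Policy for everything in this directory tree.
# 192.0.2.10 is a placeholder address, not a real offender.
Order Deny,Allow
Deny from 192.0.2.10

# Separate policy that is considered only for requests to robots.txt.
<Files "robots.txt">
    Order Allow,Deny
    Allow from all
</Files>
```

For a request to robots.txt, the settings inside the container are the ones that end up being applied; for anything else, only the first block is in play.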

Jim

GoldenEagles

9:35 pm on Aug 29, 2007 (gmt 0)

10+ Year Member



Dear Jim,

Actually, the web hosting service is a large one, though for the time being I will refrain from publishing their name. When I signed up, it was located in the Los Angeles area (I believe it was San Gabriel), the people who owned it actually ran it, and I never had any support issue come up that left me with my jaw on the ground. But they were bought out by a larger company, and ever since, the tech support situation has deteriorated dramatically. The Visual Route display to my hosting service now points to a place in Massachusetts. It sounds like the clientele from the original business was simply transferred to a larger server farm belonging to a much larger company, and that customers are forced to deal with a lower echelon of tech support people who are no longer on site and are not physically in touch with the servers or systems they are supporting. They must send any technical issue up the line, and what we get back goes through the filter of this lower echelon, who do not sound like they are actually trained on Apache servers. Anyway, that is what it feels like to me.

The price we pay for hosting could not possibly support the hiring of Harvard graduate Computer Science majors to man their technical support team.

In regard to the directory index issue: I have -Indexes defined in my .htaccess file, and when you actually try to browse a directory that has no index file, you get the same message, DIRECTORY HAS NO INDEX, which is proper for that situation.

But of course, there is something wrong when a blocked IP gets that same "DIRECTORY HAS NO INDEX" error page.

In regards to overcoming this problem by defining custom error pages in the .htaccess file, yes, this can be done, but that generates another problem, and that is, that all 403 codes in my access logs are changed to 302 redirects, where the server is redirecting the user to the custom error page, which I understand is appropriate behavior on the part of the Apache server. In other words, changing the 403 codes in the access logs to 302 is what the Apache server is designed to do. But I need the 403 codes intact in the access logs so I can track and analyze the behavior of previously blocked IPs. So, while it is feasible to define a custom error page for the 403 situation, I cannot do that because I lose the 403 codes in my access logs.

You say, "Use the "Live HTTP Headers" extension to Firefox/Mozilla browsers to check the status codes returned by your server for all "error" conditions, including 400, 403, 404, 410, 500, and 302 and 301 redirects."

That sounds interesting, and it is something I would like to follow up on. Am I correct that this is a function associated with Firefox/Mozilla browsers, and not Microsoft IE? I ask because a search of the IE help did not return any hits on that subject.

Thanks for your time.
Sincerely,
GoldenEagles

jdMorgan

9:52 pm on Aug 29, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"In regards to overcoming this problem by defining custom error pages in the .htaccess file, yes, this can be done, but that generates another problem, and that is, that all 403 codes in my access logs are changed to 302 redirects..."

That is only the case if you ignore what I typed above:

"Note that in both cases, the URL is a local URL-path; it must NOT contain the http://example.com protocol and domain prefix."

It is a documented behaviour of Apache that if you use a full URL in ErrorDocument, the server must (and will) generate a redirect. If you use a local path, as specified in the documentation, this won't happen.
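
To make the difference concrete, here is a minimal sketch of the two forms. The path /errors/403.html is a made-up example, not something from your site.

```apache
# Local URL-path: Apache serves the page internally, and the client
# (and the access log entry for the original request) still sees 403.
ErrorDocument 403 /errors/403.html

# Full URL: Apache has to tell the client to go and fetch that URL,
# so the original request is answered with a redirect (302) instead.
# ErrorDocument 403 http://example.com/errors/403.html
```

Only one ErrorDocument 403 line should be active at a time; the second form is left commented out here purely for comparison.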

IE does not support the extension I mentioned. I recommend that you use Firefox or a Mozilla-based browser for most Webmaster work; they have tons of Webmaster and SEO extensions, and they're safer to boot, since ActiveX is not supported...

Jim

[edited by: jdMorgan at 9:54 pm (utc) on Aug. 29, 2007]

GoldenEagles

1:45 am on Sep 1, 2007 (gmt 0)

10+ Year Member



Dear Jim,

In the back and forth I have gone through with my webhosting service, at the outset they tried to solve the problem using a custom-defined 403 error document in my .htaccess file. If I remember correctly, they used only a local path, but the 403 error codes disappeared from my access logs anyway; they all changed to 302. That is what I remembered, so I dug in my heels and asked them to stop altering my .htaccess file. That is the reason for my statement to you, made out of a sense of certainty that I was correct on that point. However, I will make it a point to recheck that, to test it out, and I will return here to report what I find, just so there is no question about it.

My main focus however, is to work through the problem with the erroneous error message at the system level first. If they can be persuaded to generate an appropriate 403 error message, I would prefer to go with that, rather than to define a custom page of my own, at least at this point.

What they are doing is delivering exactly the same error message in two different error situations. Error situation No. 1 is the case where the webmaster has blocked the browsing of directories, as with the "-Indexes" option in your example above. Error situation No. 2 is the case where a particular IP has been blocked (deny from nnn.nnn.nnn). In both of these situations, which are totally different, their servers deliver the same error message, which is as follows (on the actual HTML page the text is centered).

==== Start Error Quote =====

Directory has no index file.
Browsing this site or directory without an index file is prohibited.
If you are the site's webmaster, you can remedy this problem by creating a default HTML page with one of the following names:

index.html
index.htm
default.htm
Default.htm
home.html
Home.chtml
NOTE: Filenames are case sensitive, i.e., Home.html is not the same as home.html

===== End Error Quote ======

This is fine for the situation in which directory browsing is blocked. But it makes no sense to use it in a situation where the IP has been banned. So, my first priority is to learn what I need to say to them, so they can go in there and generate an appropriate custom message for the 403 situations.

Sincerely,
GoldenEagles

jdMorgan

2:19 am on Sep 1, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well, either situation, the -Indexes option with no extant index file, or "any old" 403 caused by IP address-, hostname-, HTTP User-Agent-, or HTTP Referer-based access denial, creates a 403 status. The 403 status causes the server to look in your .htaccess file(s) for an ErrorDocument 403 declaration. Finding none, it will look at conf.d and then httpd.conf. If there is no ErrorDocument declared there, it will serve the default error document that is hard-coded into Apache.

However, the main point is that the server does not distinguish between -indexes 403s and these 'other' 403s.

Therefore, it is apparent that your hosts have chosen to "dumb down" the 403 error page, because they don't expect any of their customers to implement any of the other access controls as described above. They have chosen to output this "helpful" error message that unfortunately becomes inappropriate the instant that a Webmaster implements additional access controls other than -indexes.

I'd recommend that they stick with a message based on the meanings of the HTTP status response codes defined in the HTTP/1.1 protocol [w3.org] specification, and add their "helpful" text to that as "a possible reason" if they feel that they must do so.

However, keeping in mind that many 403s occur as the result of your response to troublemakers attempting to abuse your server, the policy I recommend is to provide absolutely no technical information; just output a message that says that access has been denied, we apologize for the inconvenience, and here's a text link to our home page. There is no use presenting a would-be abuser with a laundry list of all the reasons he may have been denied access, or with information about the specific trap that caught him. Why help him?
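
A minimal sketch of such a deliberately uninformative response, assuming an Apache 2.x server where ErrorDocument accepts a quoted text message (older 1.3 servers use a slightly different quoting convention, so check the documentation for your version):

```apache
# Generic 403 text: says nothing about which rule denied the request.
ErrorDocument 403 "Access denied. We apologize for the inconvenience."
```

To include the text link to the home page, it is probably simpler to point ErrorDocument at a small static HTML file using a local URL-path, as described earlier in this thread.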

Jim

GoldenEagles

5:40 am on Sep 2, 2007 (gmt 0)

10+ Year Member



Dear Jim,

Thank you for that explanation. That is very interesting.

So, under the hood, you are saying that the "-Indexes" declaration produces a generic 403 (Forbidden) status whenever anyone tries to browse a directory that has no index file.

And the server does not differentiate that 403 condition, from the 403 condition produced when a blocked IP tries to gain access.

And so, when the administrators declare a custom error page to cover the directory browsing condition, which is quite informative for that particular situation I suppose, the server will use that same custom error page in all 403 situations.

And you are sure that there is no way to get around that? Isn't there some kind of environment variable that a script could look at, to tell the difference between the two situations?

What do you think?

Sincerely,
GoldenEagles

jdMorgan

1:28 pm on Sep 2, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



That's correct.

I'm not sure why you don't just declare a custom 403 error page and be done with this.

Jim

GoldenEagles

6:29 am on Sep 5, 2007 (gmt 0)

10+ Year Member



Dear Jim,

To answer your question, the world becomes a better place faster if, when you run across a problem, you try to see what you can do to fix it while you have it before you. If you pass up the opportunity to try to fix a problem, chances are you will never come back that way again, and many people who come after you will stumble over the same stone in the road.

So, if there is some kind of environment variable that a script could access to help tell the difference between these two situations I would like to know about it, so I could pass that info on to the tech support people.

If not, perhaps this is something that should be fixed in the next version of the Apache server. Are there people working on new versions?

Sincerely,
GoldenEagles

jdMorgan

2:24 pm on Sep 5, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This problem is all of your host's doing, as I explained above. They could easily fix the problem by getting rid of their "helpful-but-insufficiently-scoped" error message, and using the standard error message.

This, like "Windows Help and Support", is a well-intentioned solution gone awry: the error message is dumbed down to be helpful, but dumbed down too far, to the point where it is inaccurate in many other possible error cases.

Modifying Apache cannot address this; if a separate error is to be returned for server-specific problems, then the HTTP protocol itself would need to be expanded.

If they should really desire to report specific errors, they could write a script and pass all requests through that script prior to the Apache content-handling API phase. The script could check for missing requested directory indexes and output a customized error message. The downside would be greatly-reduced performance, the addition of further complexity to the server, and the requirement to re-modify, re-compile, and test their customized Apache installation whenever Apache was upgraded.

Your desire to fix the problem for all users of this hosting service is admirable; it's not the typical "Help, I need a fix right now" that we usually get. However, in most cases, hosts are completely unresponsive to customer requests for customization or global server configuration changes, so point fixes are usually required. If your host is different, that's great!

Jim

GoldenEagles

8:00 pm on Sep 14, 2007 (gmt 0)

10+ Year Member



If I define a custom 403 page in my .htaccess (making sure to use local paths to retain the integrity of the return codes in the access logs) wouldn't I run into the same basic problem associated with the fact that the server will still not differentiate between these two 403 situations?

In other words, if I specify a custom 403 page that makes it clear to the person that his (or her) IP has been blocked, then the same page will be delivered to anyone who tries to browse a directory that has no index file (-Indexes). Isn't that right?

It seems to come around full circle to the simplest approach that was already suggested: delivering the default "Forbidden: Access Denied" message (through a custom .htaccess definition) appears to be the best choice, since it covers both situations.

Sincerely,
GoldenEagles

jdMorgan

10:55 pm on Sep 14, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



In fact the server does not differentiate between the many possible reasons that a 403-Forbidden response might be invoked. Remember, a server's job is to serve files as quickly and efficiently as possible. Therefore, the code is kept lean and simple -- You would be astounded to compare the filesize of Apache to, say, Windows -- or even to Word 2007. Apache is a small, comparatively-simple program.

In the case of 403-Forbidden responses, there's another angle at work here: even though I have the option to use mod_rewrite code, SSI, Perl, and PHP to differentiate between the thousands of reasons I might deny access based on user-agent, IP address, remote hostname, GeoIP, frequency of access, user-privilege level, etc., and return a very specific reason for access denial in the content-body of the 403 response, I would never do so. Why? Because some of the people seeing that response are malicious; I do not wish to assist them in copying/hacking/abusing my site by telling them what I do and do not check for.

Jim