
Forum Library, Charter, Moderators: Ocean10000 & incrediBILL & phranque

Apache Web Server Forum

    
Multiple ErrorDocuments based on request
Is it possible?
designaweb

10+ Year Member



 
Msg#: 3960803 posted 6:44 am on Jul 28, 2009 (gmt 0)

I'd like to use .htaccess to redirect requests that end up returning a 404 status. However, requests for bad .gif files should be redirected somewhere other than bad .jpg requests. Is this possible?

 

Caterham

5+ Year Member



 
Msg#: 3960803 posted 7:59 am on Jul 28, 2009 (gmt 0)

Use the ErrorDocument directive in conjunction with <Files> sections, like this:

ErrorDocument 404 /general.html
<Files *.gif>
ErrorDocument 404 /404.gif
</Files>
<Files *.jpg>
ErrorDocument 404 /404.jpg
</Files>

Always avoid regular expressions (FilesMatch) if possible.

designaweb

10+ Year Member



 
Msg#: 3960803 posted 8:47 am on Jul 28, 2009 (gmt 0)

Works like a charm, thanks! In the meantime I wrote a workaround, but this is faster me thinks?

ErrorDocument 404 /404.php

The 404.php file contained some PHP code that redirected to either 404.gif or 404.jpg, based on $_SERVER['REQUEST_URI']:

header("HTTP/1.0 404 Not Found");
header("Location: /404.gif");

Caterham

5+ Year Member



 
Msg#: 3960803 posted 9:02 am on Jul 28, 2009 (gmt 0)

header("HTTP/1.0 404 Not Found");

That doesn't work here. Your Location header will cause a 302 redirect to be sent to the client, not a 404 Not Found (and the client will have to make another HTTP request; yes, that is of course slower than pointing the request directly to the gif).

Btw:
header("Location: /404.gif");

is an RFC violation: the Location header must be an absolute URL (http://example.com/404.gif), not just a URL-path, though many clients support this anyway.

designaweb

10+ Year Member



 
Msg#: 3960803 posted 10:24 am on Jul 28, 2009 (gmt 0)

That doesn't work here. Your Location header will cause a 302 redirect to be sent to the client, not a 404 Not Found

Are you sure? I also use this:

if ($redirect_url) {
// Redirect to stripped URL
header("HTTP/1.1 301 Moved Permanently");
header("Location: ".$redirect_url."");
exit;
}

I just checked, and the headers returned by that script are:

HTTP/1.1 301 Moved Permanently
Date: Tue, 28 Jul 2009 10:21:02 GMT
Server: Apache/2.2.4 (Ubuntu) PHP/5.2.3-1ubuntu6.5 mod_ssl/2.2.4 OpenSSL/0.9.8e
X-Powered-By: PHP/5.2.3-1ubuntu6.5

So the

header("HTTP/1.1 301 Moved Permanently");

does seem to be working in that case, why wouldn't it in the 404 example? Your solution is more elegant and faster, but I am asking this to learn...

jdMorgan

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 3960803 posted 12:19 pm on Jul 28, 2009 (gmt 0)

Forcing the "Location" header in PHP will change the server response from your intended 404 to a 302 (redirect) as Caterham stated above.

Using a local URL-path instead of a canonical URL (http://x.y/path) is a violation of the HTTP protocol specification, but again, as Caterham stated above, *some* clients can accept this.

So, the question is whether you want to risk your site's proper operation with all clients (browsers, search robots, etc.) by returning the wrong server code along with an invalid URL. This is not a good plan for success...

Since you checked the headers on your 301 (which is fine for a 301, but not for a 4xx error response), go ahead and check your server response with your PHP 404 method. You'll see a 302, not a 404. Not good.

Jim
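For anyone who wants to run the check Jim suggests from PHP itself, here is a rough sketch, not a definitive tool. It assumes PHP 7.1 or later and that allow_url_fopen is enabled; the function names status_from_line and first_status are made up for this example. get_headers() returns the response headers as an array whose first element is the status line of the first response, so a redirect shows up as 301 or 302 there.

```php
<?php
// Hypothetical helper: extract the numeric status code from a status line
// such as "HTTP/1.1 302 Found". Returns null if the line does not look like one.
function status_from_line(string $statusLine): ?int
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})\b#', $statusLine, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Hypothetical helper: fetch the headers for a URL and report the status code
// of the first response. Needs allow_url_fopen; returns null on failure.
function first_status(string $url): ?int
{
    $headers = @get_headers($url);
    if ($headers === false) {
        return null;
    }
    return status_from_line($headers[0]);
}
```

Calling first_status() with the URL of an image that does not exist on your own site should then tell you whether the server answers with a 404 or with a redirect.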

Caterham

5+ Year Member



 
Msg#: 3960803 posted 12:38 pm on Jul 28, 2009 (gmt 0)

Yes, 301 is a redirect status code, but it's not a 404 status. So

header("HTTP/1.0 404 Not Found");
header("Location: /404.gif");

can't give you
HTTP/1.1 404 Not Found
Location: ....
(invalid)

but instead
HTTP/1.1 302 Found
Location: ....

So you're telling robots that your URL has moved to a new location, but you're not telling them that the URL was actually not found. (Human users usually don't care about the actual status code, only about the HTTP body, i.e. what's displayed in their browsers.) A human will see that it's a 404 because, I assume, you'll have a note in your 404.gif. But search engines (image search here) aren't told that the URL was not found; they're told the content of the URL was moved to some other URL.
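If a PHP error handler is wanted anyway, one way to keep the 404 status is to send the image body directly instead of a Location header. What follows is only a sketch under some assumptions: PHP 7.1 or later, 404.gif and 404.jpg sitting in the document root, and 404.php being set as the ErrorDocument as in the workaround above. The function names pick_error_image and send_error_image are invented for this example.

```php
<?php
// Hypothetical helper: choose the error image for a requested URL.
// The file names 404.gif and 404.jpg come from this thread; the rest is assumed.
function pick_error_image(string $requestUri): ?array
{
    // Look only at the path part, ignoring any query string.
    $path = (string) parse_url($requestUri, PHP_URL_PATH);
    $ext  = strtolower(pathinfo($path, PATHINFO_EXTENSION));

    $map = [
        'gif' => ['file' => '404.gif', 'type' => 'image/gif'],
        'jpg' => ['file' => '404.jpg', 'type' => 'image/jpeg'],
    ];
    return $map[$ext] ?? null;
}

// Hypothetical helper: answer with status 404 and the matching body.
// No Location header is sent, so the status is not turned into a 302.
function send_error_image(string $requestUri, string $docRoot): void
{
    http_response_code(404);
    $image = pick_error_image($requestUri);
    if ($image === null) {
        // Not a .gif or .jpg request: fall back to a plain HTML message.
        header('Content-Type: text/html; charset=utf-8');
        echo '<h1>404 Not Found</h1>';
        return;
    }
    header('Content-Type: ' . $image['type']);
    readfile($docRoot . '/' . $image['file']);
}

// Only run when served by the web server, not from the command line.
if (PHP_SAPI !== 'cli') {
    send_error_image($_SERVER['REQUEST_URI'], $_SERVER['DOCUMENT_ROOT']);
}
```

The <Files> approach given earlier in the thread is still simpler, since it needs no PHP at all; this only shows that the status code and the body can be sent in a single response.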

designaweb

10+ Year Member



 
Msg#: 3960803 posted 12:50 pm on Jul 28, 2009 (gmt 0)

OK, so if I understand things correctly, the 301 method I use is fine and results in expected behaviour as long as I use the full URL, but using the same method for 404 is not.

header("HTTP/1.1 301 Moved Permanently");
header("Location: ".$redirect_url."");

Thanks, I'll stick to the ErrorDocument method for 404's then while keeping my method for 301...
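To make "use the full URL" concrete, here is a small sketch of how the Location value could be built. It assumes PHP 7 or later; absolute_url and redirect_permanently are hypothetical names, and example.com is only a placeholder host. The third argument of header() sets the response code, so a separate status line is not needed.

```php
<?php
// Hypothetical helper: build an absolute URL from a host name and a URL-path.
function absolute_url(string $host, string $path, bool $https = false): string
{
    $scheme = $https ? 'https' : 'http';
    // ltrim avoids a double slash when the path already starts with "/".
    return $scheme . '://' . $host . '/' . ltrim($path, '/');
}

// Hypothetical helper: send a 301 with an absolute Location, then stop.
function redirect_permanently(string $url): void
{
    header('Location: ' . $url, true, 301);
    exit;
}

// Possible usage inside a script (placeholder values):
// redirect_permanently(absolute_url('example.com', '/new-page.html'));
```

Hard-coding the host name, or taking it from the server configuration, is safer than trusting $_SERVER['HTTP_HOST'], because that value is supplied by the client.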

designaweb

10+ Year Member



 
Msg#: 3960803 posted 12:53 pm on Jul 28, 2009 (gmt 0)

Using a local URL-path instead of a canonical URL (http://x.y/path) is a violation of the HTTP protocol specification

Can you give me an example of a client that does not accept this? I'd like to do some testing...

jdMorgan

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 3960803 posted 3:54 pm on Jul 28, 2009 (gmt 0)

No, I can't, but it's a violation of the protocol and I don't recommend risking your site's ranking and "technical quality score" on a method that violates the fundamental protocol of the Web.

You may want to keep an eye on your raw server access log and look for the occasional requests for relative URLs that result from the user-agents which are not able to resolve relative Location paths. I recall that several mobile user-agents have this problem, but I don't recall their names.

The big advantage of 'going by the book' on the various protocols (e.g. HTTP) and standards (e.g. robots.txt) is that if and when a major search engine 'bot breaks the rules, you can point to the problem and say, "That is undeniably a bug on your end." Plus, you won't get thrown out of the search results over some 'minor' problem that the search engines used to have a work-around for, but later removed (or unintentionally broke) while tweaking something else. We webmasters don't have access to the search engines' back-end code, so sticking to the standards is our best bet to avoid potentially-fatal search ranking problems.

As someone posted in a thread here about duplicate content, "Get it right or perish." That's the best title I've ever seen here at WebmasterWorld so far.

Jim

designaweb

10+ Year Member



 
Msg#: 3960803 posted 8:48 am on Jul 29, 2009 (gmt 0)

jdMorgan, no doubt, going by the book is best. Modifications have been made where appropriate; my question was asked for educational purposes :-)

Both of you, thanks for taking the time to answer as in-depth as you did.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved