IIS Config for 404 Handler

         

rogerd

1:59 am on Jul 22, 2002 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



My host seems to be having a bit of difficulty configuring IIS using the Custom Errors Tab feature. Or, maybe I'm having difficulty setting up my 404 handler to work with his setup.

I've got an ASP file that parses the URL to determine whether to load dynamic content or give the user a custom "page not found".

At first, (I think) he did a file redirect in IIS, which worked when the error page was static but failed to execute the ASP file. Then, (I think) he changed it to a URL redirect. That caused the script to execute, but also changed the URL in the browser address bar to the error page with a query string representing the original URL that caused the error. I can parse the query string instead of the URL to deliver the right content, but is there a way around the server sending it that way? Or can I change the URL in the address bar in my error page code?

rogerd

1:35 pm on Jul 22, 2002 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Further checking of IIS documentation suggests that delivering the original URL as a query string is standard when loading an ASP error page. Soooo, I guess the question is, how do I send the desired URL to the browser address bar (and/or the spider)? I looked for a "Response." control that would do that, similar to the Response.Status that sends the error code, but couldn't identify one from the guide I was using. Any thoughts?
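
For anyone following along, here is a minimal sketch of reading that query string inside the error page. It assumes IIS hands the handler something of the form "404;http://www.domain.com/Cars/Ford/Pinto" (status code, semicolon, original URL); check the exact format on your own server before relying on it. The variable names are illustrative only.

```vbscript
<%
' Sketch only: recover the originally requested URL in a URL-type 404 handler.
' Assumption: the query string looks like "404;http://www.domain.com/Cars/Ford/Pinto".
Dim rawQuery, semicolonPos, originalUrl

rawQuery = Request.ServerVariables("QUERY_STRING")
semicolonPos = InStr(rawQuery, ";")

If semicolonPos > 0 Then
    ' Everything after the first ";" is the URL that triggered the 404.
    originalUrl = Mid(rawQuery, semicolonPos + 1)
Else
    ' No status prefix found: fall back to the whole query string.
    originalUrl = rawQuery
End If
%>
```

From there, originalUrl can be parsed the same way the URL would have been parsed on a regular page.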

reinr1

2:47 pm on Sep 18, 2002 (gmt 0)



How does one set up a URL redirect for a 404 error instead of a file redirect? I want to do a similar thing with a ColdFusion dynamic error handling page, but the dynamic content does not get parsed when doing a file redirect.

Thanks,
Rich

Dreamquick

4:34 pm on Sep 18, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Under IIS4 & IIS5, if you have a dynamic 404 handler, you ONLY have the option to have the problem URL passed to your script via the querystring. IIS5.1 fixes a few problems but at its core works the same way.

I'll try to answer both questions, but for future reference it's a lot easier if both questions are on the same specific topic...

reinr1,

You configure a dynamic error handler by doing the following in the IIS Admin Manager (Web-based control panels will vary);

1) Right click the website you want to modify
2) Choose properties
3) Click the custom errors tab
4) Find the entry for 404
5) Click edit properties
6) Change the message type to URL (FILE is designed to work with static pages)
7) Insert your error handler URL into the appropriate box
8) OK your way out of the maze of forms

Congratulations you have now installed a dynamic error handler!

rogerd,

Assuming you have managed to install your error handler then you have two choices on how to change the page if required;

1) Simple approach - does the job but lacks a little finesse

Response.Redirect "/somepage.ext"

or

Response.Status = "302 Moved Temporarily"
Response.AddHeader "Location", "/somepage.ext"

These two are essentially the same - they issue a temporary redirect which tells the browser/spider to look at the new location, but in future to still come back to this page.

2) Complex approach - does the job but is a little more tricky

Response.Status = "301 Moved Permanently"
Response.AddHeader "Location", "/somepage.ext"

This issues a permanent redirect which tells the browser/spider to look at the new location, but in future to refer to the new location, updating its records in the process.

Number 2 works well with search engines for pointing old dead pages to new locations, whereas #1 is simpler but doesn't instruct search engines to update the location. Browsers should respond to 301 and 302 differently, but in practice they don't.

I've always run my 404 scripts as follows;
1) if source page has a redirect then issue the redirect
2) otherwise display a 404 page
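
A minimal sketch of that two-step flow, not the actual script: the redirect table and the page names in it are hypothetical, and it assumes IIS passes the original URL as "404;http://host/path" in the query string.

```vbscript
<%
' Sketch only: the redirect table and paths below are hypothetical.
Dim redirects, rawQuery, originalUrl, schemeEnd, pathStart, requestedPath

Set redirects = Server.CreateObject("Scripting.Dictionary")
redirects.CompareMode = 1   ' 1 = text compare, so keys match case-insensitively
redirects.Add "/oldpage.htm", "/newpage.asp"

' Assumption: the query string looks like "404;http://host/path".
rawQuery = Request.ServerVariables("QUERY_STRING")
originalUrl = Mid(rawQuery, InStr(rawQuery, ";") + 1)

' Find the first "/" after the "scheme://host" part; the path starts there.
schemeEnd = InStr(originalUrl, "://")
If schemeEnd > 0 Then
    pathStart = InStr(schemeEnd + 3, originalUrl, "/")
Else
    pathStart = InStr(originalUrl, "/")
End If

If pathStart > 0 Then
    requestedPath = Mid(originalUrl, pathStart)
Else
    requestedPath = "/"
End If

If redirects.Exists(requestedPath) Then
    ' 1) The source page has a redirect: issue it.
    Response.Status = "301 Moved Permanently"
    Response.AddHeader "Location", redirects(requestedPath)
    Response.End
Else
    ' 2) Otherwise send a real 404 status along with the error page content.
    Response.Status = "404 Not Found"
    Response.Write "<h1>Page not found</h1>"
End If
%>
```

Keeping the old-to-new mappings in one table means a moved page only needs a new entry, not a change to the handler logic.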

- Tony

rogerd

5:01 pm on Sep 18, 2002 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Thanks for the reply, DQ. I'm trying to use the error handler to turn requests for static pages into dynamic content. Hence, if a request comes in for "http://www.domain.com/Cars/Ford/Pinto", I'd like the browser or spider to NOT see a redirect, but rather get the content from "http://www.domain.com/display.asp?brand=Ford&model=Pinto"

I worked out all my parsing code on a regular page (substituting a "?" for the first "/", e.g., testcode.asp?Cars/Ford/Pinto) and all worked fine. Right content, no change in URL, server response headers OK, etc.

Porting it over to the error page, though, caused problems with the displayed URL and server response info. The URL displays as error.htm with a query string consisting of the originally requested URL.

A Server.Transfer might solve some problems, but I gather that method won't work with a query string. I'd greatly prefer to keep the query string, since I'm interfacing with commercial catalog/shopping cart software and I do not want to rewrite their code to compensate for the lack of query string variables.

I posted a concern about the handling of REAL errors (i.e., requests for URLS that don't exist and aren't parsable into a request for a dynamic page) a while back - these don't seem to get the proper status:

In frustration, I replaced my asp error page with a plain, old static "Ack!" type page of the same name. (The server is set up to process .htm pages as asp.) Just for the heck of it, I checked the server headers as well as the browser status code. The server header came back with "302 Object moved", with the URL ".../error.htm?http:/www.mydomain.com/nopage.htm" where nopage.htm is the nonexistent page I requested. Using SpiderSim to check the browser status code, I got a "200 OK" status, and the URL was shown as http:/www.mydomain.com/nopage.htm. Neither was the 404 one would like in this situation.

Is this standard IIS behavior? I'm concerned about duplicate content and/or my error page being indexed every time the spider requests a page that isn't there. Do Googlebot and other spiders look at the browser status, the server header, or both, to determine the status of the requested page?

Any suggestions on how to accomplish the transfer, and have spiders get a 200 for "good" (parsable) pages, a 404 for "bad" (nonexistent, non-parsable) pages, and keeping the address bar URL the same? TIA.

Dreamquick

6:33 pm on Sep 18, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Okay so I assume you are okay with the redirect/server.transfer stage...

Now, by this point we know that we do not have a good page to send the user to, so we simply issue

Response.Status = "404 Not Found"

and supply our normal 404 page content.

As you have noticed, sometimes requests for your custom error script will cause IIS to issue its own redirect, which takes the viewer to the script via a 302.

IIS versions <5.1 work like this;
1) A request which results in a 404 immediately gets given that 404 status code.
2) A request which results in a redirect ( 301 / 302 ) will first be 302 redirected by IIS to the error handler script which is then executed

In IIS5.1 the error handler script is always transparent and IIS never seems to redirect to it!

I use a similar script (I have quite an old version of a "smart 404" available for download from the site in my profile if you want to look at a different piece of sample code) and my site has PR3 at the moment so clearly the search engines are smart enough to figure out what is going on with the redirects.

If you are really afraid of duplicate content penalties when using server.transfer then simply put the real files in a directory excluded via robots.txt but which will not be part of the path used when they are viewed through server.transfer. That way the SEs are kept out of the "real" pages and only see the "virtual" pages and so only see one copy of the content.

ps.

Be aware that not all spiders are made equal. The simspider does not show you each step; it simply follows redirects without telling you. So if you ask for a page to which you assigned a redirect, it will just follow it and show you a status 200 rather than a 30X.

However if you attempt to access a page which does not exist either as a real or virtual page and do not get a 404 then something is broken.

- Tony

Dreamquick

6:51 pm on Sep 18, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Oops forgot to answer the last question... In theory you end up with something like this;

Can I do a server.transfer for the source page?
Yes, then server.transfer

Otherwise, can I redirect for the page?
Yes, then redirect

Otherwise, display 404 page.
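
As a rough sketch of that three-way decision (the "/cars/" rule, "/oldpage.htm" and "/newpage.asp" are made-up examples, and the "404;http://host/path" query string format is an assumption to verify on your server). Note that Server.Transfer needs IIS5 (ASP 3.0) or later.

```vbscript
<%
' Sketch only: the mapping rules and file names here are hypothetical.

' Returns the path part of the URL that caused the 404, e.g. "/Cars/Ford/Pinto".
' Assumption: IIS passes "404;http://host/path" as the query string.
Function RequestedPath()
    Dim url, schemeEnd, pathStart
    url = Request.ServerVariables("QUERY_STRING")
    url = Mid(url, InStr(url, ";") + 1)
    schemeEnd = InStr(url, "://")
    If schemeEnd > 0 Then
        pathStart = InStr(schemeEnd + 3, url, "/")
    Else
        pathStart = InStr(url, "/")
    End If
    If pathStart > 0 Then
        RequestedPath = Mid(url, pathStart)
    Else
        RequestedPath = "/"
    End If
End Function

Dim path
path = LCase(RequestedPath())

If Left(path, 6) = "/cars/" Then
    ' Virtual page: hand off on the server, with no redirect sent to the client.
    Server.Transfer "/display.asp"
ElseIf path = "/oldpage.htm" Then
    ' Moved page: tell browsers and spiders where it went.
    Response.Status = "301 Moved Permanently"
    Response.AddHeader "Location", "/newpage.asp"
    Response.End
Else
    ' Neither applies: a genuine 404.
    Response.Status = "404 Not Found"
    Response.Write "<h1>Page not found</h1>"
End If
%>
```

Server.Transfer keeps the Request collections, so the target page should still be able to read the same query string the handler received; it is worth confirming that on your IIS version.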

There is no easy way to get around the limitations of server.transfer with querystrings, although you could make some changes and have the source page look at its current pagename and extract data from that.

e.g.

"/a/aardvark" gets requested, it doesn't exist, so the 404 script server.transfers to "/dictionary.asp", which then looks up "aardvark" because that was the last word included in the URL (accessible via Request.ServerVariables("SCRIPT_NAME") )

rogerd

9:06 pm on Sep 18, 2002 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Thanks for the suggestions, Tony. I'll check on the web host's version of IIS, for starters. I appreciate your help; now it will take a bit of digesting and experimenting. Thanks!