Identifying 404 errors in logs.

How are 404's documented in the logs?

         

Broadway

12:20 pm on May 4, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I've just migrated a site to a new server and I checked my site's statistics and have found that a high percentage of "Code 404 - Not found" errors are listed.

I downloaded one day's log and searched for the term "404". I thought I was going to find that I forgot to upload one entire directory and the missing pages would be easy to identify.

It didn't turn out that way; nothing was painfully obvious, since the string "404" appears in my logs for a variety of reasons (as part of IP addresses, etc.). How are 404 errors documented in a Windows server log? What position of the "404" in a line would indicate a "404 - Not Found" error for that page?

jdMorgan

1:21 pm on May 4, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Server log formats vary between server types, hosting providers, and even hosting plans.

Search for " 404 " -- That is, <space> 404 <space>. That will eliminate false matches on IP addresses and such.
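As a rough illustration of that space-delimited search, here is a minimal Python sketch. The function name `lines_with_status` and the sample log lines are made up for this example; they are not taken from any real log.

```python
def lines_with_status(lines, status="404"):
    """Return the lines that contain the status code with a space on each side."""
    needle = f" {status} "
    return [line for line in lines if needle in line]


# Hypothetical sample lines: the first has "404" inside the URL and the byte
# count only, the second has it as a separate space-delimited field.
sample = [
    '192.0.2.10 - - [04/May/2007:06:02:30 -0600] "GET /item404.html HTTP/1.0" 200 14043 "-" "ExampleAgent"',
    '192.0.2.10 - - [04/May/2007:06:02:31 -0600] "GET /missing.html HTTP/1.0" 404 573 "-" "ExampleAgent"',
]

matches = lines_with_status(sample)
for line in matches:
    print(line)
```

This is only a quick filter: a response that happens to be exactly 404 bytes long would also match, so it is still worth confirming which column holds the status code.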

Comparing several lines, with both successful (200-OK or 304-Not Modified) and unsuccessful (404-Not Found or 410-Gone) requests, should point you to the proper column.

Here are two typical Apache log lines. The first indicates success, while the second is a 404:

62.**.213.196 - - [04/May/2007:06:02:30 -0600] "GET /score.xls HTTP/1.0" 200 131411 "http://www.example.com/spreadsheet" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.0.11) Gecko/20070312 Firefox/1.5.0.11"

62.**.213.196 - - [04/May/2007:06:02:30 -0600] "GET /score.xls HTTP/1.0" 404 573 "http://www.example.com/spreadsheet" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.0.11) Gecko/20070312 Firefox/1.5.0.11"

In order, the fields shown here are:

    Client IP address or hostname
    Client identd (Remote user ID)
    Remote user (login name on your server)
    Time
    First line of client request header:
    . HTTP Method (GET, HEAD, POST, etc.)
    . Requested local URL-path.
    . Request Protocol
    Server response code (200, 404, etc.)
    Server response size (in bytes)
    HTTP referrer
    Client User-agent string (Firefox 1.5.0.11 on Win XP with British English language preference shown here)
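The field order listed above can be turned into a small parser. What follows is a sketch only, assuming the log uses this Apache "combined" layout with the timestamp and time zone inside a single pair of square brackets. The regular expression, the names `COMBINED`, `parse_line`, and `not_found_paths`, and the sample lines are all illustrative; a server with a custom log format would need a different pattern.

```python
import re

# Assumed pattern for the field order listed above (Apache "combined" layout).
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of named fields, or None if the line does not fit the pattern."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None


def not_found_paths(lines):
    """Collect the requested URL-paths of every line whose status field is 404."""
    paths = []
    for line in lines:
        fields = parse_line(line)
        if fields and fields["status"] == "404":
            paths.append(fields["path"])
    return paths


# Hypothetical sample lines in the same layout.
sample = [
    '192.0.2.10 - - [04/May/2007:06:02:30 -0600] "GET /score.xls HTTP/1.0" 200 131411 "http://www.example.com/spreadsheet" "Mozilla/5.0 (Windows; U)"',
    '192.0.2.10 - - [04/May/2007:06:02:31 -0600] "GET /missing.html HTTP/1.0" 404 573 "http://www.example.com/spreadsheet" "Mozilla/5.0 (Windows; U)"',
]

print(not_found_paths(sample))
```

Reading the status from its own named field avoids the false matches that a plain text search for "404" can produce.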

Windows servers also typically show response times, but as I said above, log formats vary quite a bit.

Jim

cgrantski

3:53 pm on May 4, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Sounds like you have IIS logs. Look for the field "sc-status". I believe this field is not turned on by default - you will have to go into the site's properties, to the logging area, and find the list of logged elements. The second tab in that screen is where this is (I think).
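If the logs are in the W3C extended format, each file carries a "#Fields:" header line naming the space-separated columns, so the position of sc-status can be read from the file instead of guessed. The sketch below assumes that format and assumes both sc-status and cs-uri-stem are among the logged fields. The function name `iis_not_found` and the sample lines are invented for illustration.

```python
def iis_not_found(lines):
    """Return cs-uri-stem values for rows whose sc-status column is 404."""
    fields = []
    missing = []
    for raw in lines:
        line = raw.strip()
        if line.startswith("#Fields:"):
            # The header names the columns for the rows that follow it.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or "sc-status" not in fields:
            continue
        row = dict(zip(fields, line.split()))
        if row.get("sc-status") == "404":
            missing.append(row.get("cs-uri-stem"))
    return missing


# Hypothetical sample log with a reduced set of fields.
sample = [
    "#Version: 1.0",
    "#Fields: date time c-ip cs-method cs-uri-stem sc-status",
    "2007-05-04 06:02:30 192.0.2.10 GET /score.xls 200",
    "2007-05-04 06:02:31 192.0.2.10 GET /missing.html 404",
]

print(iis_not_found(sample))
```

If sc-status is not being logged, the function returns an empty list, which is a hint to check the logging settings described above.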

Broadway

8:21 pm on May 4, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Thanks to both of you. I did the search for other status codes like you suggested (304 , 404 , etc...) and found the common location in each line. And yes, it turned out that this information was held in the sc-status field. Thanks again.