Raw logs not recording full URI

         

Karma

1:22 pm on Oct 20, 2009 (gmt 0)

10+ Year Member



Hi,

I'm looking through my logs and see that the exact URI has not been recorded. For example, if Google requests the www. version of a URL, I 301-redirect it to the non-www. version, but all I see in the log for that request is:

GET / HTTP/1.1 301

When I'd really like to see:

GET [mysite.tld...] HTTP/1.1 301

Is this possible?

jdMorgan

2:43 pm on Oct 20, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You could use mod_log_config, and prepend the HTTP "Host" header value (available as the %{Host}i format string) to the logged request line sent by the browser.

Remember, HTTP clients (browsers, robots) don't request "mysite.tld/page.abc"; they send a request for "/page.abc" to the IP address returned by a DNS lookup of "mysite.tld". In HTTP/1.1 (but not in true HTTP/1.0), the client also sends a separate HTTP request header containing the hostname, in this case "Host: mysite.tld". HTTP/1.1 requires that header on every request, but the server only needs it to choose a site when it is a name-based virtual host; otherwise, it is redundant with the fact that the request has already arrived at this server at this IP address. On name-based virtual hosts, the Host: header is used to 'sort out' which of the virtual hosts at this IP address should service the request.
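
A minimal Python sketch of that split, for illustration only. The helper name build_request is hypothetical, and a real client would send more headers than these; the point is just that the hostname travels in the Host header and never appears in the request line.

```python
def build_request(host: str, path: str = "/") -> str:
    """Return the text of a minimal HTTP/1.1 GET request (illustrative helper)."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, URL-path, protocol only
        f"Host: {host}",         # the hostname is sent in its own header
        "Connection: close",
        "",                      # blank line ends the header block
        "",
    ]
    return "\r\n".join(lines)


request = build_request("mysite.tld", "/page.abc")
request_line = request.split("\r\n")[0]
print(request_line)  # GET /page.abc HTTP/1.1
```

Only the first line of that text is what a default access log records as the request.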

So what you see logged in your raw server access log is the actual request line sent by the client: the HTTP method (GET, HEAD, PUT, etc.), the URL-path, and the HTTP protocol version, as in your example log line above.
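
As a rough sketch of what that means for log analysis, the Python below splits a logged request line into its three parts and rebuilds it with a hostname prepended. The names parse_request_line and with_host are made up for this example, and the hostname has to come from somewhere else (such as a logged Host header), because the request line itself does not contain it.

```python
from typing import NamedTuple


class RequestLine(NamedTuple):
    method: str
    path: str
    protocol: str


def parse_request_line(line: str) -> RequestLine:
    """Split a logged request line such as 'GET / HTTP/1.1' into three parts."""
    method, path, protocol = line.strip().strip('"').split(" ", 2)
    return RequestLine(method, path, protocol)


def with_host(line: str, host: str) -> str:
    """Rebuild the request line with a separately known hostname prepended."""
    parts = parse_request_line(line)
    return f"{parts.method} {host}{parts.path} {parts.protocol}"


parsed = parse_request_line("GET / HTTP/1.1")
rebuilt = with_host("GET / HTTP/1.1", "www.mysite.tld")  # assumed hostname
print(parsed.path)  # /
print(rebuilt)      # GET www.mysite.tld/ HTTP/1.1
```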

If you'd like to see HTTP transactions in action, try the "Live HTTP Headers" add-on for Firefox/Mozilla browsers.

Jim

Karma

3:43 pm on Oct 20, 2009 (gmt 0)

10+ Year Member



OK, after some investigation it turns out I don't have access to the server config with my host (shared server, separate IP).

The problem I'm having is that the Google 'crawler' keeps requesting (what I see as) the same page, but each request returns a different response code or file size:

64.22.143.239 - 09/Oct/2009:04:27:42 - /widget/blue/ - 301 235
64.22.143.239 - 09/Oct/2009:04:27:42 - /widget/blue/ - 200 185
64.22.143.239 - 09/Oct/2009:04:27:42 - /widget/blue/ - 200 47325
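
One way to make entries like these easier to compare is to group them by client, timestamp and path. The Python below is only a sketch: it assumes the "ip - time - path - status size" layout shown in the three lines above, and the function names are invented for the example.

```python
from collections import defaultdict

# The three entries quoted above, in the same layout.
LOG_LINES = """\
64.22.143.239 - 09/Oct/2009:04:27:42 - /widget/blue/ - 301 235
64.22.143.239 - 09/Oct/2009:04:27:42 - /widget/blue/ - 200 185
64.22.143.239 - 09/Oct/2009:04:27:42 - /widget/blue/ - 200 47325
""".splitlines()


def parse_line(line):
    """Split one 'ip - time - path - status size' entry into typed fields."""
    ip, timestamp, path, tail = line.split(" - ")
    status, size = tail.split()
    return ip, timestamp, path, int(status), int(size)


def group_responses(lines):
    """Collect (status, size) pairs per (ip, timestamp, path) key."""
    groups = defaultdict(list)
    for line in lines:
        ip, timestamp, path, status, size = parse_line(line)
        groups[(ip, timestamp, path)].append((status, size))
    return dict(groups)


groups = group_responses(LOG_LINES)
for key, responses in groups.items():
    print(key, responses)
```

Because the hostname is not part of these entries, requests for different hostnames that share a path end up under the same key, which is consistent with one path showing several status codes and sizes.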