Is there a tutorial somewhere I could read to help me understand this and work out how much of a problem it is, if at all?
[20/Apr/2004:17:41:48 +1000]//this is the date and time that the page was requested
"GET /html/home.htm HTTP/1.0" 200 1789 //this is the page requested on your site
"http://www.example.com/" //this is the referrer - the page that the user came from
"Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)" //this is the browser they were using
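If you would rather script this than read it by eye, here is a minimal Python sketch that splits one log line into the parts described above. It assumes the server writes the usual "combined" log format; the field names in the pattern and the 192.0.2.1 address are made up for illustration and are not from anyone's real logs.

```python
import re

# Pattern for a "combined" format log line.
# The group names are illustrative choices, not an official naming scheme.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# Sample line built from the pieces quoted above, with a placeholder IP.
sample = (
    '192.0.2.1 - - [20/Apr/2004:17:41:48 +1000] '
    '"GET /html/home.htm HTTP/1.0" 200 1789 '
    '"http://www.example.com/" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"'
)

fields = parse_line(sample)
print(fields["time"])      # date and time of the request
print(fields["request"])   # method, page and protocol
print(fields["referrer"])  # the page the user came from
print(fields["agent"])     # the browser they were using
```

Lines that do not fit the pattern come back as None, so odd entries can be skipped rather than crashing the script.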
My raw logs are pretty big too, but I downloaded this little program called Textwiz from somewhere (I forget where). It counts instances of a specific phrase within a text doc, which is useful for seeing, for instance, how many times a particular page was a referrer.
Helen.
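The same kind of phrase counting can be roughed out in a few lines of Python. This is only a sketch and not how Textwiz itself works; the sample lines and the "access.log" file name mentioned in the comment are made up, and the referrer tally assumes the combined log format, where the referrer is the second quoted field.

```python
from collections import Counter


def count_phrase(lines, phrase):
    """Total number of times phrase appears across all lines."""
    return sum(line.count(phrase) for line in lines)


def referrer_counts(lines):
    """Tally the referrer field of each line (combined log format assumed)."""
    counts = Counter()
    for line in lines:
        parts = line.split('"')
        # parts[1] is the request, parts[3] the referrer, parts[5] the browser
        if len(parts) >= 6:
            counts[parts[3]] += 1
    return counts


# Made-up sample lines; for a real log you could pass open("access.log") instead.
sample_lines = [
    '192.0.2.1 - - [20/Apr/2004:17:41:48 +1000] "GET /html/home.htm HTTP/1.0" '
    '200 1789 "http://www.example.com/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"',
    '192.0.2.2 - - [20/Apr/2004:17:42:10 +1000] "GET /html/about.htm HTTP/1.1" '
    '200 2048 "http://www.example.com/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"',
    '192.0.2.3 - - [20/Apr/2004:17:43:02 +1000] "GET / HTTP/1.0" '
    '200 2572 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"',
]

print(count_phrase(sample_lines, "http://www.example.com/"))
print(referrer_counts(sample_lines).most_common())
```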
You might find this thread on how to track visitors [webmasterworld.com] an interesting read if you haven't seen it already.
"GET /html/home.htm HTTP/1.0" 200 1789 //this is the page requested on your site
To be technical, only the "/html. . . htm" part is the actual page/file requested (and its path). The rest of the line provides other details about the visitor's request (and the results):
"GET" is the method ("GET" is by far the most common, but there is also "POST" [often used for form input] and "HEAD" [used by spiders checking links, metainformation...])
"HTTP/1.0" is the version of the HTTP protocol the visitor used to make the request (you should also see plenty of 1.1, its "successor")
"200" is the "server status code". It tells you what sort of response the server made to the user's request.
These can be roughly summarized thus:
1xx -informational (rare! not used at all in HTTP/1.0)
2xx -successful ["200" should be the most common code you see]
3xx -redirect, 4xx -client error, 5xx -server error
(see [helpwithpcs.com] for a fuller list)
"1789" is the number of bytes [size of the file] the user received (since a "HEAD" request [and sometimes others] downloads no bytes, the field will then read "-" instead)
"Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)" //this is the browser they were using
Any decent tracking analysis software will interpret this for you, but it does help to understand it and be able, at least occasionally, to look directly at the logs.
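If you do end up looking at the logs directly, the status code and byte count are easy to interpret in a script. A small Python sketch, assuming you already have the two fields as text; the function names are just ones picked for this example.

```python
# Rough summary of the status code classes listed above.
STATUS_CLASSES = {
    1: "informational",
    2: "successful",
    3: "redirect",
    4: "client error",
    5: "server error",
}


def status_class(code):
    """Map a status code such as "200" or 404 to its general class."""
    return STATUS_CLASSES.get(int(code) // 100, "unknown")


def bytes_sent(field):
    """Turn the size field into a number; "-" means no bytes were sent."""
    return 0 if field == "-" else int(field)


print(status_class("200"), bytes_sent("1789"))
print(status_class("404"), bytes_sent("-"))
```

Treating "-" as zero keeps bandwidth totals simple, although you may prefer to keep it separate if you want to tell HEAD requests apart from empty responses.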
I miss those unix commands - is there an emulator for Windows?
Well, there are a variety of solutions, from virtual machines that let you run Unix from within Windows (that lets you have the complexity of Unix with the instability of Windows) to Windows ports of many shell commands. I'm pretty sure you can even get emacs.
There's a sourceforge project called unxutils (note the spelling) that has grep, gawk, sed, sh, zsh and more.
Morten Jorgensen (sp?) has an excellent grep tool as well. Try searching for "grep v2.4" or "grep windows".
There are other grep-like tools, from expensive ones that I have not tried (powergrep and the like) to freebies like BKReplacem (odd but very powerful once you get to know it).
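If none of those are to hand, a bare-bones grep substitute is only a few lines of Python. This is a sketch of the basic idea (print lines matching a pattern), not a replacement for any of the tools above; the sample lines are made up.

```python
import re


def grep(pattern, lines, ignore_case=False):
    """Return the lines that contain a match for the regular expression."""
    flags = re.IGNORECASE if ignore_case else 0
    regex = re.compile(pattern, flags)
    return [line for line in lines if regex.search(line)]


# Made-up fragments of log lines for illustration.
lines = [
    '"GET /html/home.htm HTTP/1.0" 200 1789',
    '"HEAD /html/home.htm HTTP/1.1" 200 -',
    '"GET /missing.htm HTTP/1.1" 404 312',
]

# All requests that ended in a 404.
for line in grep(r'" 404 ', lines):
    print(line)

# All HEAD requests, ignoring case.
for line in grep(r'^"head ', lines, ignore_case=True):
    print(line)
```

To run it over a real log, pass an open file object in place of the list; the function only needs something it can loop over line by line.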
I found this string...
(insert IP address here) - - [09/May/2004:08:41:15 -0500] "GET / HTTP/1.0" 200 2572 "-" "Mozilla/4.0 (compatible; MSIE 6.0; AOL 7.0; Windows NT 5.1; FunWebProducts; .NET CLR 1.1.4322)"
Can anyone tell what it was they accessed?
Usually after "GET" it tells me the file they hit, this time nothing....
Also, usually after / HTTP/ it says 1.1"
This time it says 1.0?
What would cause that?
I am just trying to understand, this is all too interesting to me! :-)
Any ideas?
(insert IP address here) - - [09/May/2004:08:41:15 -0500] "GET / HTTP/1.0" 200 2572 "-" "Mozilla/4.0 (compatible; MSIE 6.0; AOL 7.0; Windows NT 5.1; FunWebProducts; .NET CLR 1.1.4322)"
Can anyone tell what it was they accessed?
Usually after "GET" it tells me the file they hit, this time nothing....
Also, usually after / HTTP/ it says 1.1"
This time it says 1.0?
"GET / HTTP/1.0" is the request the user-agent sent. Here it is a GET request (as opposed to, say, a POST to a script from a form) for /, i.e. the client requested plain [example.org...].
The HTTP/1.0 is the client advertising the protocol version it supports. Most modern web browsers use HTTP/1.1. Most bots use HTTP/1.0. Most proxies are still only HTTP/1.0 (Squid comes to mind). So a modern browser behind a Squid proxy would show up as HTTP/1.0. Note the user-agent string includes AOL 7.0, so I assume the user is behind AOL's proxy servers.
Jon.
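To see how the HTTP/1.0 and HTTP/1.1 requests split in your own logs, the request field can be broken into its three parts and tallied. A rough Python sketch, assuming the request strings have already been pulled out of the log lines; the sample requests (including the /cgi-bin/form.pl path) are invented for the example.

```python
from collections import Counter


def split_request(request):
    """Split 'GET / HTTP/1.0' into (method, path, protocol), or return None."""
    parts = request.split()
    if len(parts) != 3:
        return None  # malformed or empty request such as "-"
    return tuple(parts)


# Invented request strings for illustration.
requests = [
    "GET / HTTP/1.0",
    "GET /html/home.htm HTTP/1.1",
    "POST /cgi-bin/form.pl HTTP/1.1",
    "-",
]

protocol_counts = Counter()
for request in requests:
    parsed = split_request(request)
    if parsed:
        method, path, protocol = parsed
        protocol_counts[protocol] += 1

print(split_request("GET / HTTP/1.0"))
print(protocol_counts.most_common())
```

A path of just "/" is the site root, which is why nothing that looks like a file name appears after GET in the line quoted above.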
So what you're saying is they hit my front page?
[****xx.com...] ?
'Cause my counter doesn't show any hits.