Deciphering log files

tracking bots

         

Mr_Busby

1:56 am on Aug 15, 2003 (gmt 0)

10+ Year Member



I've just discovered I can access my log files (wow!), so I can see when bots visit my site. But I have no idea what all the text means, so I don't know what they're doing when they visit.

Can anyone point me to a resource which explains, or give me an idea of what the text can tell me about their activity, for example:

crawler14.googlebot.com - - [31/Jul/2003:23:03:08 +1000] "GET /robots.txt HTTP/1.0" 404 286 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)" "-"
crawler14.googlebot.com - - [31/Jul/2003:23:03:11 +1000] "GET / HTTP/1.0" 200 10512 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)" "-"

I have no idea what all this means, but I'm very interested to learn. Thanks.

fiestagirl

4:53 am on Aug 15, 2003 (gmt 0)

10+ Year Member



Well, it goes something like this:
An agent from an IP address that resolves to crawler14.googlebot.com visited at 23:03:08 on 31/Jul/2003 (the +1000 is the time zone offset from GMT) and sent a GET request for a page called /robots.txt, which returned status 404, meaning not found. (Apparently you don't have a robots.txt on that site.) The user agent was Googlebot/2.1 (+http://www.googlebot.com/bot.html) and the referrer was empty, which is what the "-" represents.

Same with the subsequent visit three seconds later, except that it asked for your home page (/) and the server returned status 200, a-okay.
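
If you'd rather pull those fields out with a script than read them by eye, here is a rough Python sketch. It assumes the log is in Apache's "combined" format, which is what your sample lines look like. The names LOG_PATTERN and parse_line are made up for this example, and the extra "-" at the end of each of your lines is simply ignored, since I can't tell what your server writes there.

```python
import re

# Assumed layout (Apache "combined" style):
# host ident user [time] "method path protocol" status size "referrer" "agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)  # match() tolerates extra trailing fields
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # Some servers log "-" instead of a number when no body was sent.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


sample = (
    'crawler14.googlebot.com - - [31/Jul/2003:23:03:08 +1000] '
    '"GET /robots.txt HTTP/1.0" 404 286 "-" '
    '"Googlebot/2.1 (+http://www.googlebot.com/bot.html)" "-"'
)

entry = parse_line(sample)
for field in ("host", "time", "path", "status", "size", "referrer", "agent"):
    print(field, "=", entry[field])
```

Run it over each line of the log file and you can filter on the agent or host fields to see only the bot visits.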

MonkeeSage

5:03 am on Aug 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



P.S. The number right after each return code (404 and 200) is the size of the data that was sent back, in bytes: 286 bytes for the error (404) and 10512 bytes for the success (200).

Jordan

DavidT

4:50 pm on Aug 15, 2003 (gmt 0)

10+ Year Member



While we are at it:

If a search engine spider like Google's Mediapartners attaches "?sfgdata=4" to a page it requests, can anyone recognise what it might mean? I don't use any query strings like that.