


Detecting spiders

   
1:35 am on Nov 1, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I've read in numerous places that people know where, when, how, and by whom their sites are being spidered. How does one find and monitor this information?
4:07 am on Nov 1, 2002 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Jon_King,

By reading the server logs for the domain in question. Most decent web hosting packages provide raw logs (a line-by-line record of every document, image, script, etc. requested by a browser or spider) and also some sort of log analysis tool that crunches those raw logs into more usable forms, such as "List of browsers used by visitors", "List of referrers", etc.
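
For illustration, here is a rough sketch (the Python script and the "access.log" file name are just examples, not part of any particular hosting package) of what you can pull out of a raw log in the common Apache "combined" format - each line carries the requester's IP address and, in the last quoted field, its user-agent string, which is how most spiders identify themselves:

  # Sketch: pull the IP and user-agent out of each line of a "combined" log.
  # "access.log" is a placeholder path - use whatever file your host provides.
  import re

  LOG_LINE = re.compile(
      r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
      r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
      r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
  )

  with open("access.log") as logfile:
      for line in logfile:
          match = LOG_LINE.match(line)
          if match:
              print(match.group("ip"), "-", match.group("agent"))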

Sites hosted on "freebie" accounts like GeoCities usually have no logs available. An alternative is to install a link on your page(s) to a remote "hit counting" and logging service. These hit counters log basic summary information about visitors to your site. Some are free, and some charge a monthly fee. An example would be webstats (do a search for many more). There may be privacy issues involved for both you and your visitors - I would urge you to read the Terms of Service thoroughly before signing up with one of these services.

HTH,
Jim

11:14 pm on Nov 1, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



OK, I've got the server logs. I can see the IPs, but I don't know how to tell who they are.
11:28 pm on Nov 1, 2002 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



There's a good resource over on Search Engine World, WebmasterWorld's sister site:
[searchengineworld.com...]

Also, see the Spider Knowledge Base:
[webmasterworld.com...]
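
A quick check you can run on any individual address (a rough sketch - the IP below is only an example, substitute one from your own logs) is a reverse DNS lookup; most of the big crawlers resolve to a hostname that names their owner, e.g. Google's crawler comes from googlebot.com hosts:

  # Sketch: reverse DNS lookup for an IP address taken from the logs.
  # The address below is purely a placeholder.
  import socket

  ip = "66.249.66.1"
  try:
      hostname, _, _ = socket.gethostbyaddr(ip)
      print(ip, "resolves to", hostname)
  except socket.herror:
      print(ip, "has no reverse DNS entry")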

11:49 pm on Nov 1, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Great, I've got it. But is there a way to automate searching these large daily log files so as to find and report on spider activity? The web stats package available to me (DeepMetrix LiveStats) seems to report on every other kind of stat imaginable, but not on crawling.

-Jon
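
One way to automate that (a rough sketch - the spider names, the "access.log" path, and the combined-format assumption are all illustrative, and this is not a LiveStats feature) is to scan each log line's user-agent field for known crawler names and tally the hits per spider:

  # Sketch: count requests per known spider in a combined-format access log.
  # The spider list and "access.log" are placeholders - extend as needed.
  import re
  from collections import Counter

  SPIDERS = ("googlebot", "slurp", "msnbot", "teoma", "scooter")
  AGENT_FIELD = re.compile(r'"([^"]*)"\s*$')   # last quoted field = user agent

  hits = Counter()
  with open("access.log") as logfile:
      for line in logfile:
          match = AGENT_FIELD.search(line)
          if not match:
              continue
          agent = match.group(1).lower()
          for spider in SPIDERS:
              if spider in agent:
                  hits[spider] += 1

  for spider, count in hits.most_common():
      print("%s: %d requests" % (spider, count))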