
New To Web Development Forum

Detecting spiders

 1:35 am on Nov 1, 2002 (gmt 0)

I've read in numerous posts that people know where, when, how, and by whom their sites are being spidered. How does one find and monitor this information?



 4:07 am on Nov 1, 2002 (gmt 0)


By reading the server logs for the domain in question. Most decent web hosting packages provide raw logs (a line-by-line list of every request made by a browser or spider for a document, image, script, etc.) along with some sort of log analysis tool that takes those raw logs and crunches them into more usable forms, such as "List of browsers used by visitors", "List of referers", etc.
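To make "raw logs" a little more concrete, here is a minimal Python sketch of what a log analysis tool does at its core: split each line into fields and tally the user-agent strings. It assumes the server writes the common "combined" log layout; your host's format may differ, in which case the pattern needs adjusting. The names `LOG_PATTERN`, `parse_line`, `count_agents`, and the sample lines (including "ExampleBot") are made up for illustration.

```python
import re
from collections import Counter

# Assumed layout (combined log format):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def count_agents(lines):
    """Tally requests per user-agent string, skipping unparseable lines."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry is not None:
            counts[entry["agent"]] += 1
    return counts


# Made-up sample lines, for illustration only.
SAMPLE = [
    '192.0.2.10 - - [01/Nov/2002:04:07:00 +0000] "GET /index.html HTTP/1.0" '
    '200 5120 "-" "ExampleBot/1.0"',
    '192.0.2.55 - - [01/Nov/2002:04:08:12 +0000] "GET /logo.gif HTTP/1.1" '
    '200 2048 "http://example.com/" "Mozilla/4.0 (compatible; MSIE 6.0)"',
    '192.0.2.10 - - [01/Nov/2002:04:09:30 +0000] "GET /about.html HTTP/1.0" '
    '200 3072 "-" "ExampleBot/1.0"',
]

if __name__ == "__main__":
    for agent, hits in count_agents(SAMPLE).most_common():
        print(hits, agent)
```

The resulting tally is essentially the "List of browsers used by visitors" report; spiders show up in it as user-agent strings that are not ordinary browsers.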

Sites hosted on "freebie" accounts like GeoCities usually have no logs available. An alternative is to add a link on your page(s) to a remote "hit counting" and logging service. These hit counters will log basic summary information about visitors to your site. Some are free, and some charge a monthly fee. An example would be webstats (do a search for many more). There may be privacy issues involved for both you and your visitors - I would urge you to read the Terms of Service thoroughly before signing up with one of these services.



 11:14 pm on Nov 1, 2002 (gmt 0)

OK, I've got the server logs. I see the IPs but don't know how to tell who they belong to.


 11:28 pm on Nov 1, 2002 (gmt 0)

There's a good resource over on Search Engine World, WebmasterWorld's sister site:

Also, see the Spider Knowledge Base:


 11:49 pm on Nov 1, 2002 (gmt 0)

Great, I've got it. But is there a way to automate searching these large daily log files to find and report on spider activity? The web stats package available to me (DeepMetrix LiveStats) seems to report on every kind of stat imaginable except crawling.
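One possible way to automate this, sketched in Python, is to scan each log line for known spider names in the user-agent field and summarise the hits. This is only a rough sketch under some assumptions: the log uses the combined format, the short `SPIDER_TOKENS` list is illustrative rather than complete (extend it from a spider list you trust), and `spider_report` and `format_report` are hypothetical helper names. Since user-agent strings can be faked, treat the output as a first pass rather than proof of who visited.

```python
import re
from collections import defaultdict

# Illustrative, incomplete list of lowercase substrings seen in spider user agents.
SPIDER_TOKENS = ("googlebot", "slurp", "ia_archiver", "scooter")

# Assumed combined log format; captures client host, requested path, user agent.
LINE_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[[^\]]+\] "\S+ (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def spider_report(lines, tokens=SPIDER_TOKENS):
    """Map each matched token to its hit count, client hosts, and paths."""
    report = defaultdict(lambda: {"hits": 0, "hosts": set(), "paths": set()})
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        agent = match.group("agent").lower()
        for token in tokens:
            if token in agent:
                entry = report[token]
                entry["hits"] += 1
                entry["hosts"].add(match.group("host"))
                entry["paths"].add(match.group("path"))
                break  # count each request once
    return dict(report)


def format_report(report):
    """Render one summary line per spider, sorted by name."""
    rows = []
    for token in sorted(report):
        entry = report[token]
        rows.append(
            f"{token}: {entry['hits']} hits from {len(entry['hosts'])} host(s), "
            f"{len(entry['paths'])} distinct path(s)"
        )
    return "\n".join(rows)
```

To run it against a daily log, open the file and pass it in directly, for example `print(format_report(spider_report(open("access.log"))))`, where `access.log` stands in for whatever your host names the file. Reading line by line keeps memory use low even for large logs.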



All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved