Hopefully this list will help people understand what tools are being used, which may provide some insight into a spider's purpose, and serve as a useful quick reference guide in the future.
Let me kick off this list with a few entries of my own:
Example: "Jakarta Commons-HttpClient/3.0.1"
An HTTP client library for Java; see Apache.org for more details.
Example: "Snoopy v1.2.3"
Snoopy appears to be a PHP class, with one version on SourceForge (see the snoopy project), and it's definitely included in WordPress.
Example: "Wget/1.10.2 (Red Hat modified)"
"GNU wget (wget) is a freely available network utility to retrieve files from the World Wide Web, using HTTP (Hyper Text Transfer Protocol) and FTP (File Transfer Protocol), the two most widely used Internet protocols."
This is a general-purpose library for retrieving HTTP documents, typically used by Perl scripts running on Linux servers. This particular library is often associated with hacking attempts and botnet attacks [webmasterworld.com] and should generally be blocked.
curl or libcurl-agent
Example: "curl/7.15.5 (i686-redhat-linux-gnu) libcurl/7.15.5 OpenSSL/0.9.8b zlib/1.2.3 libidn/0.6.5"
There are several variations on this user agent, but they all relate to cURL, a command-line tool for transferring files with URL syntax that is often called from scripts on Linux servers.
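For anyone who wants to check log entries against the entries so far, here is a minimal Python sketch. The table and the function name identify_tool are made up for illustration; the substrings come only from the example user agents above, and real-world strings vary, so treat it as a starting point rather than a reliable detector.

```python
# Hypothetical lookup table: substrings taken from the example user agents
# listed above, mapped to a short description of the tool behind them.
KNOWN_TOOLS = [
    ("Jakarta Commons-HttpClient", "Jakarta Commons HttpClient"),
    ("Snoopy", "Snoopy (PHP class)"),
    ("Wget", "GNU wget"),
    ("curl", "cURL/libcurl"),
]


def identify_tool(user_agent):
    """Return the description of the first known tool whose token appears
    in the user agent string (case-insensitive), or None if none match."""
    ua = user_agent.lower()
    for token, description in KNOWN_TOOLS:
        if token.lower() in ua:
            return description
    return None


if __name__ == "__main__":
    for ua in (
        "Jakarta Commons-HttpClient/3.0.1",
        "Snoopy v1.2.3",
        "Wget/1.10.2 (Red Hat modified)",
        "curl/7.15.5 (i686-redhat-linux-gnu) libcurl/7.15.5",
    ):
        print(ua, "->", identify_tool(ua))
```

Because the match is a plain case-insensitive substring test, variations such as libcurl-agent strings are also reported as cURL/libcurl. A string that contains none of the tokens returns None.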
I'll add more later and please feel free to add others to this list.
NOTE: This isn't a discussion thread; it's an information posting thread only, so if you feel the need to discuss one of the user agents listed in detail, please start a new thread.
[edited by: incrediBILL at 8:39 pm (utc) on July 14, 2009]
MSDN [msdn2.microsoft.com] - The Microsoft CryptoAPI allows developers to build cryptographic security into their applications by providing a flexible set of functions to encrypt or digitally sign data.
Microsoft's implementation of Web Distributed Authoring and Versioning (WebDAV), which is included in XP and all later versions of Windows.
Microsoft URL Control
Example: "Microsoft URL Control - 6.00.8877"
This is one of Microsoft's generic COM libraries for web requests, accessible from Visual Basic and any other language that can use COM objects. Documentation for changing the default User-Agent header was lacking when I last checked, so it is most often left unchanged.
MS Web Services Client Protocol
Example: "Mozilla/4.0 (compatible; MSIE 6.0; MS Web Services Client Protocol 2.0.50727.42)"
This is the default User-Agent for web service client code generated by the default tools in .NET. It can easily be changed, but that is often forgotten in the rush to finish off the web service linking software.
[edited by: Ocean10000 at 10:24 pm (utc) on April 13, 2008]
The default user agent for the Python programming language.
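As a rough illustration of where such a default comes from, the sketch below assumes Python 3 and its standard urllib.request module (older Python versions ship a different urllib and report different version numbers). The helper names default_user_agent and request_with_user_agent are made up for this example, and no network request is sent.

```python
import urllib.request


def default_user_agent():
    """Return the User-agent header that a default urllib opener would send."""
    opener = urllib.request.build_opener()
    # addheaders is a list of (name, value) pairs applied to every request.
    return dict(opener.addheaders).get("User-agent")


def request_with_user_agent(url, user_agent):
    """Build (but do not send) a request that carries a custom User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


if __name__ == "__main__":
    print(default_user_agent())
    req = request_with_user_agent("http://example.com/", "ExampleBot/0.1")
    # urllib stores header names capitalized, hence "User-agent" here.
    print(req.get_header("User-agent"))
```

Scripts that never set the header go out with the library default, which is why these strings show up so often in logs.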
Example: "POE-Component-Client-HTTP/0.65 (perl; N; POE; en; rv:0.650000)"
Another Perl library, which provides an asynchronous, event-driven HTTP user agent.
Example: "Mozilla/3.0 (compatible; Indy Library)"
Appears to be the Internet Direct (Indy) library for Borland, which is frequently used by spamming and email-harvesting spiders.
There are several variations on this user agent but it's all related to cURL
PycURL for instance:
"PycURL is a Python interface to libcurl. PycURL can be used to fetch objects identified by a URL from a Python program, similar to the urllib Python module."