
Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
Default User Agents of Programming Libraries and Command Line Tools
Resource page for common user agents
incrediBILL




msg:3626005
 10:00 pm on Apr 13, 2008 (gmt 0)

The purpose of this thread is to collect a list of common default HTTP user agent names used by various programming libraries and command line tools. Many of these user agents may actually be used by a real spider whose author, for whatever reason, never reset the default user agent string, even though most are easily changed to reflect the spider's name.

Hopefully this list will help people understand which tools are being used, which may provide some insight into a spider's purpose, and serve as a useful quick reference in the future.

Let me kick off this list with a few entries of my own:

Jakarta Commons-HttpClient
Example: "Jakarta Commons-HttpClient/3.0.1"
HTTP client protocol library, see Apache.org for more details

Snoopy
Example: "Snoopy v1.2.3"
Snoopy appears to be a PHP class with one version on SourceForge (see the snoopy project), and it's definitely included in WordPress.

Wget
Example: "Wget/1.10.2 (Red Hat modified)"
"GNU wget (wget) is a freely available network utility to retrieve files from the World Wide Web, using HTTP (Hyper Text Transfer Protocol) and FTP (File Transfer Protocol), the two most widely used Internet protocols."

libwww-perl
Example: "libwww-perl/5.805"
This is a general-purpose application library for retrieving HTTP documents, used by Perl scripts and typically run from Linux servers. This particular library is often associated with hacking attempts and botnet attacks [webmasterworld.com] and should be blocked in general.
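For anyone wondering what "blocked in general" could look like in application code, here is a minimal sketch written as Python WSGI middleware. The names `block_default_agents` and `BLOCKED_PREFIXES` are made up for illustration, and the approach (a prefix match on the User-Agent header answered with a 403) is an assumption, not how any particular site does it.

```python
# Hypothetical prefix list; extend it with other default agents as needed.
BLOCKED_PREFIXES = ("libwww-perl",)


def block_default_agents(app, blocked=BLOCKED_PREFIXES):
    """Wrap a WSGI app so requests from blocked default user agents get a 403."""
    blocked = tuple(blocked)

    def middleware(environ, start_response):
        # WSGI exposes the User-Agent request header as HTTP_USER_AGENT.
        user_agent = environ.get("HTTP_USER_AGENT", "")
        if user_agent.startswith(blocked):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        # Anything else is passed through to the wrapped application.
        return app(environ, start_response)

    return middleware
```

A prefix match only catches scripts that left the default string in place, which is exactly the case this thread is about.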

curl or libcurl-agent
Example: "curl/7.15.5 (i686-redhat-linux-gnu) libcurl/7.15.5 OpenSSL/0.9.8b zlib/1.2.3 libidn/0.6.5"
Example: "libcurl-agent/1.0"
There are several variations on this user agent, but they are all related to cURL, a command line tool for transferring files with URL syntax that is often called from scripts on Linux servers.
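Because the cURL variations differ mostly in which name/version pairs they list, it can help to split a user agent into its product tokens. The following is a rough Python sketch; the function name `product_tokens` and the two regular expressions are illustrative assumptions, and it ignores nested parentheses.

```python
import re

# Parenthesised comments such as "(i686-redhat-linux-gnu)" are dropped first.
_COMMENT = re.compile(r"\([^()]*\)")
# A product token is "name/version" with no whitespace or parentheses.
_PRODUCT = re.compile(r"([^\s/()]+)/([^\s()]+)")


def product_tokens(user_agent):
    """Return the (name, version) pairs found in a User-Agent string."""
    without_comments = _COMMENT.sub(" ", user_agent)
    return _PRODUCT.findall(without_comments)


curl_example = (
    "curl/7.15.5 (i686-redhat-linux-gnu) libcurl/7.15.5 "
    "OpenSSL/0.9.8b zlib/1.2.3 libidn/0.6.5"
)
tokens = product_tokens(curl_example)
# tokens[0] is ("curl", "7.15.5"); the remaining pairs are the linked libraries.
```

Checking whether any token name contains "curl" then covers both example strings above.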

I'll add more later and please feel free to add others to this list.

NOTE: This isn't a discussion thread, it's an information posting thread only so if you feel the need to discuss one of the user agents listed in detail please start a new thread.

[edited by: incrediBILL at 8:39 pm (utc) on July 14, 2009]

 

Ocean10000




msg:3626013
 10:17 pm on Apr 13, 2008 (gmt 0)

Microsoft-ATL-Native
Example: "Microsoft-ATL-Native/7.00"
MSDN [msdn2.microsoft.com] - The primary mission of ATL Server is to provide support for responding to HTTP requests, but ATL Server also provides client-side support. This functionality allows launching of HTTP requests and handles receiving the resulting HTTP responses. With the exception of basic HTTP response header parsing, the ATL Server HTTP client support does not perform any HTTP response parsing or rendering.

Microsoft-CryptoAPI
Example: "Microsoft-CryptoAPI/6.0.5744.16384"
MSDN [msdn2.microsoft.com] - The Microsoft CryptoAPI allows developers to build cryptographic security into their applications by providing a flexible set of functions to encrypt or digitally sign data.

Microsoft-WebDAV-MiniRedir
Example: "Microsoft-WebDAV-MiniRedir/6.0.5744"
Microsoft's implementation of Web Distributed Authoring and Versioning (WebDAV), which is included in XP and all later versions of Windows.

Microsoft URL Control
Example: "Microsoft URL Control - 6.00.8877"
This is one of Microsoft's generic COM libraries for web requests, accessible from Visual Basic and any other language that can use COM objects. Documentation for changing the default User-Agent header was lacking when I last checked, so it is most often never changed.

MS Web Services Client Protocol
Example: "Mozilla/4.0 (compatible; MSIE 6.0; MS Web Services Client Protocol 2.0.50727.42)"
This is the default User-Agent for web service client code generated by the default tools in .NET. It can easily be changed, but this is often forgotten in the rush to finish off the web service linking software.

[edited by: Ocean10000 at 10:24 pm (utc) on April 13, 2008]

blend27




msg:3626026
 10:44 pm on Apr 13, 2008 (gmt 0)

ColdFusion MX7, Scheduled Task via CF Administrator for CFHTTP

Default UA

Mozilla/2.0+(compatible;+MSIE+3.0B;+Windows+NT)

Hobbs




msg:3626244
 8:33 am on Apr 14, 2008 (gmt 0)

LWP::Simple

LWP=Library for WWW in Perl (see libwww-perl above)
A Perl module used for the simple fetching of a page or just the headers
Example or UA seen in logs: "LWP::Simple/5.810"
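To see which of these defaults turn up in your own logs, something like the Python sketch below may be enough. It assumes the Apache "combined" log format, where the user agent is the last double-quoted field on each line. The helper names and the sample lines (with documentation-range IP addresses) are made up for illustration.

```python
import re
from collections import Counter

# In the combined format the user agent is the final quoted field on the line.
_LAST_QUOTED = re.compile(r'"([^"]*)"\s*$')


def user_agent_of(line):
    """Return the user agent from one combined-format log line, or None."""
    match = _LAST_QUOTED.search(line)
    return match.group(1) if match else None


def count_user_agents(lines):
    """Count how often each non-empty user agent appears."""
    return Counter(ua for ua in map(user_agent_of, lines) if ua)


# Hypothetical sample lines, not taken from a real log.
SAMPLE_LOG = [
    '192.0.2.10 - - [14/Apr/2008:08:33:00 +0000] "GET / HTTP/1.1" 200 512 "-" "LWP::Simple/5.810"',
    '192.0.2.11 - - [14/Apr/2008:08:34:00 +0000] "HEAD /robots.txt HTTP/1.1" 200 0 "-" "LWP::Simple/5.810"',
    '192.0.2.12 - - [14/Apr/2008:08:35:00 +0000] "GET / HTTP/1.1" 200 512 "-" "libwww-perl/5.805"',
]
counts = count_user_agents(SAMPLE_LOG)
```

User agents that themselves contain a double quote would not be extracted correctly by this simple pattern.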

System
redhat



msg:3626663
 2:15 pm on Apr 14, 2008 (gmt 0)

The following message was cut out to new thread by incredibill. New thread at: search_engine_spiders/3626661.htm [webmasterworld.com]
11:32 am on April 14, 2008 (PST -8)

incrediBILL




msg:3626700
 8:13 pm on Apr 14, 2008 (gmt 0)

Java
Example: "Java/1.5.0_08"
The default user agent sent by the Java runtime's built-in HTTP client when a program does not set its own.

Python-urllib
Example: "Python-urllib/2.5"
The default user agent of the urllib module in the Python standard library.
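As an illustration of how easily such a default can be reset, here is a small Python 3 sketch. It assumes the modern `urllib.request` module rather than the Python 2 urllib that produced the 2.5 example above, and the helper names and the "ExampleSpider/0.1" string are hypothetical. Nothing here makes a network request.

```python
import urllib.request


def default_user_agent():
    """Return the User-Agent that a freshly built urllib opener would send."""
    opener = urllib.request.build_opener()
    # addheaders is a list of (name, value) pairs applied to every request.
    return dict(opener.addheaders)["User-agent"]


def request_with_agent(url, user_agent):
    """Build (but do not send) a request carrying a custom User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


default_ua = default_user_agent()  # starts with "Python-urllib/"
req = request_with_agent("http://example.com/", "ExampleSpider/0.1")
```

Passing `req` to `urllib.request.urlopen` would then send the custom string instead of the default.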

POE-Component-Client-HTTP
Example: "POE-Component-Client-HTTP/0.65 (perl; N; POE; en; rv:0.650000)"
Another library for Perl that provides an asynchronous, event-driven HTTP user agent.

Indy Library
Example: "Mozilla/3.0 (compatible; Indy Library)"
Appears to be the Internet Direct (Indy) library for Borland, which is frequently involved in spamming and email harvesting spiders.

phranque




msg:3626776
 9:23 pm on Apr 14, 2008 (gmt 0)

lwp-request
Example: "lwp-request/2.08"
Simple command line WWW user agent based on libwww-perl.

Mokita




msg:3628166
 1:40 pm on Apr 16, 2008 (gmt 0)

incrediBILL wrote:
There are several variations on this user agent but it's all related to cURL

PycURL for instance:
"PycURL is a Python interface to libcurl. PycURL can be used to fetch objects identified by a URL from a Python program, similar to the urllib Python module."

incrediBILL




msg:3628412
 5:54 pm on Apr 16, 2008 (gmt 0)

VB Project
Example: "VB Project"
The default user agent name sent when Visual Basic projects access the internet.

incrediBILL




msg:3629300
 5:29 pm on Apr 17, 2008 (gmt 0)

HTMLParser
Example: "HTMLParser/1.6"
This appears to be a Java library used to parse HTML.

incrediBILL




msg:3629525
 9:14 pm on Apr 17, 2008 (gmt 0)

PECL::HTTP
Example: "PECL::HTTP/1.6.0RC1"
The HTTP extension for PHP, distributed through PECL (the PHP extension repository on php.net).

incrediBILL




msg:3630112
 5:33 pm on Apr 18, 2008 (gmt 0)

RPT-HTTPClient
Example: "Mozilla/4.5 RPT-HTTPClient/0.3-2"
This appears to be a Java client library

Hobbs




msg:3633554
 9:10 pm on Apr 23, 2008 (gmt 0)

CLDC Connected Limited Device Configuration

Example: "Mozilla/5.0 (SymbianOS/9.2; U; Series60/3.1 NokiaN95/20.0.015; Profile/MIDP-2.0 Configuration/CLDC-1.1 )"

API and VM specification for resource-limited devices such as mobile phones.

incrediBILL




msg:3635436
 9:38 pm on Apr 25, 2008 (gmt 0)

TeamSoft WinInet Component
Example: "TeamSoft WinInet Component"
Appears to be an add-on component for Delphi

Hobbs




msg:3637995
 8:54 pm on Apr 29, 2008 (gmt 0)

WWW::Mechanize
Example: "WWW-Mechanize/1.34"
Perl module that emulates everything a visitor can do, including filling in forms, signing up and navigating links. It is a usability testing tool and also a potential site scraper.
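For quick lookups, the entries collected in this thread can be put into a small table. The Python sketch below uses plain case-sensitive substring matching, which is an assumption chosen for simplicity. The names `KNOWN_DEFAULTS` and `identify_default_agent` are made up, and the markers are taken only from the example strings posted above.

```python
# Substrings taken from the example user agents posted in this thread.
KNOWN_DEFAULTS = (
    "Jakarta Commons-HttpClient",
    "Snoopy",
    "Wget/",
    "libwww-perl",
    "LWP::Simple",
    "lwp-request",
    "curl",
    "Microsoft-ATL-Native",
    "Microsoft-CryptoAPI",
    "Microsoft-WebDAV-MiniRedir",
    "Microsoft URL Control",
    "MS Web Services Client Protocol",
    "Java/",
    "Python-urllib",
    "POE-Component-Client-HTTP",
    "Indy Library",
    "VB Project",
    "HTMLParser",
    "PECL::HTTP",
    "RPT-HTTPClient",
    "TeamSoft WinInet Component",
    "WWW-Mechanize",
)


def identify_default_agent(user_agent):
    """Return the first known marker contained in user_agent, or None."""
    for marker in KNOWN_DEFAULTS:
        if marker in user_agent:
            return marker
    return None
```

Substring matching is used instead of prefix matching because several defaults, such as Indy Library and RPT-HTTPClient, sit inside a Mozilla-style string.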


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved