Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
Lots of interesting user agents
Bewenched




msg:4521720
 4:26 am on Nov 23, 2012 (gmt 0)

Some are actually human, but a lot are not.

Java/1.6.0_37 86.163.57.28
Java/1.6.0_26
Java/1.6.0_04 64.124.140.174 (thefind?)
Java/1.7.0_09 67.221.35.135
TencentTraveler+4.0 (113.77.167.56)
xpymep.exe
Sogou+web+spider/4.0
Java/1.7.0_07 67.246.30.181
Motors/1.6.0+CFNetwork/609+Darwin/13.0.0
Microsoft+Windows+Network+Diagnostics 75.138.235.99
Willow+Internet+Crawler+by+Twotrees+V2.1
vBulletin+via+PHP
WWW-Mechanize/1.18 ( thefind 64.124.148.79)
Pixray-Seeker/2.0+(http://www.pixray.com/pixraybot;++crawler@pixray.com) 81.30.151.218
Google/2.0.1.10455+CFNetwork/548.0.4+Darwin/11.0.0
Feedfetcher-Google;+(+http://www.google.com/feedfetcher.html;+feed-id=992622134398468599)
-=*DriveDigitalGroup.com*=-+Dealerships+Websites+PPC+Inventory+Parts+Marketing
psbot/0.1+(+http://www.picsearch.com/bot.html)
Python-urllib/2.6
Yeti/1.0+(NHN+Corp.;+http://help.naver.com/robots/)
ee://aol/http
Google/2.0.0.10163+CFNetwork/485.13.9+Darwin/11.0.0
Microsoft+Office+Mobile+/14.0
sam-r390+UP.Browser/6.2.3.8+(GUI)+MMP/2.0 (50.57.206.198)
Mozilla/4.0
Feed::Find/0.07
Nessus (mcafee 161.69.30.158)
sam-r380+UP.Browser/6.2.3.8+(GUI)+MMP/2.0 174.141.208.104
linkdex.com/v2.0
Mozilla/3.0+(compatible;+NetPositive/2.2)
User-Agent:Mozilla/5.0+(SaidwotBot)
Mozilla/4.0+(compatible;+Synapse) 178.148.177.197
ImageHunter/2.4.0+CFNetwork/609+Darwin/13.0.0
OpenWebIndex/Nutch-1.5
Apache-HttpClient/UNAVAILABLE+(java+1.4) (tmobile getting an image)
oodlebot/1.0
HOT%20HD%20WPs/8.80.1+CFNetwork/609+Darwin/13.0.0 76.202.52.17
COMODOSpider/Nutch-1.2
Nutraspace/Nutch-1.2+(www.nutraspace.com)
Yahoo!+Slurp+China 110.75.173.196
Dalvik/1.4.0+(Linux;+U;+Android+2.3.5;+PC36100+Build/GRJ90)
Sogou+web+spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)
DTX
RedLaser/4.1.0+CFNetwork/609+Darwin/13.0.0
crawler4j+(http://code.google.com/p/crawler4j/) (not google 89.164.163.72)
AndroidDownloadManager
Mozilla/5.0+()
SuperPagesUrlVerifyBot/1.0
Aghaven/Nutch-1.2+(www.aghaven.com)
Java/1.6.0_20 (colo4 207.210.192.0/18)
Motors/1.5.3+CFNetwork/609+Darwin/13.0.0
YahooCacheSystem
Google/2.0.1.10455+CFNetwork/609+Darwin/13.0.0 (myvzw.com)
MyShopanion/1.5+CFNetwork/609+Darwin/13.0.0 (iphone app)
Google/2.5.1.13455+CFNetwork/548.0.3+Darwin/11.0.0
AF_ID=<26865> (amazonaws 107.22.5.145)
Mozilla/4.0+(compatible;) (FYI there were about 500 requests, all for different files and from different IPs, ALL WERE MILITARY)
http://www.checkprivacy.or.kr:6600/RS/PRIVACY_ENFAQ.jsp
facebookexternalhit/1.1+(+http://www.facebook.com/externalhit_uatext.php) (fyi, facebook appears to be spidering)
facebookplatform/1.0+(+http://developers.facebook.com)

[edited by: incrediBILL at 5:28 am (utc) on Nov 23, 2012]
[edit reason] disabled smilies [/edit]

 

GaryK




msg:4521750
 6:50 am on Nov 23, 2012 (gmt 0)

A quick query of my user agents database (basically anything without Mozilla or Opera at the start of the string) returns just over 19,000 user agents similar to the ones you posted. There are a lot of weird ones out there, including everything from the hilarious to the obscene! I'd try to copy/paste them, but I'm afraid the forum software would suffer a total meltdown! :)

I think the best one, from January of 2000, is both funny and obscene: none of your f****** business!

lucy24




msg:4521756
 7:30 am on Nov 23, 2012 (gmt 0)

Hee. That list looks like my BrowserMatch list, only longer.

:: shuffling papers ::

BrowserMatch ^-?$ keep_out
BrowserMatch Ahrefs keep_out
BrowserMatch "America Online Browser" keep_out
BrowserMatch Clipish keep_out
BrowserMatch Covario keep_out
BrowserMatch CoverScout keep_out
BrowserMatch "Extreme Picture Finder" keep_out
BrowserMatch FairShare keep_out
BrowserMatch HTTrack keep_out
BrowserMatch "Jakarta Commons-HttpClient/3.1" keep_out
BrowserMatch "Java/" keep_out
BrowserMatch libcurl keep_out
BrowserMatch "MSIE [1-4]\." keep_out
BrowserMatch "Mozilla/[0-3]" keep_out
BrowserMatch NativeHost keep_out
BrowserMatch Python keep_out
BrowserMatch scanner keep_out
BrowserMatch TencentTraveler keep_out
BrowserMatch vcbot keep_out
BrowserMatch webcollage keep_out
BrowserMatch Wget keep_out
BrowserMatch Wikimpress keep_out
# comment-out following for link checker
BrowserMatch libwww-perl keep_out
BrowserMatch Yahoo keep_out


Goes without saying that I ran the link checker a few hours ago and forgot to comment out that libwww-perl line. Luckily I was checking locally so only one link was affected.
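
For anyone copying the list: the BrowserMatch lines only set the keep_out environment variable, and by themselves block nothing. The post does not show the rule that acts on the variable, so what follows is only a minimal sketch of one common way to do it, assuming mod_setenvif is loaded and the rules sit in an .htaccess file or directory context where access control overrides are permitted. Which form applies depends on the Apache version in use.

```apache
# Assumption: this part is NOT from the original post; it is one
# common way to act on an environment variable set by BrowserMatch.

# Apache 2.2-style access control:
# allow everyone by default, then deny any request flagged keep_out.
Order Allow,Deny
Allow from all
Deny from env=keep_out

# Apache 2.4-style equivalent (use instead of the three lines above):
# <RequireAll>
#     Require all granted
#     Require not env keep_out
# </RequireAll>
```

Note that BrowserMatch patterns are case-sensitive regular expressions, so "Python" will not catch a user agent that starts with "python"; BrowserMatchNoCase is the case-insensitive variant.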

not2easy




msg:4536275
 7:54 pm on Jan 15, 2013 (gmt 0)

I'm seeing a new weirdo in UAs:
"h t t p://www.google(dot)com" (fake referer)
"Mozilla/5.0 (X11; Linux x86_64; rv:2.0b9pre) Gecko/20110111 Firefox/4.0b9pre"
I found it with a "GET / HTTP/1.1" search of the raw logs. "Firefox/4.0b9pre" was a beta version from late 2010; no clue about that other "0b9pre" in there. It came in on a blocked range anyway (109.206.179.nnn), but I thought I would leave it here in case anyone else is looking.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved