Forum Moderated by: open

Crawler, Spider, and User Agent ID


Forum to identify search engine spiders and user agents

 
Thread Subject | Messages | Started by | Last Message
Amazon
2 wilderness 1:10 am Oct 29, 2004
Jakarta Commons-HttpClient/2.0rc2
Anyone banning it?
2 guitaristinus 2:10 pm Oct 28, 2004
WebRescuer v0.2.4
Anyone know what this is?
3 Busynut 11:45 am Oct 28, 2004
spider took website down
4 ScottM 2:39 am Oct 28, 2004
SightQuest Bot/1.2
Went right for links page
3 idoc 1:53 am Oct 27, 2004
Definitive list of spiders.
5 lloyd 4:14 pm Oct 26, 2004
Cn-dot
CN-DOT Spider
2 Hollywood 11:08 pm Oct 25, 2004
Woodland Data Center OH
3 wilderness 11:10 am Oct 24, 2004
Gridzoom Throws Down the Gauntlet
gridBot/0.3alpha[2]
41 pendanticist 8:05 pm Oct 23, 2004
msnbot/0.3 hits the street
UA version apparently skipped from 0.11 to 0.3
6 jdMorgan 1:38 am Oct 22, 2004
DiamondBot
Apparently run by gator
6 seindal 11:51 pm Oct 21, 2004
Globalspec
engineering search engine
5 volatilegx 11:13 pm Oct 20, 2004
Jetbot/1.0
NOT obeying robots.txt
7 pendanticist 10:03 pm Oct 20, 2004
Google Spidering With No UA
2 volatilegx 4:16 pm Oct 18, 2004
tpiol.com new spider?
Whois?
3 guarez 3:17 pm Oct 18, 2004
Google Newsbot User Agent
Does it have its own, or come from specific IP range?
2 projectphp 1:28 pm Oct 18, 2004
ProloCrawler
2 volatilegx 1:37 am Oct 13, 2004
Bond, James Bond
2 volatilegx 3:26 pm Oct 12, 2004
Crawlzilla/1.0
Checked robots.
10 pendanticist 10:40 am Oct 12, 2004
Yahoo Crawlers
Which crawlers are from Yahoo?
3 jimh009 7:47 pm Oct 11, 2004
Megite
no robots.txt
2 pendanticist 3:44 pm Oct 11, 2004
Is Yahoo's Spider Inktomi?
2 sirkei 8:53 pm Oct 10, 2004
Wotbox/alpha0.6
5 wilderness 2:00 am Oct 8, 2004
stat statcrawler@gmail.com
just noticed this today...
8 BillyS 11:50 pm Oct 7, 2004
New Spider? statcrawler (at) gmail.com
no url given
2 uncle_bob 12:57 am Oct 3, 2004