Forum Moderated by: open

Crawler, Spider, and User Agent ID


Forum to identify search engine spiders and user agents

 
Thread Subject   Messages   Started by   Last Message
  What are those guys at Slurp! China on?
SlurpConfirm404 URL oddities
3 AlexK 6:38 pm Mar 2, 2006
  Y!j-bsc/1.0
New Yahoo spider?
3 bobothecat 2:01 am Mar 2, 2006
  Google and Drupal
3 Sheen1 4:45 pm Feb 28, 2006
  Robot - Meaningful Machines
ultra aggressive spider
5 ziegast 9:37 am Feb 27, 2006
  NextopiaBOT
Did not check robots.txt
4 bose 5:13 am Feb 26, 2006
  Is Google dealing with Yahoo (Inktomi)?
Google Sitemaps file spidered by Yahoo
5 extranjero 4:57 am Feb 26, 2006
  Slurpy Verifier/1.0
Has anyone seen this?
4 selomelo 4:41 am Feb 26, 2006
  InfoPath.1. what is it?
Earlier thread died on speculation
3 arnarn 4:37 am Feb 26, 2006
  YahooYSMcm/2.0.0
No robots.txt
4 GaryK 1:01 pm Feb 24, 2006
  tailrank
2 keyplyr 5:37 am Feb 18, 2006
  Slew of new Slurp IPs?
6 MrSpeed 4:48 pm Feb 16, 2006
  Is the government spying on me?
If so at least they read robots.txt!
8 GaryK 4:24 am Feb 16, 2006
  Was this a fake "msnbot"? (Non-MS IP; no robots.txt; triggered traps)
Does Microsoft sell/license their bot to others?
11 Pfui 4:54 pm Feb 15, 2006
  Titanium 2005 (4.02.01)
Is this Panda Antivirus Titanium?
5 GaryK 4:33 am Feb 14, 2006
  PHP files with "?" string
Are they indexed?
3 halbesma 11:14 pm Feb 13, 2006
  MSIE Crawler
What is it?
4 RichTC 11:10 pm Feb 13, 2006
  Jeeves
3 wilderness 9:19 pm Feb 13, 2006
  Pic-grabber? *internetserviceteam.com showing up in log
completely ignored robots.txt and now I don't know how to stop it!
3 Dottie_Matrix 6:44 pm Feb 13, 2006
  cloaked spider from 66.220.7.* and 66.220.20.*
relentless .html + .txt spider
4 Hetta 3:43 pm Feb 12, 2006
  Google attempting crawl with invalid Mozilla User-agent
I hope this is just a G employee fooling around
10 jdMorgan 7:24 am Feb 9, 2006
  Deleted old pages, but spiders report them as errors because they are no longer live
6 hulahoop 7:13 am Feb 9, 2006
  New bot Java/1.5.0_06 grabs all pages
grabbed all pages from 2 different domains
5 privacyman 7:04 am Feb 9, 2006
  GigaBot - which engine sent this crawling?
21 Event_King 7:30 pm Feb 8, 2006
  Burf
Ignores robots.txt
9 frontpage 1:09 pm Feb 6, 2006
  Revisiting MSNbot and Google Translator
7 volatilegx 10:46 pm Feb 4, 2006