Forum Moderated by: open
Forum to identify search engine spiders and user agents
| Thread Subject | Messages | Started by | Last Message |
|---|---|---|---|
| PigBlock No robots.txt | 4 | GaryK | 7:14 pm Aug 1, 2006 |
| Purpose of this Crawler/Bot? | 7 | DXL | 7:04 pm Aug 1, 2006 |
| The new Y! slurp | 3 | BlackTulip | 5:36 pm Aug 1, 2006 |
| PediaSearch.com Crawler | 9 | keyplyr | 7:19 am July 30, 2006 |
| SevenTwentyFour/LinkWalker - New Owner, Mission Watch out! Brand name surveillance via LinkWalker. | 4 | Wizcrafts | 5:00 am July 30, 2006 |
| lwp::simple/5.803 pretending to be yahoo? | 6 | jake66 | 7:53 pm July 27, 2006 |
| "NutchCVS" (again) but from penguin26.parc.xerox.com No robots.txt | 3 | Pfui | 9:02 am July 26, 2006 |
| "GT::WWW/1.026" from .reverse.layeredtech.com No robots.txt | 7 | Pfui | 4:16 am July 26, 2006 |
| MetagerBot/0.8-dev (MetagerBot; http://metager.de; ) Note space before close paren | 8 | Pfui | 7:19 pm July 25, 2006 |
| Downloads from a blank UA? | 23 | keyplyr | 6:26 am July 25, 2006 |
| HA! No User Agent for You! These just flat-out annoy me! | 10 | GaryK | 2:16 am July 25, 2006 |
| Crawler/1.0 http://elibron.com No robots.txt | 5 | GaryK | 7:24 pm July 24, 2006 |
| Mozilla/5.0 (compatible;MAINSEEK BOT) No robots.txt | 3 | GaryK | 3:06 pm July 24, 2006 |
| Mozilla/5.0 (compatible; robtexbot/1.0; http://www.robtex.com/ ) Note space before close paren. Also: no robots.txt; uses site URL in ref | 10 | Pfui | 1:50 am July 24, 2006 |
| "teoma agent1" from directhit.com -- no robots.txt | 2 | Pfui | 11:17 pm July 22, 2006 |
| "research-spider" from .cs.brown.edu | 2 | Pfui | 11:16 pm July 22, 2006 |
| "Entrieva/1.0" -- no robots.txt | 2 | Pfui | 10:26 pm July 22, 2006 |
| 000s of Truncated Page Requests from Many IPs | 82 | jomaxx | 10:59 pm July 20, 2006 |
| Yahoo! Crawlers - A response from Yahoo! Search Response from Yahoo! | 9 | Yahoo_Mike | 8:33 am July 18, 2006 |
| How to ban (compatible ; type requests Note space between compatible and semicolon | 40 | larryhatch | 3:06 am July 18, 2006 |
| Googlebot Google but not Googlebot | 4 | vortech | 4:55 pm July 16, 2006 |
| server2.attributor.com | 12 | Cromicon | 4:39 pm July 16, 2006 |
| Naughty Yahoo User Agents Please post them here | 45 | GaryK | 5:04 pm July 14, 2006 |
| New UA: Mozilla/4.0 (compatible; mark.blonin.bot;) has a weird Referer | 6 | Mokita | 6:44 pm July 13, 2006 |
| sna-0.0.1 mikeelliott@hotmail.com (nee mikemuzio@msn.com) Spider as Snoopy Spammer | 2 | Pfui | 1:13 am July 13, 2006 |