keyplyr

msg:4539555 | 9:04 pm on Jan 26, 2013 (gmt 0) |
I have blocked all UAs containing "nutch" for well over 10 years without any adverse effect. This generic bot can be used by any unaccountable agent for any unknown purpose, and the accountable ones should customize and rename it so their bot UA reflects that they are on the level, IMO.
|
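For readers who want to try the substring approach described above, here is a minimal sketch in Python. It assumes a case-insensitive match on the User-Agent header. The function name, the token list, and the sample UA strings are hypothetical and not taken from anyone's actual configuration.

```python
# Hypothetical token list: the thread only mentions "nutch".
BLOCKED_UA_SUBSTRINGS = ("nutch",)


def is_blocked_user_agent(user_agent: str) -> bool:
    """Return True if the UA contains any blocked token (case-insensitive)."""
    ua = user_agent.lower()
    return any(token in ua for token in BLOCKED_UA_SUBSTRINGS)


# Made-up UA strings, for illustration only.
print(is_blocked_user_agent("ExampleBot/1.0 (Apache-Nutch based)"))
print(is_blocked_user_agent("Mozilla/5.0 (Windows NT 6.1) Firefox/18.0"))
```

A plain substring test catches customized names that still carry "nutch" somewhere in the string, which is the point of blocking the generic name rather than a full UA.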
blend27

msg:4540988 | 5:14 pm on Jan 31, 2013 (gmt 0) |
| Note to self... if I ever write a scraper, name it after something universally popular, like a record-breaking Ferrari. |
| Scrapers are one thing but when RDNS points to that... 67.18.54.176 (ferrari.websitewelcome.com) And as always: AS21844 67.18.0.0/15 ThePlanet.com Internet Services, Inc.
|
lucy24

msg:4542833 | 6:33 am on Feb 6, 2013 (gmt 0) |
Verbatim:
Mozilla/5.0 (Windows;) NimbleCrawler 1.12 obeys UserAgent NimbleCrawler For problems contact: crawler@health
Mmmwell... For a given definition of "nimble", anyway ;)
|
lucy24

msg:4544027 | 2:38 am on Feb 9, 2013 (gmt 0) |
Verbatim-- or rather, litteratim-- again:
204.236.138.148 - - [08/Feb/2013:00:39:40 -0800] "GET /robots.txt HTTP/1.0" 200 1005 "-" "Web front page analyser. robots.txt complaint (norw.acd.inst@gmail.com)"
I can't decide whether I do, or do not, want that to be a typo :(
|
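Log lines like the one quoted above are easier to compare once they are split into fields. The following is a rough Python sketch that assumes the common "combined" access log layout (IP, identity, user, timestamp, request, status, size, referer, UA). The pattern and the function name are illustrative assumptions, and servers with a custom log format will need a different pattern.

```python
import re
from typing import Optional

# Assumed layout: Apache-style "combined" log format.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_log_line(line: str) -> Optional[dict]:
    """Split one access-log line into named fields, or return None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])  # status as a number for easy filtering
    return fields


# The line quoted in the post above, unchanged.
sample = (
    '204.236.138.148 - - [08/Feb/2013:00:39:40 -0800] '
    '"GET /robots.txt HTTP/1.0" 200 1005 "-" '
    '"Web front page analyser. robots.txt complaint (norw.acd.inst@gmail.com)"'
)
parsed = parse_log_line(sample)
print(parsed["ip"], parsed["request"], parsed["user_agent"])
```

The pattern does not handle escaped quotes inside the request or UA field, so treat a `None` result as "look at this line by hand" rather than as an error.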
dstiles

msg:4544242 | 10:34 pm on Feb 9, 2013 (gmt 0) |
Well, 204.236.128/17 is amazon aws and anything with a gmail address is automatically suspicious in my book... Kill it. :)
|
keyplyr

msg:4544249 | 11:09 pm on Feb 9, 2013 (gmt 0) |
| Well, 204.236.128/17 is amazon aws and anything with a gmail address is automatically suspicious in my book... Kill it. :) |
| It requested robots.txt. I allow *almost* everything to get robots.txt, even the Amazon ranges, so when I looked at this post yesterday I figured she did also.
|
lucy24

msg:4544284 | 10:09 am on Feb 10, 2013 (gmt 0) |
The question is academic, because it didn't ask for anything else after robots.txt. (I checked. I do have the range blocked.) And I didn't hear any complaints about it either.
|
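The policy discussed in the last few posts (block a hosting range, but still let it fetch robots.txt) can be sketched in Python with the standard `ipaddress` module. This is an illustration under stated assumptions, not anyone's real rule set: the function and variable names are hypothetical, the range is the one mentioned above written out in full CIDR form, and the "*almost* everything" caveat is simplified to "everything".

```python
import ipaddress

# The range named in the thread ("204.236.128/17"), in full CIDR notation.
BLOCKED_NETWORKS = [ipaddress.ip_network("204.236.128.0/17")]

# Assumption: robots.txt is the only path exempt from range blocks.
ALWAYS_ALLOWED_PATHS = {"/robots.txt"}


def is_request_allowed(client_ip: str, path: str) -> bool:
    """Allow robots.txt for everyone; otherwise deny IPs inside a blocked range."""
    if path in ALWAYS_ALLOWED_PATHS:
        return True
    address = ipaddress.ip_address(client_ip)
    return not any(address in network for network in BLOCKED_NETWORKS)


print(is_request_allowed("204.236.138.148", "/robots.txt"))
print(is_request_allowed("204.236.138.148", "/index.html"))
```

Letting a blocked range read robots.txt costs very little and gives a compliant bot the chance to see it is disallowed and go away, which appears to be what happened here.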
dstiles

msg:4544348 | 7:34 pm on Feb 10, 2013 (gmt 0) |
Print and frame it! An AWS bot that obeys robots.txt! :)
|
blend27

msg:4544401 | 12:56 am on Feb 11, 2013 (gmt 0) |
| Print and frame it! An AWS bot that obeys robots.txt! :) |
| Chances are that my cat's talking to me will make sense, which to this day sounds like MYAU to me. BTW, has anybody heard of a reliable myau translator web service?
|
lucy24

msg:4573348 | 9:51 pm on May 12, 2013 (gmt 0) |
109.78.198.49 - - [12/May/2013:07:39:51 -0700] "GET /hovercraft/images/wormapple.jpg HTTP/1.1" 200 32565 "-" "rarely used"
I expect this is perfectly true.
:: detour to raw logs ::
75.108.158.236 - - [11/May/2013:19:20:16 -0700] "GET /rats/images/ourhouse/LivRm5.jpg HTTP/1.1" 301 600 "-" "rarely used"
75.108.158.236 - - [11/May/2013:19:20:16 -0700] "GET /boilerplate/sorry.html HTTP/1.1" 200 1441 "-" "rarely used"
Huh. Fancy that.
:: detour to confirm hunch that these are Ukrainian IPs ::
Nope. They're not even the same country. What gives?
|
dstiles

msg:4573663 | 7:37 pm on May 13, 2013 (gmt 0) |
109.78.198.49 is Vodafone Ireland. 75.108.158.236 is SuddenLink US - all 75.n.n.n are (basically) ARIN (USA, Canada etc). So these are likely compromised machines on DSL lines running a scan of some kind.
|
lucy24

msg:4573718 | 9:04 pm on May 13, 2013 (gmt 0) |
Yeah, the 75.108 threw me because I personally know people there; it's one of the local ISPs. But the UA is, uhm, rarely seen ;)
|