| 9:04 pm on Jan 26, 2013 (gmt 0)|
I have blocked all UAs containing "nutch" for well over 10 years without any adverse effect.
This generic bot can be used by any unaccountable agent for any unknown purpose. The accountable ones should customize and rename it so their bot's UA shows they are on the level, IMO.
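For anyone who wants the same rule outside of server config, here is a minimal sketch of a case-insensitive substring block on the User-Agent. Only the "nutch" token comes from the post above; the function name, the tuple name, and the sample UA strings are made up for illustration.

```python
# Tokens to reject when they appear anywhere in the User-Agent.
# "nutch" is from the post; add others at your own risk.
BLOCKED_UA_TOKENS = ("nutch",)


def is_blocked_ua(user_agent: str) -> bool:
    """Return True if the UA contains any blocked token, ignoring case."""
    ua = user_agent.lower()
    return any(token in ua for token in BLOCKED_UA_TOKENS)


if __name__ == "__main__":
    # Illustrative UA strings, not taken from any real log.
    print(is_blocked_ua("ExampleBot/Nutch-1.0"))
    print(is_blocked_ua("Mozilla/5.0 (Windows NT 6.1)"))
```

A plain substring match is deliberately blunt: it also catches customized bots that keep "nutch" in the name, which is the whole point of the rule as described.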
| 5:14 pm on Jan 31, 2013 (gmt 0)|
|Note to self... if I ever write a scraper, name it after something universally popular, like a record-breaking Ferrari. |
Scrapers are one thing but when RDNS points to that...
And as always: AS21844 220.127.116.11/15 ThePlanet.com Internet Services, Inc.
| 6:33 am on Feb 6, 2013 (gmt 0)|
Mozilla/5.0 (Windows;) NimbleCrawler 1.12 obeys UserAgent NimbleCrawler For problems contact: crawler@health
Mmmwell... For a given definition of "nimble", anyway ;)
| 2:38 am on Feb 9, 2013 (gmt 0)|
Verbatim-- or rather, litteratim-- again:
18.104.22.168 - - [08/Feb/2013:00:39:40 -0800] "GET /robots.txt HTTP/1.0" 200 1005 "-" "Web front page analyser. robots.txt complaint (email@example.com)"
I can't decide whether I do, or do not, want that to be a typo :(
| 10:34 pm on Feb 9, 2013 (gmt 0)|
Well, 204.236.128/17 is Amazon AWS and anything with a gmail address is automatically suspicious in my book... Kill it. :)
| 11:09 pm on Feb 9, 2013 (gmt 0)|
|Well, 204.236.128/17 is Amazon AWS and anything with a gmail address is automatically suspicious in my book... Kill it. :) |
It requested robots.txt. I allow *almost* everything to get robots.txt, even the Amazon ranges, so when I looked at this post yesterday, I figured she did also.
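A rough sketch of that policy, for illustration only: block a CIDR range but let anyone fetch robots.txt. The range is the one quoted above, written out in full as 204.236.128.0/17; the function and list names are hypothetical, and the "almost" is simplified away here, so every client gets robots.txt.

```python
import ipaddress

# The range quoted in the thread, in full CIDR notation.
BLOCKED_RANGES = [ipaddress.ip_network("204.236.128.0/17")]


def is_allowed(client_ip: str, path: str) -> bool:
    """Allow robots.txt for everyone; otherwise deny blocked ranges."""
    if path == "/robots.txt":
        return True
    addr = ipaddress.ip_address(client_ip)
    return not any(addr in net for net in BLOCKED_RANGES)


if __name__ == "__main__":
    # Made-up addresses: one inside the /17, one outside it.
    print(is_allowed("204.236.200.1", "/robots.txt"))
    print(is_allowed("204.236.200.1", "/index.html"))
    print(is_allowed("204.236.0.1", "/index.html"))
```

Letting blocked ranges read robots.txt means a bot that does honor it gets told to go away politely, and the rest get refused anyway.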
| 10:09 am on Feb 10, 2013 (gmt 0)|
The question is academic, because it didn't ask for anything else after robots.txt. (I checked. I do have the range blocked.) And I didn't hear any complaints about it either.
| 7:34 pm on Feb 10, 2013 (gmt 0)|
Print and frame it! An AWS bot that obeys robots.txt! :)
| 12:56 am on Feb 11, 2013 (gmt 0)|
|Print and frame it! An AWS bot that obeys robots.txt! :) |
Chances are that my cat's talking to me will make sense sooner, which to this day sounds like MYAU to me.
BTW, has anybody heard of a reliable myau translator web service?
| 9:51 pm on May 12, 2013 (gmt 0)|
22.214.171.124 - - [12/May/2013:07:39:51 -0700] "GET /hovercraft/images/wormapple.jpg HTTP/1.1" 200 32565 "-" "rarely used"
I expect this is perfectly true.
:: detour to raw logs ::
126.96.36.199 - - [11/May/2013:19:20:16 -0700] "GET /rats/images/ourhouse/LivRm5.jpg HTTP/1.1" 301 600 "-" "rarely used"
188.8.131.52 - - [11/May/2013:19:20:16 -0700] "GET /boilerplate/sorry.html HTTP/1.1" 200 1441 "-" "rarely used"
Huh. Fancy that.
:: detour to confirm hunch that these are Ukrainian IPs ::
Nope. They're not even the same country. What gives?
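The "detour to raw logs" can be sketched roughly like this: parse combined-format log lines and group client IPs by User-Agent, so one odd UA showing up from unrelated addresses stands out. The three sample lines are the ones posted above; the regex and the function name are my own assumptions, not anyone's actual tooling.

```python
import re
from collections import defaultdict

# Assumed pattern for the combined log format seen in the lines above.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<ua>[^"]*)"'
)

SAMPLE_LINES = [
    '22.214.171.124 - - [12/May/2013:07:39:51 -0700] "GET /hovercraft/images/wormapple.jpg HTTP/1.1" 200 32565 "-" "rarely used"',
    '126.96.36.199 - - [11/May/2013:19:20:16 -0700] "GET /rats/images/ourhouse/LivRm5.jpg HTTP/1.1" 301 600 "-" "rarely used"',
    '188.8.131.52 - - [11/May/2013:19:20:16 -0700] "GET /boilerplate/sorry.html HTTP/1.1" 200 1441 "-" "rarely used"',
]


def ips_by_user_agent(lines):
    """Map each User-Agent to the set of client IPs that sent it."""
    seen = defaultdict(set)
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:  # silently skip lines that do not parse
            seen[match.group("ua")].add(match.group("ip"))
    return seen


if __name__ == "__main__":
    for ua, ips in ips_by_user_agent(SAMPLE_LINES).items():
        print(ua, sorted(ips))
```

Where each address actually lives still needs a separate whois or geolocation lookup; this only does the grouping.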
| 7:37 pm on May 13, 2013 (gmt 0)|
184.108.40.206 is Vodafone Ireland.
220.127.116.11 is SuddenLink US - all 75.n.n.n are (basically) ARIN (USA, Canada, etc.).
So likely compromised machines on DSL lines running a scan of some kind.
| 9:04 pm on May 13, 2013 (gmt 0)|
Yeah, the 75.108 threw me because I personally know people there; it's one of the local ISPs. But the UA is, uhm, rarely seen ;)