lucy24 - 11:30 pm on Sep 17, 2011 (gmt 0)
The last discussion I found about these guys, either by IP or by name (TrendMicro/trendnet), was a year ago. Has anything changed?
So far I've seen about half a dozen specific IPs. They're all in 150.70.n.n --and nobody from 150.70 fails to fit this pattern-- so let's leave it generic. UA is always
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
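For anyone who wants to pull these visits out of their own logs, here is a minimal Python sketch of that pattern test. It assumes "150.70.n.n" means the whole 150.70.0.0/16 block (my reading of the range, not something confirmed), and the function name matches_pattern is made up for illustration.

```python
import ipaddress

# Assumption: "150.70.n.n" is treated as the full /16 block.
SUSPECT_NET = ipaddress.ip_network("150.70.0.0/16")
# The UA string exactly as quoted in the post.
SUSPECT_UA = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"


def matches_pattern(ip: str, user_agent: str) -> bool:
    """Return True when a request comes from 150.70.n.n with the exact UA."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        # Not a parseable IP address (e.g. a hostname in the log).
        return False
    return addr in SUSPECT_NET and user_agent == SUSPECT_UA
```

Matching on IP range and UA together keeps ordinary MSIE 6 visitors from elsewhere out of the results.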
Apart from the routine roboticisms like picking up the same page twice, a minute or two apart, and a morbid fascination with the favicon-- sometimes without any accompanying page request-- there were a few that particularly caught my notice. Most of this is after the fact. My bad.
12:05:02 /{directory}/{file}.js
12:05:02 /favicon.ico
12:05:02 /{directory}/{file}.html
12:05:35 /{directory}/{file}.js
Better check that javascript file again; there's something hinky about it. Right.
11:04:32 /{directory}/{subdirectory}/{file}.css
I point this out because in one of those earlier threads, the poster said that their particular culprit didn't bother about css. This is the only time I found them picking up a css-- but to make up for it, they didn't pick up anything else in the area. Far as I know, they've never even been in that particular subdirectory.
20:25:22 /{directory2}/{subdirectory}/{file}.html
20:28:53 /favicon.ico
20:29:39 /favicon.ico
See above about morbid fascination. But fix that {directory2} in your minds.
19:41:10 /{directory2}/{subdirectory2}/{file}.html
19:41:56 /favicon.ico
19:42:06 /favicon.ico
These two visits (on different days) have one thing in common: not only is directory2 roboted out, they have no way of knowing that its subdirectories even exist. There's no index, either explicit (index.html) or automated. In each case, I had recently posted the link in a special-interest forum.
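To double-check which logged requests land in roboted-out territory, something like the following could work. It is only a sketch: the robots.txt content is hypothetical ("/directory2/" stands in for the real, edited-out directory name), and disallowed_hits is a made-up helper built on the standard library's urllib.robotparser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; "/directory2/" is a placeholder for the real path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /directory2/
"""


def disallowed_hits(paths, user_agent="*"):
    """Return the requested paths that robots.txt tells this agent not to fetch."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, p)]
```

Feeding it the paths requested from 150.70.n.n shows which ones a well-behaved robot would never have asked for.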
After verifying that the forum is roboted-out (yes, some visits to robots.txt are from humans :)) I went and had a chat with the site administrator. He did some snooping of his own and found that our friends at 150.70 have also been doing assorted "cold" downloads of files that you're not even supposed to know about unless you're logged in. (You can do the download if you paste in the address directly-- but only if you know the address. It's not something you could randomly make up without racking up an enormous lot of 404s.)
I've saved the most interesting for last. I recently (re)installed piwik. This soon led to:
12:40:24 /piwik/piwik.php?action_name={buncha stuff edited out here, including reference to google.fr search}
12:40:27 /piwik/piwik.js
12:40:34 /favicon.ico
12:40:35 /{directory}/{file}.html
Note the time. After poring over* the whole day's logs I found a normal human visit from a completely different IP and UA which included:
11:21:35 /piwik/piwik.php?action_name={letter for letter the EXACT SAME CONTENT as above}
Note the time. Aren't these guys supposed to be checking for viruses? "Dear user: that site you visited an hour and a half ago was infected. Take immediate action or it may be too late."
* Brazen lie. I just used the text editor's Find function with appropriate RegEx.
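For the record, the same search can be done outside a text editor. This is a rough Python sketch, not the RegEx actually used: it assumes Apache "combined" log lines, and replayed_requests is a made-up name. It collects every piwik.php request and keeps only the ones that arrived, letter for letter identical, from more than one IP.

```python
import re
from collections import defaultdict

# Assumption: Apache "combined" log format; adjust for other layouts.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?:GET|POST) (?P<path>\S+)[^"]*"'
)


def replayed_requests(lines, prefix="/piwik/piwik.php?"):
    """Map each request path starting with `prefix` to its (time, ip) hits,
    keeping only paths that were requested by more than one distinct IP."""
    seen = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group("path").startswith(prefix):
            seen[m.group("path")].append((m.group("time"), m.group("ip")))
    return {
        path: hits
        for path, hits in seen.items()
        if len({ip for _, ip in hits}) > 1
    }
```

Comparing the timestamps in each result then shows how long after the human visit the replay arrived.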