Can someone point to a recent explanation of who Thunderstone are and what they want? They're not getting it from me; I'm just curious.
206.183.1.74 - - [24/Dec/2013:07:56:35 -0800] "GET /robots.txt HTTP/1.1" 301 513 "-" "Mozilla/4.0 (compatible; http://search.thunderstone.com/texis/websearch/about.html)"
206.183.1.74 - - [24/Dec/2013:07:56:35 -0800] "GET / HTTP/1.1" 301 492 "-" "Mozilla/4.0 (compatible; http://search.thunderstone.com/texis/websearch/about.html)"
206.183.1.74 - - [24/Dec/2013:07:56:35 -0800] "GET / HTTP/1.1" 200 870 "-" "Mozilla/4.0 (compatible; http://search.thunderstone.com/texis/websearch/about.html)"
Possibly they thought they could hide among the thousands of lines of checklink hits (brand-new site, not yet visible to the public) in the same day's logs.
robots.txt currently says, in full,
User-Agent: W3C-checklink
Disallow:
User-Agent: *
Disallow: /
It goes beyond "Which part of {asterisk} did you not understand?"* They seem to go out of their way to look for roboted-out files and under-the-radar sites. Compare this thread [webmasterworld.com] from early 2011. I've never seen them on my "real" site, only on assorted backwaters.
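For anyone wondering what a well-behaved robot should make of that robots.txt, here is a minimal sketch using Python's standard `urllib.robotparser`. The hostname `example.com` and the checklink version string are placeholders of mine, not anything from the logs. The Thunderstone user-agent string is copied from the log lines above.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt quoted above, verbatim.
ROBOTS_TXT = """\
User-Agent: W3C-checklink
Disallow:
User-Agent: *
Disallow: /
"""

# User-agent string copied from the access log.
THUNDERSTONE_UA = (
    "Mozilla/4.0 (compatible; "
    "http://search.thunderstone.com/texis/websearch/about.html)"
)


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if this robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder host, not the real site.
    print(is_allowed("W3C-checklink/4.81", "http://example.com/"))
    print(is_allowed(THUNDERSTONE_UA, "http://example.com/"))
```

The empty `Disallow:` lets the link checker in everywhere, and the `*` record shuts out everyone else, which includes anything announcing itself as "Mozilla/4.0 (compatible; ...)".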
A cursory Forums search suggests they've been at it-- whatever "it" is-- since 2001**. Their current home is 206.183.0.0/19.
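If you want to check whether a given visitor falls inside that range, a quick sketch with Python's standard `ipaddress` module would do it. The function name `in_thunderstone_range` is my own. The /19 and the sample address come from this post.

```python
import ipaddress

# Range quoted above: a /19 covers 206.183.0.0 through 206.183.31.255.
THUNDERSTONE_NET = ipaddress.ip_network("206.183.0.0/19")


def in_thunderstone_range(addr: str) -> bool:
    """Return True if addr lies inside 206.183.0.0/19."""
    return ipaddress.ip_address(addr) in THUNDERSTONE_NET


if __name__ == "__main__":
    print(in_thunderstone_range("206.183.1.74"))   # the address in the logs
    print(in_thunderstone_range("206.183.32.1"))   # just past the end of the /19
```

The same range written as a first and last address is what you would feed to a firewall or an access-control rule that doesn't accept CIDR notation.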
* Or, for that matter, "Which part of 301 did you not understand?" Note the pattern of redirects. My host's logs can be a bit hiccupy, so it's not even certain that they asked for robots.txt before asking for the front page-- currently the host's "coming soon" default, so neener-neener. What is certain is that they never bothered to follow the redirect.
** I had no idea there was such a thing as a lapsed or inactive member. That's how long ago 2001 was.
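For picking the request-and-status pattern out of a longer log, something like the following sketch would work. It assumes the usual combined log format, which is what the three lines above appear to be. The regular expression and the name `requests_and_statuses` are mine.

```python
import re

# Assumed layout: Apache-style "combined" log format.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

UA = "Mozilla/4.0 (compatible; http://search.thunderstone.com/texis/websearch/about.html)"
STAMP = "[24/Dec/2013:07:56:35 -0800]"

# The three log lines from the post, rebuilt from their common parts.
LOG_LINES = [
    f'206.183.1.74 - - {STAMP} "GET /robots.txt HTTP/1.1" 301 513 "-" "{UA}"',
    f'206.183.1.74 - - {STAMP} "GET / HTTP/1.1" 301 492 "-" "{UA}"',
    f'206.183.1.74 - - {STAMP} "GET / HTTP/1.1" 200 870 "-" "{UA}"',
]


def requests_and_statuses(lines, ip):
    """Return (path, status) pairs, in log order, for requests from ip."""
    results = []
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match and match["ip"] == ip:
            results.append((match["path"], int(match["status"])))
    return results


if __name__ == "__main__":
    for path, status in requests_and_statuses(LOG_LINES, "206.183.1.74"):
        print(status, path)
```

Since all three entries carry the same timestamp, log order is the only ordering available, so the caveat about hiccupy logs applies to whatever this prints.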