yes, them again

7:39 am on Dec 25, 2013 (gmt 0)

lucy24 (Senior Member, US; joined Apr 9, 2011)

Can someone point to a recent explanation of who Thunderstone are and what they want? They're not getting it from me; I'm just curious.

- - [24/Dec/2013:07:56:35 -0800] "GET /robots.txt HTTP/1.1" 301 513 "-" "Mozilla/4.0 (compatible; http://search.thunderstone.com/texis/websearch/about.html)"
- - [24/Dec/2013:07:56:35 -0800] "GET / HTTP/1.1" 301 492 "-" "Mozilla/4.0 (compatible; http://search.thunderstone.com/texis/websearch/about.html)"
- - [24/Dec/2013:07:56:35 -0800] "GET / HTTP/1.1" 200 870 "-" "Mozilla/4.0 (compatible; http://search.thunderstone.com/texis/websearch/about.html)"
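For anyone who wants to pull the same pattern out of their own logs, here is a minimal sketch that tabulates request path and status code for the three log entries quoted above. It assumes the usual combined log layout starting at the timestamp; the names LOG_RE, summarize, UA and SAMPLE are made up for illustration and do not come from any server or library.

```python
import re

# Matches the tail of a combined-format access log line, starting at the
# timestamp. The client address is not matched, because the quoted lines
# begin with "- -".
LOG_RE = re.compile(
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def summarize(lines, agent_fragment):
    """Return (path, status) pairs for requests whose user-agent
    string contains agent_fragment, in log order."""
    hits = []
    for line in lines:
        match = LOG_RE.search(line)
        if match and agent_fragment in match.group("agent"):
            hits.append((match.group("path"), int(match.group("status"))))
    return hits


# The user-agent string and the three entries are copied from the post.
UA = ("Mozilla/4.0 (compatible; "
      "http://search.thunderstone.com/texis/websearch/about.html)")
SAMPLE = [
    f'- - [24/Dec/2013:07:56:35 -0800] "GET /robots.txt HTTP/1.1" 301 513 "-" "{UA}"',
    f'- - [24/Dec/2013:07:56:35 -0800] "GET / HTTP/1.1" 301 492 "-" "{UA}"',
    f'- - [24/Dec/2013:07:56:35 -0800] "GET / HTTP/1.1" 200 870 "-" "{UA}"',
]

if __name__ == "__main__":
    for path, status in summarize(SAMPLE, "thunderstone.com"):
        print(status, path)
```

Matching on a fragment of the user-agent rather than the whole string is a deliberate choice here, since the rest of the string could change between visits.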

Possibly they thought they could hide among the thousands of lines of checklink hits (brand-new site, not yet visible to the public) in the same day's logs.

robots.txt currently says, in full,

User-Agent: W3C-checklink

User-Agent: *
Disallow: /
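As a sanity check on what that file asks of an unnamed robot, here is a small sketch using Python's standard urllib.robotparser. The robots.txt text and the user-agent string are copied from this post; the helper name is_allowed is hypothetical. How a parser treats the rule-less W3C-checklink group can vary, so the sketch only looks at the catch-all group.

```python
from urllib.robotparser import RobotFileParser

# robots.txt exactly as quoted in the post.
ROBOTS_TXT = """\
User-Agent: W3C-checklink

User-Agent: *
Disallow: /
"""

# User-agent string copied from the log entries in the post.
UA = ("Mozilla/4.0 (compatible; "
      "http://search.thunderstone.com/texis/websearch/about.html)")


def is_allowed(robots_txt, user_agent, url):
    """Parse robots_txt and report whether user_agent may fetch url,
    according to Python's own robots.txt parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The catch-all group disallows everything, so this should print False.
    print(is_allowed(ROBOTS_TXT, UA, "/"))
```

A robot that respects the file would stop after reading it; the check says nothing about what a robot that ignores it will do.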

It goes beyond "Which part of {asterisk} did you not understand?"* They seem to go out of their way to look for roboted-out files and under-the-radar sites. Compare this thread [webmasterworld.com] from early 2011. I've never seen them on my "real" site, only on assorted backwaters.

Cursory Forums search suggests they've been at it-- whatever "it" is-- since 2001**. Their current home is

* Or, for that matter, "Which part of 301 did you not understand?" Note the pattern of redirects. My host's logs can be a bit hiccupy, so it's not even certain that they asked for robots.txt before asking for the front page-- currently the host's "coming soon" default, so neener-neener. What is certain is that they never bothered to follow the redirect.
** I had no idea there was such a thing as a lapsed or inactive member. That's how long ago 2001 was.
4:22 pm on Jan 1, 2014 (gmt 0)

wilderness (Senior Member; joined Nov 11, 2001)

Not sure how I missed this.

In brief, a third-party harvester.
From the main page of their site:
Thunderstone Software LLC is an independent R&D company that has been providing high-performance state-of-the-art solutions to intelligent information retrieval and management problems for over 33 years. Our flagship product, Texis™, is the most comprehensive text retrieval and publishing software available. In one package Texis provides every full-text, SQL, multimedia management, and dynamic publishing operation needed for an enterprise search application.
end of quote

Pretty straightforward; not sure why you need an explanation.

More than a decade ago, T-h-u-n-d-e-r-s-t-o-n-e ran primarily from a Road Runner IP. Perhaps, even though third parties were using the software, the org's own server did the crawling at that time.
Today, it seems the user's IP is the active server for the software.

BTW, more than a decade ago T-h-u-n-d-e-r-s-t-o-n-e failed to honor robots.txt and showed no comprehension of crawler protocols, and it is likely that the same disregard continues.
6:28 pm on Jan 1, 2014 (gmt 0)

keyplyr (Moderator of this forum, US; joined Sept 26, 2001)

I use a Thunderstone product as my site search. Great stuff.
