Home / Forums Index / Search Engines / Search Engine Spider and User Agent Identification
Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
thunderstone
yes, them again
lucy24

WebmasterWorld Senior Member; Top Contributor of All Time; Top Contributor of the Month



 
Msg#: 4633530 posted 7:39 am on Dec 25, 2013 (gmt 0)

Can someone point to a recent explanation of who Thunderstone are and what they want? They're not getting it from me; I'm just curious.

206.183.1.74 - - [24/Dec/2013:07:56:35 -0800] "GET /robots.txt HTTP/1.1" 301 513 "-" "Mozilla/4.0 (compatible; http://search.thunderstone.com/texis/websearch/about.html)"
206.183.1.74 - - [24/Dec/2013:07:56:35 -0800] "GET / HTTP/1.1" 301 492 "-" "Mozilla/4.0 (compatible; http://search.thunderstone.com/texis/websearch/about.html)"
206.183.1.74 - - [24/Dec/2013:07:56:35 -0800] "GET / HTTP/1.1" 200 870 "-" "Mozilla/4.0 (compatible; http://search.thunderstone.com/texis/websearch/about.html)"

Possibly they thought they could hide among the thousands of lines of checklink hits (brand-new site, not yet visible to the public) in the same day's logs.
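For anyone who wants to fish hits like these out of a day's worth of checklink noise, here is a minimal sketch. It assumes the host writes the Apache "combined" log format (which the three lines above appear to follow) and that link-checker requests carry "checklink" somewhere in the user-agent string. The function name and the regex are my own, not anything the host or the W3C provides.

```python
import re

# Assumed layout (Apache "combined" format):
# ip ident user [time] "request" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def non_checklink_hits(lines):
    """Return (ip, path, status, agent) for each parsable line not sent by checklink."""
    hits = []
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m is None:
            continue  # skip lines that do not fit the assumed format
        if "checklink" in m.group("agent").lower():
            continue  # skip the link checker's own requests
        hits.append((m.group("ip"), m.group("path"),
                     int(m.group("status")), m.group("agent")))
    return hits

# The three log lines quoted in this post.
SAMPLE = [
    '206.183.1.74 - - [24/Dec/2013:07:56:35 -0800] "GET /robots.txt HTTP/1.1" 301 513 "-" "Mozilla/4.0 (compatible; http://search.thunderstone.com/texis/websearch/about.html)"',
    '206.183.1.74 - - [24/Dec/2013:07:56:35 -0800] "GET / HTTP/1.1" 301 492 "-" "Mozilla/4.0 (compatible; http://search.thunderstone.com/texis/websearch/about.html)"',
    '206.183.1.74 - - [24/Dec/2013:07:56:35 -0800] "GET / HTTP/1.1" 200 870 "-" "Mozilla/4.0 (compatible; http://search.thunderstone.com/texis/websearch/about.html)"',
]

for ip, path, status, agent in non_checklink_hits(SAMPLE):
    print(ip, path, status)
```

Matching on the user-agent substring is the crude part; a visitor that lies about its user agent would need to be caught by IP instead.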

robots.txt currently says, in full,

User-Agent: W3C-checklink
Disallow:

User-Agent: *
Disallow: /

It goes beyond "Which part of {asterisk} did you not understand?"* They seem to go out of their way to look for roboted-out files and under-the-radar sites. Compare this thread [webmasterworld.com] from early 2011. I've never seen them on my "real" site, only on assorted backwaters.
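To make the point concrete, here is a rough check of that robots.txt using Python's standard urllib.robotparser. This is only a sketch of how one standards-following parser reads the file, not a claim about how any particular crawler behaves. The checklink version number in the second user-agent string is made up for the example.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt quoted above, in full.
ROBOTS_TXT = """\
User-Agent: W3C-checklink
Disallow:

User-Agent: *
Disallow: /
"""

def allowed(user_agent, path="/"):
    """Ask Python's robots.txt parser whether user_agent may fetch path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)

# User-agent string taken from the log lines in this post.
THUNDERSTONE_UA = ("Mozilla/4.0 (compatible; "
                   "http://search.thunderstone.com/texis/websearch/about.html)")
# Hypothetical version number, for illustration only.
CHECKLINK_UA = "W3C-checklink/4.81"

print("thunderstone may fetch /:", allowed(THUNDERSTONE_UA))
print("checklink may fetch /:", allowed(CHECKLINK_UA))
```

Under this parser the Thunderstone user agent falls through to the catch-all record and is refused, while checklink matches its own record, where the empty Disallow means nothing is off limits.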

Cursory Forums search suggests they've been at it-- whatever "it" is-- since 2001**. Their current home is 206.183.0.0/19.
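For reference, a /19 is 8192 addresses. A quick sketch with Python's ipaddress module shows what the range covers and confirms that the logged address sits inside it. The function name is my own.

```python
import ipaddress

# The range named in this post.
THUNDERSTONE_RANGE = ipaddress.ip_network("206.183.0.0/19")

def in_range(ip):
    """True if the dotted-quad address falls inside the /19."""
    return ipaddress.ip_address(ip) in THUNDERSTONE_RANGE

print(THUNDERSTONE_RANGE[0], "-", THUNDERSTONE_RANGE[-1])
print(in_range("206.183.1.74"))  # the address from the log lines
```

The range runs from 206.183.0.0 through 206.183.31.255, so anyone blocking by hand needs to cover third octets 0 to 31.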


* Or, for that matter, "Which part of 301 did you not understand?" Note the pattern of redirects. My host's logs can be a bit hiccupy, so it's not even certain that they asked for robots.txt before asking for the front page-- currently the host's "coming soon" default, so neener-neener. What is certain is that they never bothered to follow the redirect.
** I had no idea there was such a thing as a lapsed or inactive member. That's how long ago 2001 was.

 

wilderness

WebmasterWorld Senior Member; Top Contributor of All Time; 10+ Year Member; Top Contributor of the Month



 
Msg#: 4633530 posted 4:22 pm on Jan 1, 2014 (gmt 0)

Not sure how I missed this.

In brief, a third-party harvester.
From the main page of their site:
Thunderstone Software LLC is an independent R&D company that has been providing high-performance state-of-the-art solutions to intelligent information retrieval and management problems for over 33 years. Our flagship product, Texis™, is the most comprehensive text retrieval and publishing software available. In one package Texis provides every full-text, SQL, multimedia management, and dynamic publishing operation needed for an enterprise search application.
end of quote

Pretty straightforward; not sure why you need an explanation.

More than a decade ago, T-h-u-n-d-e-r-s-t-o-n-e ran primarily from a Road Runner IP. Perhaps, even though third parties were using the software, the org's own server was doing the crawling at that time.
Today, it seems the user's IP is the active server for the software.

BTW, T-h-u-n-d-e-r-s-t-o-n-e failed to honor robots.txt more than a decade ago and showed no grasp of the protocol, and it is likely that disregard continues.

keyplyr

WebmasterWorld Senior Member; Top Contributor of All Time; 10+ Year Member; Top Contributor of the Month



 
Msg#: 4633530 posted 6:28 pm on Jan 1, 2014 (gmt 0)



I use a Thunderstone product as my site search. Great stuff.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved