From the name, one would swear it had been around forever. But I never set eyes on it until late July (2023).
IP: 182.22.30 (Yahoo Japan)
UA:
Mozilla/5.0 (compatible; Y!J-WSC/1.0; +https://yahoo.jp/3BSZgF)
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko; compatible; Y!J-WSC/1.0; +https://yahoo.jp/3BSZgF) Chrome/113.0.0.0 Safari/537.36
robots.txt: yes, appears to be compliant
headers: humanoid
The first (shorter) UA is for pages and robots.txt; the longer one is for scripts and stylesheets. It follows a slightly unusual pattern, where each page request is accompanied by at most one script or stylesheet, giving the HTML as referer. (I think most search engines do this now, probably wisely, since some sites will serve different stylesheets under the same name.) If the first stylesheet associated with a page is something it has previously picked up, it gets something further down the list, if any. To date I haven’t seen it pick up other types of supporting files such as images or fonts.
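For anyone who wants to separate the two request types in their own logs, here is a minimal sketch in Python. It relies only on the two UA strings quoted above; the function name and the labels "page" and "supporting" are my own invention for illustration, and the test for browser-style tokens is an assumption based on nothing more than those two samples.

```python
# Token common to both user-agent strings quoted above.
YJ_TOKEN = "Y!J-WSC/1.0"


def classify_yj_wsc(user_agent: str) -> str:
    """Return 'page', 'supporting', or 'other' for a user-agent string.

    'page'       -> the short UA (seen on pages and robots.txt)
    'supporting' -> the long, browser-like UA (seen on scripts and stylesheets)
    'other'      -> not this robot at all
    """
    if YJ_TOKEN not in user_agent:
        return "other"
    # Assumption: only the longer UA carries browser-style tokens.
    if "AppleWebKit/" in user_agent and "Chrome/" in user_agent:
        return "supporting"
    return "page"


short_ua = "Mozilla/5.0 (compatible; Y!J-WSC/1.0; +https://yahoo.jp/3BSZgF)"
print(classify_yj_wsc(short_ua))
```

If the robot changes its Chrome version number, this should still work, since it checks only for the presence of the tokens and not for a specific version.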
“Appears to be” compliant because, thanks to showing up out of nowhere with humanoid headers, it never went through the usual access tests. But so far it hasn't requested anything from a roboted-out directory, notably including the analytics script that is attached to all pages.
Oh, and the URL in the UA redirects to an information page in Japanese. (Forgot to check this before posting.) I have no Japanese-language content.
:: business with
lang ?= ?"(?!en|iu|de|kl|la|fr) to ensure I'm not talking out of my hat ::
Oh, look at that. One occurrence of
<i lang = "ja">kami-shimo</i> and three of
<i lang = "ja">sake</i>, all in a single book. It would be entertaining if this had proved to be the very first page the robot homed in on, but this is not the case.
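The same search can be done outside a text editor. What follows is only a rough sketch in Python using the regular expression quoted above, extended with a capture group so the offending value is reported; the function name and the sample string are made up for illustration. Note that the lookahead tests prefixes, so a value such as "de-AT" is skipped along with plain "de".

```python
import re

# The pattern from the post, plus a capture group for the attribute value.
UNEXPECTED_LANG = re.compile(r'lang ?= ?"(?!en|iu|de|kl|la|fr)([^"]*)"')


def find_unexpected_langs(html: str) -> list[str]:
    """Return every lang value that does not start with an expected code."""
    return UNEXPECTED_LANG.findall(html)


sample = '<p lang="en">A cup of <i lang = "ja">sake</i>, <i lang="fr">merci</i>.</p>'
print(find_unexpected_langs(sample))
```

Running it over each file in a directory and printing the filename next to any non-empty result would reproduce the check described here.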
<tangent>
Long ago I used this very page for experimenting with G*** translate. I learned that they don’t look at "lang" attributes, and hence come to grief over “sake”.
</tangent>