lucy24 - 2:46 am on Jan 16, 2012 (gmt 0)
Whew. I was all set to start a new "oBot where art thou?" thread, but looks like I squeaked under by a few days. Do they give you three months to wake slumbering threads?
Did any of you try the UA links? The long version leads to IBM's Search page, with parameters filled in:
www-01.ibm.com/common/ssi/apilite?infotype=PM&infosubt=AB&doctype=XSO_* or XMS_* or XMT_* or XSE_* or XIC_* or XIA_* or XSS_* or XSF_* or XSD_* or XEN_* or XBU_*&lastdays=1825&ctvwcode=US&appname=GTSE_GT_GT_USEN_CS&additional=summary&contents=C0_CST and keeponlit
Goodness me. The only thing missing from the resultant search results is any information about oBot. A polite inquiry to IBM asking whether this is their robot has so far met with, hm, polite silence.
The short version ...
... is utterly fascinating, because I tried it just hours ago and got the Microsoft server's "ain't no such page" screen. It must be in a better mood now, because there's a bona fide "What is oBot?" page. Says they:
- user-agent: Mozilla/5.0 (compatible; oBot/2.3.1; +http://filterdb.iss.net/crawler/)
- our IP ranges 206.253.224.x or 194.153.113.x
Hm, don't see the Long Version in there anywhere, do you? But they came visiting from the very same IP. And thanks, IBM, for the heads-up about 194.153. Wouldn't have known that.
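For anyone who wants to flag these visits in their own logs, here is a rough Python sketch. It checks the two things the "What is oBot?" page publishes: the UA token and the two IP ranges. I'm assuming the "x" in each range means the whole last octet (a /24), and the function name is my own invention, not anything IBM provides. It reports the two checks separately, since a visit can come from their IP without the published UA.

```python
import ipaddress
import re

# Ranges quoted on the "What is oBot?" page; treating ".x" as a /24 is an assumption.
OBOT_NETWORKS = [
    ipaddress.ip_network("206.253.224.0/24"),
    ipaddress.ip_network("194.153.113.0/24"),
]

# Matches the "oBot/<version>" token from the published user-agent string.
OBOT_UA = re.compile(r"\boBot/\d")


def looks_like_obot(user_agent, remote_ip):
    """Return (ua_matches, ip_matches) for one logged request (hypothetical helper)."""
    ua_matches = bool(OBOT_UA.search(user_agent))
    address = ipaddress.ip_address(remote_ip)
    ip_matches = any(address in network for network in OBOT_NETWORKS)
    return ua_matches, ip_matches
```

A result of (False, True) would be the interesting case: right neighborhood, different name tag.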
I have never met this robot before in my life. But they must know me, because their first visit consisted entirely of requests for HEADs of the image files that go with my front page. Not the current files that go with the current front page; datestamps tell me their previous visit can't have been later than December 2010 (not a typo). Some of them happen to still exist, though no longer linked to the front page. For the rest, they came back a few hours later and went away still unsatisfied.
A week later they were back with a fresh shopping list. This was an educational visit for me. First I learned that I must have a Unix server, because they asked for a couple of lower-case files whose real names are Title Case, so they got nothing but 404s. No more futzing about with HEAD; this time they asked for the whole thing.
After that they swung by my front page-- the current one-- and got up to speed on the images. (Which, incidentally, they always requested with the correct casing.)
A minute later* they came dashing back from the parking lot, apparently having overlooked the last two items on the list. First stop: an utterly random painting that I've never even bothered to make a page for. It was moved from its original location months ago, so the 404 would seem reasonable... except that the said original location was also roboted-out** months ago. A fact they must surely have noticed, since by this time they'd read robots.txt three separate times.
Second stop: an html file that they would have gotten handily if only they'd put it in Title Case. It happens to be the parent file of the two 404s they got earlier-- meaning that they must have picked it up, too, on their previous visit.
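If you want to spot this kind of near miss in your own 404s, something like the following would do it. It is only a sketch: the function name and the sample paths in the usage note are made up, and it assumes you already have a list of the paths that really exist on the server.

```python
def case_variants(requested_path, real_paths):
    """List real paths that differ from the requested one only by letter case."""
    wanted = requested_path.lower()
    return [
        path
        for path in real_paths
        if path.lower() == wanted and path != requested_path
    ]
```

For example, case_variants("/paintings/title.html", ["/paintings/Title.html"]) returns ["/paintings/Title.html"], which is exactly the situation where a case-sensitive server hands out a 404 for a file that does exist.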
* Exactly a minute, as a matter of fact. Well, maybe it's coincidence.
** Someone hereabouts brilliantly suggested that as an alternative to redirecting forever, I could simply robot-out the nonexistent directories.
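To check whether a request like that painting one was really covered by a Disallow, Python's standard urllib.robotparser can replay the rule. This is a sketch under assumptions: the directory name "/old-gallery/" and the file names are placeholders, not my real paths, and the rules are fed in as text rather than fetched from a live site.

```python
from urllib.robotparser import RobotFileParser

# Placeholder rules; "/old-gallery/" stands in for a roboted-out directory.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /old-gallery/
"""


def is_allowed(robots_txt, user_agent, path):
    """Return True if robots_txt permits user_agent to fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)
```

With these rules, is_allowed(SAMPLE_ROBOTS_TXT, "oBot", "/old-gallery/painting.jpg") comes back False, so a well-behaved robot should never have asked for it.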