
Forum Moderators: Ocean10000 & incrediBILL



IBM iss.net IPs



12:18 am on Oct 20, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member

Old bot [google.com...] with new name. Or new bot with old name...

This month, from mothership IBM's Internet Security Systems (iss.net) c/o Germany: [projecthoneypot.org...]
Mozilla/5.0 (compatible; oBot/2.3.1; +http://filterdb.iss.net/crawler/)

14:12:26 / GET
14:12:28 / HEAD
14:12:29 / HEAD
14:12:30 / GET
14:12:34 / HEAD

robots.txt? NO

Last month and prior, the same five-hit, GET-HEAD, no-robots pattern from sibling IBM Deutschland IPs on two different sites using two variations: [projecthoneypot.org...]
Mozilla/5.0 (compatible; oBot/2.3.1; +http://filterdb.iss.net/crawler/) [projecthoneypot.org...]
Mozilla/5.0 (compatible; oBot/2.3.1; +http://www-935.ibm.com/services/us/index.wss/detail/iss/a1029077?cntxt=a1027244)

12:50:16 / GET
12:50:20 / HEAD
12:50:22 / HEAD
12:50:23 / GET
12:50:27 / HEAD

robots.txt? NO
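
For anyone who wants to flag this visit shape in their own logs, here is a minimal sketch. It assumes hits for a single IP have already been pulled out in the "HH:MM:SS / METHOD" form shown above. The function names and the 15-second window are illustrative assumptions, not anything published by IBM.

```python
from datetime import datetime

# Method sequence reported in this thread for a single oBot visit.
OBOT_PATTERN = ["GET", "HEAD", "HEAD", "GET", "HEAD"]


def parse_hits(lines):
    """Parse 'HH:MM:SS / METHOD' lines into (time, method) pairs."""
    hits = []
    for line in lines:
        stamp, _, method = line.partition(" / ")
        hits.append((datetime.strptime(stamp.strip(), "%H:%M:%S"), method.strip()))
    return hits


def matches_obot_pattern(hits, max_span_seconds=15):
    """True if the hits are exactly GET-HEAD-HEAD-GET-HEAD within a short window.

    max_span_seconds is an assumed threshold; both visits logged above
    finished in under 12 seconds.
    """
    methods = [method for _, method in hits]
    if methods != OBOT_PATTERN:
        return False
    span = (hits[-1][0] - hits[0][0]).total_seconds()
    return 0 <= span <= max_span_seconds


visit = [
    "14:12:26 / GET",
    "14:12:28 / HEAD",
    "14:12:29 / HEAD",
    "14:12:30 / GET",
    "14:12:34 / HEAD",
]
print(matches_obot_pattern(parse_hits(visit)))
```

Matching on method order alone would also catch unrelated clients, so treat a hit as a hint to check the UA and IP rather than as proof.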


4:59 am on Oct 20, 2011 (gmt 0)

10+ Year Member

It's been in all my sites within the last 24 hours. Very busy bot!


5:12 am on Oct 20, 2011 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

"Mozilla/5.0 (compatible; oBot/2.3.1; +http://filterdb.iss.net/crawler/)"

robots.txt: no

Got 403s


2:46 am on Jan 16, 2012 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month

Whew. I was all set to start a new "oBot where art thou?" thread, but looks like I squeaked under by a few days. Do they give you three months to wake slumbering threads?

Did any of you try the UA links? The long version


leads to IBM's Search page, with parameters filled in:

www-01.ibm.com/common/ssi/apilite?infotype=PM&infosubt=AB&doctype=XSO_* or XMS_* or XMT_* or XSE_* or XIC_* or XIA_* or XSS_* or XSF_* or XSD_* or XEN_* or XBU_*&lastdays=1825&ctvwcode=US&appname=GTSE_GT_GT_USEN_CS&additional=summary&contents=C0_CST and keeponlit

Goodness me. The only thing missing from the resultant search results is any information about oBot. A polite inquiry to IBM asking whether this is their robot has so far met with, hm, polite silence.

The short version ...


... is utterly fascinating, because I tried it just hours ago and got the Microsoft server's "ain't no such page" screen. It must be in a better mood now, because there's a bona fide "What is oBot?" page. Says they:

- user-agent: Mozilla/5.0 (compatible; oBot/2.3.1; +http://filterdb.iss.net/crawler/)
- our IP ranges 206.253.224.x or 194.153.113.x

Hm, don't see the Long Version in there anywhere, do you? But they came visiting from the very same IP. And thanks, IBM, for the heads-up about 194.153. Wouldn't have known that.
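
If you want to test a visitor against what that page publishes, a rough sketch follows. It reads the trailing "x" as a full /24, which is an assumption, and the function names are made up for illustration. The UA check looks only for the "oBot/" token, so both the short and long variants seen in this thread would match.

```python
import ipaddress

# Ranges quoted from the "What is oBot?" page; treating ".x" as a /24 is an assumption.
OBOT_NETWORKS = [
    ipaddress.ip_network("206.253.224.0/24"),
    ipaddress.ip_network("194.153.113.0/24"),
]

# Token common to every oBot user-agent string quoted in this thread.
OBOT_UA_TOKEN = "oBot/"


def ip_in_published_ranges(ip):
    """True if the address falls inside one of the published oBot ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in OBOT_NETWORKS)


def ua_claims_obot(user_agent):
    """True if the user-agent string identifies itself as oBot."""
    return OBOT_UA_TOKEN in user_agent


ua = "Mozilla/5.0 (compatible; oBot/2.3.1; +http://filterdb.iss.net/crawler/)"
print(ip_in_published_ranges("194.153.113.7"), ua_claims_obot(ua))
```

Keeping the two checks separate makes it easy to spot the interesting cases: a UA that claims oBot from outside the ranges, or a hit from inside the ranges under some other name.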

I have never met this robot before in my life. But they must know me, because their first visit consisted entirely of requests for HEADs of the image files that go with my front page. Not the current files that go with the current front page; datestamps tell me their previous visit can't have been later than December 2010 (not a typo). Some of them happen to still exist, though no longer linked to the front page. For the rest, they came back a few hours later and went away still unsatisfied.

A week later they were back with a fresh shopping list. This was an educational visit for me. First I learned that I must have a Unix server, because they asked for a couple of lower-case files whose real names are Title Case, so they got nothing but 404s. No more futzing about with HEAD; this time they asked for the whole thing.

After that they swung by my front page-- the current one-- and got up to speed on the images. (Which, incidentally, they always requested with the correct casing.)

A minute later* they came dashing back from the parking lot, apparently having overlooked the last two items on the list. First stop: an utterly random painting that I've never even bothered to make a page for. It was moved from its original location months ago, so the 404 would seem reasonable... except that the said original location was also roboted-out** months ago. A fact they must surely have noticed, since by this time they'd read robots.txt three separate times.

Second stop: an html file that they would have gotten handily if only they'd put it in Title Case. It happens to be the parent file of the two 404s they got earlier-- meaning that they must have picked it up, too, on their previous visit.

* Exactly a minute, as a matter of fact. Well, maybe it's coincidence.
** Someone hereabouts brilliantly suggested that as an alternative to redirecting forever, I could simply robot-out the nonexistent directories.
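
To see what a well-behaved robot should conclude from a roboted-out directory, here is a small sketch using Python's standard urllib.robotparser. The directory name, file names and example.com host are hypothetical stand-ins, not the real paths discussed above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that robots-out a directory which no longer exists.
ROBOTS_TXT = """\
User-agent: *
Disallow: /old-gallery/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A compliant crawler would skip the first URL and may fetch the second.
print(is_allowed(ROBOTS_TXT, "oBot", "http://example.com/old-gallery/painting.jpg"))
print(is_allowed(ROBOTS_TXT, "oBot", "http://example.com/index.html"))
```

A robot that has read robots.txt and still asks for something under the disallowed directory is ignoring the file, whatever it goes on to get back from the server.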


4:55 am on Jan 16, 2012 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month

Well, dang and blast. Here I'd already gone and blocked them on robots.txt grounds, before ever consulting the horse's mouth [cobion.com] via their own links.

There we learn among other things:
IBM Proventia Web Filter database categories
<snip, snip>
Religion: Includes Web sites with religious content, information about the five main religions, and religious communities that have emerged out of these religions.
Sects: This category contains sites about sects, cults, occultism, Satanism etc.

Tough luck, all you Sikhs, Parsees and Baha'is. Guess you're just cults.
