Msg#: 4411604 posted 11:53 pm on Jan 27, 2012 (gmt 0)
Long story short: Jury's still out. It behaved in that it asked for robots.txt and when that was 403'd, it went for root.
Long-winded details: I usually generate a generic (blanket Disallow) robots.txt and make it available to all but a chosen few major SEs. As it turns out, I recently 403'd that step for some hosts (e.g., code.google.com) that rarely ask for it, in order to cut down on needless rerererewriting. Now, if/when ldspider comes back, it'll be able to read the no-frills file, after which I'll know if it heeds it.
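
For anyone wondering what "heeding" a blanket Disallow means in practice, here is a rough sketch using Python's standard urllib.robotparser. The file contents, the example.com URLs, and the is_allowed helper are my assumptions for illustration, not the actual file or setup described above. It only shows what a compliant parser concludes from such a file. It says nothing about what ldspider itself does, and how a bot reacts to a 403 on robots.txt varies from crawler to crawler.

```python
from urllib.robotparser import RobotFileParser

# Assumed contents of a generic blanket-Disallow robots.txt
# (illustrative only; not the poster's actual file).
BLANKET_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a compliant parser would let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A bot that honors the file should stay away from root and everything else.
    for url in ("http://example.com/", "http://example.com/any/page.html"):
        print(url, is_allowed(BLANKET_ROBOTS_TXT, "ldspider", url))
```

If the bot reads a file like this and still goes for root on its next visit, that would be the sign it doesn't heed it.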