These spiders/crawlers: how controllable are they, exactly? I mean, could one be sent out to crawl, say, just a single site?
It is possible to target just one site, but to the webmaster it would look like a blizzard of requests. Normally a spider on a large run randomises the order of the URLs to be fetched, so that no single website takes an excessive load.
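A rough sketch of that idea: shuffle the frontier so hits to any one host are spread across the whole run, and add a per-host politeness delay. The function name, the 2-second delay, and the (url, wait) interface are all my own illustration, not anything a real spider is known to use.

```python
import random
import time
from collections import defaultdict
from urllib.parse import urlparse

def polite_crawl_order(urls, delay=2.0):
    """Shuffle the URL frontier, then yield (url, seconds_to_wait) pairs
    so that successive requests to the same host are at least `delay`
    seconds apart. `delay` is an illustrative politeness interval."""
    urls = list(urls)
    random.shuffle(urls)           # spread each site's URLs across the run
    last_hit = defaultdict(float)  # host -> scheduled time of our last request
    for url in urls:
        host = urlparse(url).netloc
        now = time.monotonic()
        # If we hit this host recently, wait out the remainder of the delay.
        wait = max(0.0, last_hit[host] + delay - now)
        last_hit[host] = now + wait
        yield url, wait

# A caller would time.sleep(wait) before each fetch; two URLs on the
# same host come back with a gap, URLs on fresh hosts come back with 0.
```

This is only the scheduling half; a real crawler would also honour robots.txt and cap concurrent connections per host.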
I heard about one airline fare comparison site that was banned from an airline's site for putting too high a load on its servers. Fare data is time-sensitive, so it has to be respidered frequently.
It would then scan all the sites, 10,000,000 or however many, and return the results to Dipsie without us ever knowing about it. Maybe that's why we haven't seen Dipsie's spider in our stats logs.
I'd be very surprised if anyone has seen Dipsie's spider. It has all the appearance of vaporware: loads of buzzwords, piles of PR and press releases, and no results. :)