Msg#: 4038054 posted 10:45 pm on Dec 8, 2009 (gmt 0)
Such behavior can happen in search engines when they are cleaning up a large list of URLs to eliminate those that have been disallowed. As this process is likely to be done in parallel, it can manifest itself as described above.
Msg#: 4038054 posted 3:09 am on Dec 9, 2009 (gmt 0)
All requests on my sites look normal today -- and that's actually a new thing, because prior to last month, Twiceler apparently did not understand multi-user-agent policy records in robots.txt, and as a result didn't crawl the sites. That's changed now, and they're crawling away (at a normal rate).
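For anyone unsure what a multi-user-agent policy record looks like, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `ExampleBot` name, the `example.com` URLs and the `allowed` helper are all made up for illustration; this is not taken from any real site, and it says nothing about how Twiceler itself parses the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the first record names two user agents that
# share one set of rules; the second record covers everyone else.
ROBOTS_TXT = """\
User-agent: Twiceler
User-agent: ExampleBot
Disallow: /private/

User-agent: *
Disallow: /
"""


def allowed(agent, url, robots_txt=ROBOTS_TXT):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ("Twiceler", "ExampleBot", "OtherBot"):
        for path in ("/index.html", "/private/page.html"):
            url = "http://example.com" + path
            print(agent, path, allowed(agent, url))
```

A parser that handles the grouped record applies the `/private/` rule to both named bots and the blanket `Disallow: /` only to everyone else.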
Combined with Lord Majestic's speculation above and the "ramp" hosts with no UA, it wouldn't surprise me if they're preparing to roll out a new index some time soon.
Msg#: 4038054 posted 1:47 am on Dec 18, 2009 (gmt 0)
FWIW... Twiceler's still hammering away at the same site, every single day, usually in the late afternoon/early evening (Pacific). The next time I'm procrastinating something dreadful, I'll e-mail them about their overkill hits to robots.txt:
Msg#: 4038054 posted 7:05 am on Dec 29, 2009 (gmt 0)
Despite what Cuil's PR people would claim, in Irish, "Cuil" means fly or bug. Despite the claims of genius made about its founders, Cuil is a pest and sends zero traffic to two of my sites. One of them is one of the largest Irish web directories and the other is a very large domain history and domain statistics website. They've been hammering away for months, but normally when they start getting problematic they automatically get slapped with a 503. They've been 403ed on the web directory site for not following robots.txt.
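As a rough illustration of the automatic 503 idea, here is a minimal sliding-window sketch in Python. The `CrawlerThrottle` class, its thresholds and the per-user-agent keying are assumptions made for the example, not a description of the setup actually running on these sites.

```python
from collections import defaultdict, deque


class CrawlerThrottle:
    """Hypothetical per-user-agent limiter: too many hits in a window gets a 503."""

    def __init__(self, max_hits=10, window_seconds=60.0):
        self.max_hits = max_hits          # assumed threshold, tune per site
        self.window = window_seconds      # assumed window length in seconds
        self.hits = defaultdict(deque)    # user agent -> recent request times

    def status_for(self, user_agent, now):
        """Record a request at time `now` and return the HTTP status to send."""
        recent = self.hits[user_agent]
        # Forget requests that have fallen out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        recent.append(now)
        return 503 if len(recent) > self.max_hits else 200


if __name__ == "__main__":
    throttle = CrawlerThrottle(max_hits=2, window_seconds=10.0)
    for t in (0.0, 1.0, 2.0, 20.0):
        print(t, throttle.status_for("Twiceler", t))
```

A 503 tells a well-behaved crawler to back off and retry later, whereas a 403 like the one on the directory site is an outright refusal.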
Lord Majestic's speculation is a possibility. The last I heard of Cuil was that it was trying some social search engine experiments and some Twitter stuff was being integrated.