| 9:45 pm on Dec 6, 2009 (gmt 0)|
Cuil's stealthy behavior also comes from 188.8.131.52
ramp1hq.cuil.com at Layer42, and they have earned themselves a ban.
| 2:21 pm on Dec 7, 2009 (gmt 0)|
I've had this thing requesting robots.txt and not proceeding any further (it wouldn't get in anyway) for more than a few days now.
These IPs make multiple requests in no specific order, but quite close together.
| 10:45 pm on Dec 8, 2009 (gmt 0)|
Such behavior can happen in search engines when they are cleaning up a large list of URLs to eliminate those that have been disallowed. As this process is likely to be done in parallel, several workers may each request robots.txt independently, so it can manifest itself as described above.
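To make the idea concrete, here is a minimal sketch of that kind of cleanup pass. It is only an illustration, not a description of how Cuil actually works: the host, the robots.txt body, the `ExampleBot` agent name and the helper names are all made up, and the rules are read from a dictionary where a real crawler would fetch /robots.txt over HTTP.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt bodies keyed by host. A real crawler would fetch
# each host's /robots.txt here, which is the request that appears in logs.
ROBOTS_BY_HOST = {
    "example.com": "User-agent: *\nDisallow: /private/\n",
}

def load_rules(host):
    """Parse the (stand-in) robots.txt for one host."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_BY_HOST.get(host, "").splitlines())
    return parser

def is_allowed(url, user_agent="ExampleBot"):
    """Each call loads the rules again, as an uncoordinated worker might."""
    host = urlsplit(url).netloc
    return load_rules(host).can_fetch(user_agent, url)

def filter_urls(urls, workers=4):
    """Check URLs in parallel and keep only those that are not disallowed."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        flags = list(pool.map(is_allowed, urls))
    return [url for url, ok in zip(urls, flags) if ok]
```

Because every worker loads the rules for itself, a site with many queued URLs would see a burst of robots.txt requests in no particular order and nothing else, which matches what was described.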
| 3:09 am on Dec 9, 2009 (gmt 0)|
All requests on my sites look normal today -- and that's actually a new thing, because prior to last month, Twiceler apparently did not understand multi-user-agent policy records in robots.txt, and as a result didn't crawl the sites. That's changed now, and they're crawling away (at a normal rate).
Combined with Lord Majestic's speculation above and the "ramp" hosts with no UA, it wouldn't surprise me if they're preparing to roll out a new index some time soon.
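For anyone unsure what a multi-user-agent record looks like, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `ExampleBot` and `OtherBot` names and the `allowed` helper are hypothetical and are not taken from any of the sites discussed here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one record shared by two user agents,
# followed by a catch-all record that blocks everyone else.
ROBOTS_TXT = """\
User-agent: Twiceler
User-agent: ExampleBot
Disallow: /archive/

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def allowed(agent, path):
    """Return True if the agent may fetch the path under the rules above."""
    return parser.can_fetch(agent, "http://example.com" + path)

print(allowed("Twiceler", "/index.html"))        # shared record applies
print(allowed("ExampleBot", "/archive/old.html"))
print(allowed("OtherBot", "/index.html"))        # falls to the catch-all
```

A parser that handles the grouping applies the shared record to both named agents. One that ignores the grouping might fall through to the catch-all instead, though that is only a guess at the failure mode.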
| 1:47 am on Dec 18, 2009 (gmt 0)|
FWIW... Twiceler's still hammering away at the same site, every single day, usually in the late afternoon/early evening (Pacific). The next time I'm procrastinating something dreadful, I'll e-mail them about their overkill hits to robots.txt:
Do any of you ever get any traffic from them? I don't.
| 7:05 am on Dec 29, 2009 (gmt 0)|
Despite what Cuil's PR people would claim, in Irish, "Cuil" means fly or bug. For all the claims of genius made about its founders, Cuil is a pest and sends zero traffic to two of my sites. One of them is one of the largest Irish web directories and the other is a very large domain history and domain statistics website. They've been hammering away for months, but normally when they start getting problematic they automatically get slapped with a 503. They've been 403ed on the web directory site for not following robots.txt.
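The automatic 503 can be sketched roughly as below. This is not the setup running on those sites, just an illustration of the idea: the threshold, the window, the `BadBot` name and the `status_for` function are all assumptions.

```python
import time
from collections import defaultdict, deque

BANNED = {"BadBot"}          # hypothetical: agents banned for ignoring robots.txt
MAX_HITS, WINDOW = 5, 10.0   # hypothetical limit: 5 requests per 10 seconds

_hits = defaultdict(deque)   # recent request times per client

def status_for(client, now=None):
    """Return the HTTP status to send for this client's request."""
    if client in BANNED:
        return 403           # permanent block
    now = time.monotonic() if now is None else now
    recent = _hits[client]
    # Forget requests that have aged out of the sliding window.
    while recent and now - recent[0] >= WINDOW:
        recent.popleft()
    if len(recent) >= MAX_HITS:
        return 503           # temporary: too many requests in the window
    recent.append(now)
    return 200
```

The 503 clears by itself once the crawler slows down, while the 403 stays until the ban is lifted by hand.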
Lord Majestic's speculation is a possibility. The last I heard, Cuil was trying some social search engine experiments and integrating some Twitter stuff.