This claims to be a site that scours Twitter for links posted in tweets. So how come it was crawling one of my sites? I don't know. I just know it won't be doing that again!
Host: 74.126.19.nn.static.a2webhosting.com
UA: http://twitturls.com
robots.txt? NO
referer log-spam? YES: http://twitturls.com
Twitturls-related UAs/Hosts also misbehave in other ways. Here's the above, erm, twit on the same July day, hitting the same file rapid-fire despite six-plus 403s, but this time using the UA in the OP and below:
07/08 12:24:22
07/08 12:24:22
07/08 12:24:22
07/08 12:24:22
07/08 12:24:22
07/08 12:24:22
And from last May (exact same Host; ditto as far back as December 2008):
74.126.19.nn.static.a2webhosting.com
Mozilla/5.0 (compatible; Twitturls; +http://twitturls.com)
robots.txt? NO
referer log-spam? YES: http://twitturls.com
Note: Twitturls is related to Twitturly and the latter UA misbehaves the same way, from no-robots to going where no Tweet has gone before to log-spamming for its site. E.g.:
UA: Twitturly / v0.5 (from: .algx.net)
UA: Twitturly / v0.6 (from: .amazonaws.com)
robots.txt? NO
referer log-spam? YES: http://twitturly.com
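For anyone who wants to run the same two checks over a log instead of eyeballing it, here is a rough Python sketch. It assumes the log has already been parsed into dicts with `host`, `ua`, `path` and `referer` keys (that record shape, and the names `site_label` and `audit_hosts`, are made up for illustration). It treats a referer as log-spam when the main label of its domain also shows up in the UA string, which is a crude heuristic and will misjudge domains like `example.co.uk`.

```python
from collections import defaultdict
from urllib.parse import urlparse


def site_label(url):
    """Return the second-to-last label of the URL's hostname, e.g. 'twitturls'."""
    host = urlparse(url).hostname
    if not host:
        return None
    parts = host.lower().split(".")
    return parts[-2] if len(parts) >= 2 else parts[0]


def audit_hosts(records):
    """Summarize, per host, the two checks used in this thread.

    records: iterable of dicts with keys host, ua, path, referer
    (a hypothetical shape; adapt it to your own log parser).
    """
    report = defaultdict(lambda: {"robots_txt": False, "spam_referers": set()})
    for rec in records:
        entry = report[rec["host"]]
        # Did this host ever ask for robots.txt?
        if rec["path"] == "/robots.txt":
            entry["robots_txt"] = True
        # Heuristic: the referer advertises the same site the UA names.
        label = site_label(rec["referer"]) if rec["referer"] else None
        if label and label in rec["ua"].lower():
            entry["spam_referers"].add(rec["referer"])
    return dict(report)
```

A host whose entry shows `robots_txt` as `False` and a non-empty `spam_referers` set matches the "robots.txt? NO / referer log-spam? YES" pattern above.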
Yep. Been watching -- and blocking -- these guys for a while :)
Twitturly / v0.5
Twitturly / v0.6
@keyplyr: Thus far, I'd say almost ALL of the Twitter-related UAs I see either come from already-blocked hosts, typically server farms with long-standing histories of bad bot-running, or are already-blocked bots themselves.
Since the first of the year, I've found that 403'ing bots with "twit" in the UA doesn't keep real people from following the URLs mined from tweets by the bots. (YMMV) At first I tried a separate whitelist, but it got too unwieldy and time-consuming, and the overlaps too confusing.
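The "twit" rule is simple enough to sketch in a few lines. The version below is Python WSGI middleware, which is only an assumption for illustration (nothing here says what server or config the rule actually lives in), and `block_twit_bots` is a made-up name. It does a case-insensitive substring match on the User-Agent and answers 403 on a hit; everything else passes through untouched.

```python
def block_twit_bots(app):
    """Wrap a WSGI app so any UA containing 'twit' gets a 403.

    Hypothetical helper: the substring rule comes from the post above,
    the WSGI packaging is just one possible way to apply it.
    """
    def middleware(environ, start_response):
        ua = environ.get("HTTP_USER_AGENT", "")
        # Case-insensitive match, so 'Twitturls' and 'Twitturly' both hit.
        if "twit" in ua.lower():
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        # Everything else goes to the wrapped application unchanged.
        return app(environ, start_response)
    return middleware
```

Because the match is on the UA only, a person clicking a tweeted link in an ordinary browser is unaffected, which fits the observation above. Any client that puts "twit" in its own UA would be blocked too, so YMMV applies here as well.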
And on the plus side, if I'm eyeballing logs when the Twitter bot pack attacks -- usually 10-15 Hosts and/or bots w/in 5 minutes to 1 filename; none with legit referers, of course -- I know someone's tweeted that page/link. Then it's easy enough to check search.twitter.com and see who said what.
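If you'd rather not wait until you happen to be eyeballing logs, the "pack attack" pattern can be flagged automatically. Here is a minimal Python sketch, assuming hits are already parsed into `(timestamp, host, path)` tuples sorted by time (the tuple shape and the name `find_bursts` are hypothetical). It uses the low end of the numbers above as defaults, 10 distinct hosts within 5 minutes on one filename, and counts hosts only; counting distinct UAs as well would be an easy variation.

```python
from collections import defaultdict, deque
from datetime import timedelta


def find_bursts(hits, min_hosts=10, window=timedelta(minutes=5)):
    """Return {path: timestamp} for paths hit by many distinct hosts at once.

    hits: iterable of (timestamp, host, path), sorted by timestamp.
    A path is flagged the first time at least `min_hosts` distinct hosts
    have requested it within `window`.
    """
    recent = defaultdict(deque)  # path -> deque of (timestamp, host)
    flagged = {}
    for ts, host, path in hits:
        queue = recent[path]
        queue.append((ts, host))
        # Drop hits that have fallen out of the sliding window.
        while queue and ts - queue[0][0] > window:
            queue.popleft()
        distinct_hosts = {h for _, h in queue}
        if len(distinct_hosts) >= min_hosts and path not in flagged:
            flagged[path] = ts
    return flagged
```

Each flagged path is a candidate for "someone just tweeted this", and the timestamp tells you roughly when to look on search.twitter.com.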