| 12:53 pm on Dec 3, 2009 (gmt 0)|
Question is, does it obey robots.txt rules? Guess not, huh?
| 2:11 pm on Dec 3, 2009 (gmt 0)|
My apologies, I forgot my usual robots snippet.
READ ROBOTS.TXT? No
OBEYED ROBOTS.TXT? No
| 5:56 pm on Dec 3, 2009 (gmt 0)|
AFAIK ClamAV (the virus scanner for Linux) does not offer to check websites for viruses. Seems to be a name hijack. Sad... ClamAV is great.
| 10:30 pm on Dec 3, 2009 (gmt 0)|
Clamscan's command line includes a switch "mail-follow-urls" to "Download and scan URLs". I suspect that's it. As far as I know it's not enabled by default, although it may augment scans for phishing URLs.
It would take someone a bit clued-up to set this, as Clam is poor on user interfaces.
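For anyone who wants to try it, the invocation would look something like this. A sketch only: the exact spelling and availability of the option depend on the ClamAV version, and as noted above it is not on by default.

```shell
# Scan a mailbox; the mail-follow-urls switch (as described above)
# tells clamscan to also download and scan any URLs found in the mail.
clamscan --mail-follow-urls=yes /var/mail/username
```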
| 4:09 am on Dec 4, 2009 (gmt 0)|
Gary, you'd be spared a whole lot of bad, baaad hits -- from real and faked UAs -- if you whitelisted instead of blacklisted ;)
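For readers new to the idea: a whitelist flips the default from "allow everything, ban offenders" to "deny everything, allow known-good UAs". A minimal sketch in Apache 2.2 .htaccess terms (the UA tokens here are illustrative, not a recommended list -- a real whitelist would also need entries for ordinary browsers):

```apache
# Deny by default; allow only user agents matching known-good tokens.
SetEnvIfNoCase User-Agent "Googlebot" good_ua
SetEnvIfNoCase User-Agent "Mozilla"   good_ua
Order Deny,Allow
Deny from all
Allow from env=good_ua
```

Fake UAs can still impersonate whitelisted strings, of course, which is why some people pair this with reverse-DNS checks.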
| 2:54 pm on Dec 4, 2009 (gmt 0)|
Because of my browser project I need to see how UAs behave, so I usually let everything in and ban only when the behavior is egregious, like the Bing bots. One of the things I do is recommend which UAs are ban-worthy. This version of Clam is now banned, but if a new one comes calling it'll be able to crawl until it does something bad enough to warrant my intervention.