Characteristics:
UA: Light Sense Inc HTTP Control.
Attempted to fetch /contact/ and /contact (without trailing slash). Funny, it didn't try contact.html ;)
Didn't look at robots.txt, of course.
BTW, the product page on this site also lists a matching 'spider mailer' mass email program.
What's the best way to deal with these bots? I think it's almost impossible to keep up with all the UAs.
What about denying access (403) to the page when no referer string is sent? That's fine if you don't want the page spidered by legitimate SE bots anyway, but what about browsers that can turn off their referer string? Any other ideas?
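For anyone who wants to try the no-referer idea, here's a rough Python sketch of the check, not a drop-in solution. The names PROTECTED_PATHS and status_for_request are made up for illustration, and the path list is just my guess based on what this bot requested. It only decides which status code to send; wiring it into your server is up to you.

```python
# Hypothetical list of pages to protect; adjust to your own site.
PROTECTED_PATHS = {"/contact", "/contact/"}


def status_for_request(path, headers):
    """Return 403 for a protected path requested without a Referer, else 200."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    referer = normalised.get("referer", "").strip()

    # A missing or blank Referer on a protected page is treated as suspicious.
    if path in PROTECTED_PATHS and not referer:
        return 403
    return 200
```

Keep in mind the trade-off already raised: this turns away any visitor whose browser or proxy strips the referer, along with any bot that doesn't send one, so it's probably only worth it on pages you don't care to have indexed.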