They look like bandwidth sponges of the worst order to me. Having a spider directly index the mp3/multimedia files on your site is generally not beneficial to the site.
I'm new to spider identification, so please be gentle if I'm wrong here. I banned asterias in my robots.txt last week, and yesterday it came back. This time it also showed up as Java1.3.0: two requests as asterias and one as Java1.3.0. Is Java1.3.0 also a spider, and should I add it to my robots.txt? If the answer to both questions is yes, what are the chances of a friendly spider also showing up as Java1.3.0?
63.251.10.136 - - [28/Feb/2001:15:55:58 -0500] "GET /robots.txt HTTP/1.1" 200 1582 "-" "asterias/2.0"
63.251.10.136 - - [28/Feb/2001:15:55:58 -0500] "GET / HTTP/1.1" 200 6543 "-" "Java1.3.0"
63.251.10.136 - - [28/Feb/2001:15:56:04 -0500] "GET / HTTP/1.1" 200 6524 "-" "asterias/2.0"
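On the robots.txt part of the question, here is a minimal Python sketch of how a rule naming asterias is matched against the two user agents in the log lines above. The robots.txt content is an assumption (the poster's actual file isn't shown), and `is_allowed` is a hypothetical helper. It only shows how a standards-following parser reads the rules; it says nothing about whether a given spider actually honors them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the real file from the post is not shown.
ROBOTS_TXT = """\
User-agent: asterias
Disallow: /

User-agent: *
Disallow:
"""

def is_allowed(user_agent, path="/"):
    """Return True if the rules above permit user_agent to fetch path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)

if __name__ == "__main__":
    for agent in ("asterias/2.0", "Java1.3.0"):
        print(agent, "allowed" if is_allowed(agent) else "disallowed")
```

Under these assumed rules, the asterias entry does not cover a request that identifies itself as Java1.3.0, so that agent would need its own entry or a different kind of block.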
Notice that all three requests from this spider come from the same IP address. You can ban visitors by IP address as well as by User-Agent, so look into banning asterias that way instead. Here's some information about how to do that: [shat.net]
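If you want to spot this pattern in your own logs, a rough Python sketch along these lines groups user agents by IP address. It assumes the combined log format shown in the lines quoted above; `LINE_RE` and `agents_by_ip` are hypothetical names, and the regular expression is only meant to cover lines shaped like these three.

```python
import re
from collections import defaultdict

# The three log lines quoted in the question above.
LOG_LINES = [
    '63.251.10.136 - - [28/Feb/2001:15:55:58 -0500] "GET /robots.txt HTTP/1.1" 200 1582 "-" "asterias/2.0"',
    '63.251.10.136 - - [28/Feb/2001:15:55:58 -0500] "GET / HTTP/1.1" 200 6543 "-" "Java1.3.0"',
    '63.251.10.136 - - [28/Feb/2001:15:56:04 -0500] "GET / HTTP/1.1" 200 6524 "-" "asterias/2.0"',
]

# IP, identity, user, [timestamp], "request", status, size, "referer", "user agent"
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"$'
)

def agents_by_ip(lines):
    """Map each client IP to the set of user-agent strings it sent."""
    seen = defaultdict(set)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:  # silently skip lines in any other format
            seen[match.group("ip")].add(match.group("agent"))
    return dict(seen)

if __name__ == "__main__":
    for ip, agents in agents_by_ip(LOG_LINES).items():
        print(ip, sorted(agents))
```

An IP that sends more than one user-agent string within a few seconds, as here, is a reasonable candidate for an IP-based ban rather than a User-Agent one.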
singingfish.com also gives an email address for asking them to exclude you from any future visits:
"If you wish to stop Asterias® from crawling your site, simply click here to send an email to our Operations Team (webmaster@singingfish.com). Please include the name of the site you wish to exclude."