LOG:
svext.nec-labs.com - - [30/Jun/2006:19:24:22 -0700] "GET /robots.txt HTTP/1.1" 403 815 "-" "Java/1.5.0_04"
svext.nec-labs.com - - [30/Jun/2006:19:24:22 -0700] "GET /dir1/file1.html HTTP/1.1" 302 223 "-" "Ken"
(Java is always 403'd. I've yet to figure out how to block no-dot UAs.)
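For the no-dot UAs, one possible approach is a negated character class. This is only a minimal sketch, assuming mod_setenvif is loaded and borrowing the no_way variable name used further down. The pattern ^[^.]*$ matches any User-Agent with no period anywhere in it (a blank one included), so "Ken" would trip it while "Java/1.5.0_04" is left to its own rule. The Order/Allow/Deny lines are Apache 1.3/2.0/2.2-style access control and are an assumption about the setup, not something taken from the log above.

```apache
# Flag Java/x.y.z user-agents (these are already 403'd per the log above).
SetEnvIfNoCase User-Agent "^Java/" no_way

# Flag any User-Agent that contains no dot at all, e.g. "Ken".
# [^.]* means "zero or more characters that are not a period", so an
# empty or missing User-Agent is flagged as well.
SetEnvIf User-Agent "^[^.]*$" no_way

# Deny flagged requests. Assumed to live in a .htaccess file or a
# <Directory> section; on Apache 2.4 the mod_authz_core equivalent
# (Require not env no_way inside <RequireAll>) would be used instead.
Order Allow,Deny
Allow from all
Deny from env=no_way
```

Worth watching the logs for a while after adding something like this, since any legitimate client whose UA string happens to carry no version number would get caught too.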
NOTES:
From dnsstuff.com (excerpted):
IP address: 138.15.10.10
Reverse DNS: svext.nec-labs.com
Reverse DNS authenticity: [Verified]
NetRange: 138.15.0.0 - 138.15.255.255
CIDR: 138.15.0.0/16
NetName: NEC-LABORATORIES-AMERICA-INC
FYI:
NEC Laboratories America [nec-labs.com]
New bot Java/1.5.0_06 grabs all pages
[webmasterworld.com...]
Java/1.5.0_06 Spider Sighting, and Questions
[webmasterworld.com...]
But there's no Ken [google.com] bot info, and only one post for "nec-labs [google.com]", and it's totally unrelated. And re the other two, new GSA threads, the most 'current' GSA [google.com] posts are from March 2006, then 2003, and 2001. Yet you pulled up a thread from two days ago.
(Actually, I rarely find any of our current posts lately. It's been bugging me but not enough to file an official bug report. Yet. And just a few months ago, it felt like G cruised through here almost hourly! Probably outrageously bandwidth-costly but so convenient and extremely useful. Oh, well. I guess I'll just eyeball this (and each) forum's post titles back a bunch of pages.)
And the PlanetLabs info was definitely amusing. And interesting. And it reinforces my decision years ago to err on the 403 side of darn near everything from anything-dot-planet-dot-anywhere, including:
Mozilla.*PlanetWeb
.planet.com
.speed.planet.nl
.theplanet.com
.earth.theplanet.net
.reverse.theplanet.net
.ipplanet.net
.planetarabia.com
(Alas,
SetEnvIfNoCase Remote_Host "planet" no_way
is probably a bit extreme :)
Remember how the very worst of the Host bunch was/is .reverse.theplanet.com? Gack. If I so much as see 'planet' in my logs, I start to twitch.
(Note to Self: Add .planet-lab.org)
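A tighter alternative to the bare "planet" match might look like the following. It's a sketch only: each hostname from the list above is escaped and anchored to the end of Remote_Host, with the planet-lab.org entry from the note added. It assumes the same no_way variable and that hostname lookups are enabled, since as far as I know Remote_Host comes up empty when Apache hasn't resolved the client's name.

```apache
# User-agent pattern from the list above
SetEnvIfNoCase User-Agent "Mozilla.*PlanetWeb" no_way

# Hostname suffixes from the list above.
# \. is a literal dot and $ anchors the match to the end of the hostname.
SetEnvIfNoCase Remote_Host "\.planet\.com$" no_way
SetEnvIfNoCase Remote_Host "\.speed\.planet\.nl$" no_way
SetEnvIfNoCase Remote_Host "\.theplanet\.com$" no_way
SetEnvIfNoCase Remote_Host "\.earth\.theplanet\.net$" no_way
SetEnvIfNoCase Remote_Host "\.reverse\.theplanet\.net$" no_way
SetEnvIfNoCase Remote_Host "\.ipplanet\.net$" no_way
SetEnvIfNoCase Remote_Host "\.planetarabia\.com$" no_way

# From the note-to-self
SetEnvIfNoCase Remote_Host "\.planet-lab\.org$" no_way
```

Anchored like this, a host that merely has "planet" somewhere in the middle of its name is left alone, which is the trade-off against the catch-all version.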