ROBOTS.TXT? No
Is this an example of someone hosted by SL crawling one of my sites? Nothing else seems to make sense to me. If so, I'm going to start blocking anything from SL except my servers.
One thing you need to be aware of is that some fairly large corporations use SL as their ISP, so if you want "corporate lunchtime shoppers and visitors," you may have to add some smaller-range exclusions to your larger blocked ranges. In these cases, there generally *is* a useful rDNS lookup on the corporation's range.
Jim
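The "smaller-range exclusions inside larger blocked ranges" idea from Jim's post can be sketched roughly as below. This is only an illustration, not anyone's actual configuration: the ranges are placeholders taken from the RFC 5737 documentation blocks (no real SL ranges are given in this thread), and the names `BLOCKED`, `ALLOWED` and `is_blocked` are made up for the example.

```python
import ipaddress

# Placeholder ranges from the RFC 5737 documentation blocks.
# These are NOT real SL ranges; substitute your own.
BLOCKED = [ipaddress.ip_network("198.51.100.0/24")]

# Hypothetical smaller range carved out for a corporate customer
# whose rDNS identified it as legitimate.
ALLOWED = [ipaddress.ip_network("198.51.100.64/28")]


def is_blocked(ip: str) -> bool:
    """Return True if ip falls in a blocked range and no exclusion covers it."""
    addr = ipaddress.ip_address(ip)
    # Exclusions win over the broader block.
    if any(addr in net for net in ALLOWED):
        return False
    return any(addr in net for net in BLOCKED)
```

The exclusions are checked first so that a narrow allowed range always overrides the wide block that contains it.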
I seem to recall that SOME sub-net ranges of SL are registered to others.
Sub-net searches at ARIN are still possible; however, they are more complicated (and more restrictive) than in previous years. I've almost stopped attempting them, although I do have some saved from previous years (NONE for SL).
Anyway, the problem with these 'bandwidth providers' is that they don't seem to monitor the traffic except in gigabytes. They don't look at the HTTP level, which is why we have problems with requests coming from services like these.
Unfortunately, it'll probably get worse and worse as companies outsource their previously-in-house IT functions to service providers, and outsource their computing to 'the cloud.'
Frankly, I block big swaths myself, and then if I see click-throughs to my "More information on this (403) error" page, I investigate and unblock smaller chunks as needed. Some 'bots "click" on that link on my (terse) 403 error page, but they typically then don't fetch anything but the HTML of the "info" page. So it's fairly easy to take that, plus the source and nature of the original request that triggered the 403, plus the request headers, and make a call on bot/no-bot.
Anyway, it was just a heads-up that there might possibly be *some* legit traffic from SL, just like there is some legit traffic from Amazon Cloud -- a few mobile subscriber services.
Jim
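The bot/no-bot call Jim describes for visitors who reach the 403 "info" page might be roughed out like this. It is a guess at the shape of the decision, not his actual method: the `Visitor` fields, the `classify` function and the three labels are all invented for illustration, and a real check would also weigh the source range and the nature of the original request.

```python
from dataclasses import dataclass


@dataclass
class Visitor:
    got_403: bool = False              # original request was refused
    fetched_info_page: bool = False    # followed the link on the 403 page
    fetched_assets: bool = False       # also fetched CSS/images for that page
    has_browser_headers: bool = False  # e.g. sent a plausible Accept-Language


def classify(v: Visitor) -> str:
    """Rough triage of a blocked visitor; the labels are hypothetical."""
    if not (v.got_403 and v.fetched_info_page):
        return "unknown"
    # Fetching only the HTML of the info page is the typical bot pattern.
    if not v.fetched_assets:
        return "likely bot"
    return "likely human" if v.has_browser_headers else "unknown"
```

Anything that comes back "likely human" is a candidate for unblocking a smaller chunk of the range it came from.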