The hits were all to HTML pages and not followed up by hits to stylesheets or images. 301 responses were not followed. I believe this to be a bot of some kind, but I have no idea what it's collecting pages for...
[edited by: volatilegx at 1:37 am (utc) on Aug. 2, 2005]
[edit reason] removed specifics [/edit]
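For anyone who wants to check their own logs for the same pattern (pages fetched, but never the stylesheets or images that go with them), here is a rough sketch. It assumes Common/Combined Log Format access logs; the function name, the extension list, and the `min_hits` threshold are my own illustrative choices, not anything from the posts above.

```python
import re
from collections import defaultdict

# Matches the start of a Common/Combined Log Format line:
# client IP, two ignored fields, the timestamp, then the quoted request line.
LOG_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:GET|HEAD|POST) (\S+)[^"]*"')

# Assumption: these extensions mark supporting files that a normal browser
# would fetch after loading a page. Adjust for your own site.
ASSET_EXTENSIONS = (".css", ".js", ".png", ".gif", ".jpg", ".jpeg", ".ico")


def html_only_clients(log_lines, min_hits=3):
    """Return sorted IPs with at least min_hits page requests and no asset requests."""
    pages = defaultdict(int)
    assets = defaultdict(int)
    for line in log_lines:
        match = LOG_RE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        ip, path = match.groups()
        path = path.split("?", 1)[0].lower()  # drop any query string
        if path.endswith(ASSET_EXTENSIONS):
            assets[ip] += 1
        else:
            pages[ip] += 1
    return sorted(
        ip for ip, count in pages.items()
        if count >= min_hits and assets.get(ip, 0) == 0
    )
```

Usage would be something like `html_only_clients(open("access.log"))`. Keep in mind that text browsers, users with images turned off, and caching proxies can produce the same footprint, so a hit here is a reason to look closer, not proof of a bot.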
Not sure how long I've had the entire Class D denied.
[google.com...]
Using the search method that balam provided in Msg#10 of this thread:
[webmasterworld.com...]
I'm not sure how many customers they have, but it might be a considerable number. Then the impact would depend on whether the people using that filtering service fall into the demographic that your site addresses.
So, blocking them might be bad for your site, or it might not matter at all. The point is that every Webmaster should research in depth what blocking an IP range would actually do, and judge the result for their own site.
Jim
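One hedged way to start that research is to count how much of your logged traffic comes from the range before you deny it. The sketch below uses Python's standard `ipaddress` module; the function name and the idea of feeding it a list of client IPs pulled from your logs are assumptions for illustration only.

```python
import ipaddress


def share_from_network(client_ips, cidr):
    """Return (requests inside the range, total valid requests)."""
    network = ipaddress.ip_network(cidr)
    inside = 0
    total = 0
    for raw in client_ips:
        try:
            addr = ipaddress.ip_address(raw.strip())
        except ValueError:
            continue  # ignore fields that are not IP addresses (e.g. hostnames)
        total += 1
        if addr in network:
            inside += 1
    return inside, total
```

For example, `share_from_network(ips, "192.0.2.0/24")` with a documentation-only range as a placeholder. The count covers only the requests made from that range. It says nothing about how many of your visitors sit behind a service that operates from it, so treat it as a starting point for the research and not as the answer.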
In some cases, they proxy all user requests, check the content, and either pass it through to the user or block it. This is fine for small services with a limited number of users where the bandwidth won't be too high.
In other cases, like this one, they function asynchronously to the user requests. They track users' requests, investigate the URLs the user requests, and then either whitelist or blacklist the sites. So, if you block the user-agent, you run the risk of blacklisting your own site.
In most cases, these services are used by corporate clients. But there are a few ISPs who offer this 'filtering' as a service as well.
I agree that the changing user-agents look dodgy, but in fact this is necessary to prevent 'bad' sites from cloaking by user-agent -- supplying innocent-looking pages to the filter, while providing their real content to users.
So, I advise caution before 'banning away,' especially if you see a lot of these requests. Think about your site's demographics, and whether many of your visitors are likely to be behind corporate proxies and filters, or whether your site might attract people who would want to filter pages for their own or their children's use.
Jim