77.100.245.xx - - [22/Feb/2012:03:50:44 +0000] "GET /robots.txt HTTP/1.1" 200 6265 "-" "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)"
cpc18-nmal16-2-0-custxx.19-2.cable.virginmedia.com - - [22/Feb/2012:03:50:44 +0000] "GET / HTTP/1.1" 403 1343 "-" "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)"
77.100.245.xx - - [22/Feb/2012:03:51:48 +0000] "GET /robots.txt HTTP/1.1" 200 6265 "-" "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)"
cpc18-nmal16-2-0-custxx.19-2.cable.virginmedia.com - - [22/Feb/2012:03:51:49 +0000] "GET / HTTP/1.1" 403 1343 "-" "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)"
cpc18-nmal16-2-0-custxx.19-2.cable.virginmedia.com - - [22/Feb/2012:03:52:26 +0000] "GET /gallery/thumb_001.jpg HTTP/1.1" 403 1412 "-" "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)"
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} httrack [NC]
RewriteRule !^robots\.txt$ - [F]
I was personally assured in an email by the user of the IP address that they neither knew what HTTrack was nor had attempted to use it.
May not be a lie. Could be a proxy or a DHCP-assigned address.
The downloader had that IP one day. The next day it was used by honest Joe.
# Block Virgin Media IP range if the User-Agent is HTTrack
RewriteCond %{REMOTE_ADDR} ^77\.(9[6-9]|1(0[0-3]))\.
RewriteCond %{HTTP_USER_AGENT} httrack [NC]
RewriteRule .* - [F]
RewriteCond %{REMOTE_ADDR} ^77\.(9[6-9]|1(0[0-3]))\.
RewriteCond %{HTTP_USER_AGENT} httrack [NC]
RewriteRule . 403.php [L]
... ^77\.(9[6-9]|1(0[0-3]))\.
...
Change to:
RewriteCond %{REMOTE_ADDR} ^77\.(9[6-9]|10[0-3])\.
RewriteRule . 403.php [L]
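For anyone on Apache 2.4 or later, the same address range could arguably be written without a regular expression. The second-octet range 96 to 103 corresponds to the CIDR block 77.96.0.0/13, and mod_rewrite's `expr` condition can test it with the `-R` operator. This is only a sketch of an alternative, not something anyone in this thread is running; it assumes Apache 2.4+ and keeps the same 403.php target as above.

```apache
# Sketch only; assumes Apache 2.4+ (RewriteCond expr is not available in 2.2).
# 77.96.0.0/13 spans 77.96.0.0 through 77.103.255.255, the same range
# as ^77\.(9[6-9]|10[0-3])\.
RewriteEngine On
RewriteCond expr "-R '77.96.0.0/13'"
RewriteCond %{HTTP_USER_AGENT} httrack [NC]
RewriteRule . 403.php [L]
```

The CIDR form avoids the grouping question entirely, at the cost of not working on older servers.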
Would it be happy again if you said
^77\.(9([6-9])|1(0[0-3]))\.
Next "why" question: Why are you doing it this way instead of
RewriteRule . - [F]
I've always believed that a plain-Jane 403 served by the visitor's own machine is most appropriate.
In addition, many shared hosts will automatically insert a custom 403 with advertising for their services.
I've always believed that a plain-Jane 403 served by the visitor's own machine is most appropriate.
But the 403 isn't served by the visitor's browser.
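Since the 403 body always comes from the server, one way to get a plain page without relying on whatever the host supplies is to give Apache the text directly. A minimal sketch, assuming Apache 2.x and that the host permits `ErrorDocument` in .htaccess (AllowOverride FileInfo); the message text is a placeholder, and whether this overrides a host's advertising page depends on how the host injects it.

```apache
# Sketch only; assumes Apache 2.x and AllowOverride FileInfo.
# A quoted string makes Apache send this text as the 403 body,
# so no separate error file is requested.
ErrorDocument 403 "Access forbidden."

RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} httrack [NC]
RewriteRule !^robots\.txt$ - [F]
```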
Not Found
The requested URL {made-up name} was not found on this server.
Additionally, a 404 Not Found error was encountered while trying to use an ErrorDocument to handle the request.
Internal Server Error
The server encountered an internal error or misconfiguration and was unable to complete your request.
Please contact the server administrator, webmaster@example.com and inform them of the time the error occurred, and anything you might have done that may have caused the error.
More information about this error may be available in the server error log.
Additionally, a 500 Internal Server Error error was encountered while trying to use an ErrorDocument to handle the request.
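The "Additionally, a ... error was encountered while trying to use an ErrorDocument" line in the messages above is what Apache appends when the error document itself cannot be served. The 403 version of it can show up when a blanket [F] rule also catches the internal request for the custom 403 page. A hedged sketch of one way to avoid that, assuming a hypothetical error page at /403.html:

```apache
# Sketch only; /403.html is a hypothetical error page, not from this thread.
ErrorDocument 403 /403.html

RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} httrack [NC]
# Let robots.txt and the error page itself through, so the internal
# request for /403.html is not forbidden in turn.
RewriteCond %{REQUEST_URI} !^/(robots\.txt|403\.html)$
RewriteRule ^ - [F]
```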
Although most error messages can be overriden [sic], there are certain circumstances where the internal messages are used regardless of the setting of ErrorDocument. In particular, if a malformed request is detected, normal request processing will be immediately halted and the internal error message returned. This is necessary to guard against security problems caused by bad requests.
Where does the 403 come from on a site that offers no custom 403 and whose host doesn't present an advertising 403?
Perhaps I failed to specify that.