Thank you in advance.
# source: [webmasterworld.com...]
# NC = any case
RewriteEngine On
RewriteCond %{HTTP_REFERER} q=Guestbook [NC,OR]
RewriteCond %{HTTP_USER_AGENT} bdfetch [NC,OR]
RewriteCond %{HTTP_USER_AGENT} BlackWidow [OR]
RewriteCond %{HTTP_USER_AGENT} Bot\ mailto:craftbot@yahoo.com [OR]
RewriteCond %{HTTP_USER_AGENT} CherryPicker [OR]
RewriteCond %{HTTP_USER_AGENT} ChinaClaw [OR]
RewriteCond %{HTTP_USER_AGENT} Crescent [OR]
RewriteCond %{HTTP_USER_AGENT} christcrawler [OR]
RewriteCond %{HTTP_USER_AGENT} cyberalert [OR]
RewriteCond %{HTTP_USER_AGENT} Custo [OR]
RewriteCond %{HTTP_USER_AGENT} DISCo [OR]
RewriteCond %{HTTP_USER_AGENT} Download\ Demon [OR]
RewriteCond %{HTTP_USER_AGENT} eCatch [OR]
RewriteCond %{HTTP_USER_AGENT} EirGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} EmailCollector [OR]
RewriteCond %{HTTP_USER_AGENT} EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} Express\ WebPictures [OR]
RewriteCond %{HTTP_USER_AGENT} ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} EyeNetIE [OR]
RewriteCond %{HTTP_USER_AGENT} FlashGet [OR]
RewriteCond %{HTTP_USER_AGENT} GetRight [OR]
RewriteCond %{HTTP_USER_AGENT} GetWeb! [OR]
RewriteCond %{HTTP_USER_AGENT} Go!Zilla [OR]
RewriteCond %{HTTP_USER_AGENT} Go-Ahead-Got-It [OR]
RewriteCond %{HTTP_USER_AGENT} GornKer [OR]
RewriteCond %{HTTP_USER_AGENT} GrabNet [OR]
RewriteCond %{HTTP_USER_AGENT} Grafula [OR]
RewriteCond %{HTTP_USER_AGENT} grub [OR]
RewriteCond %{HTTP_USER_AGENT} HMView [OR]
RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Image\ Stripper [OR]
RewriteCond %{HTTP_USER_AGENT} Image\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR]
RewriteCond %{HTTP_USER_AGENT} InterGET [OR]
RewriteCond %{HTTP_USER_AGENT} Internet\ Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} InternetSeer.com [OR]
RewriteCond %{HTTP_USER_AGENT} Irvine [OR]
RewriteCond %{HTTP_USER_AGENT} JetCar [OR]
RewriteCond %{HTTP_USER_AGENT} JOC\ Web\ Spider [OR]
RewriteCond %{HTTP_USER_AGENT} larbin [NC,OR]
RewriteCond %{HTTP_USER_AGENT} LeechFTP [OR]
RewriteCond %{HTTP_USER_AGENT} Mass\ Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} MSFrontPage [OR]
RewriteCond %{HTTP_USER_AGENT} Microsoft.URL [OR]
RewriteCond %{HTTP_USER_AGENT} MIDown\ tool [OR]
RewriteCond %{HTTP_USER_AGENT} Mister\ PiX [OR]
RewriteCond %{HTTP_USER_AGENT} Mozilla.*NEWT [OR]
RewriteCond %{HTTP_USER_AGENT} Navroad [OR]
RewriteCond %{HTTP_USER_AGENT} NearSite [OR]
RewriteCond %{HTTP_USER_AGENT} NetAnts [OR]
RewriteCond %{HTTP_USER_AGENT} NetSpider [OR]
RewriteCond %{HTTP_USER_AGENT} Net\ Vampire [OR]
RewriteCond %{HTTP_USER_AGENT} NetZIP [OR]
RewriteCond %{HTTP_USER_AGENT} NICErsPRO [OR]
RewriteCond %{HTTP_USER_AGENT} Octopus [OR]
RewriteCond %{HTTP_USER_AGENT} Offline\ Explorer [OR]
RewriteCond %{HTTP_USER_AGENT} Offline\ Navigator [OR]
RewriteCond %{HTTP_USER_AGENT} PageGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} Papa\ Foto [OR]
RewriteCond %{HTTP_USER_AGENT} pavuk [OR]
RewriteCond %{HTTP_USER_AGENT} pcBrowser [OR]
#RewriteCond %{HTTP_USER_AGENT} puf [NC,OR]
#RewriteCond %{HTTP_USER_AGENT} RealDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ReGet [OR]
RewriteCond %{HTTP_USER_AGENT} SearchExpress [OR]
RewriteCond %{HTTP_USER_AGENT} SiteSnagger [OR]
RewriteCond %{HTTP_USER_AGENT} SmartDownload [OR]
RewriteCond %{HTTP_USER_AGENT} SuperBot [OR]
RewriteCond %{HTTP_USER_AGENT} SuperHTTP [OR]
RewriteCond %{HTTP_USER_AGENT} Surfbot [OR]
RewriteCond %{HTTP_USER_AGENT} Siphon [OR]
RewriteCond %{HTTP_USER_AGENT} tAkeOut [OR]
RewriteCond %{HTTP_USER_AGENT} Teleport\ Pro [OR]
RewriteCond %{HTTP_USER_AGENT} VoidEYE [OR]
RewriteCond %{HTTP_USER_AGENT} Web\ Image\ Collector [OR]
RewriteCond %{HTTP_USER_AGENT} Web\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} WebAuto [OR]
RewriteCond %{HTTP_USER_AGENT} WebBandit [OR]
RewriteCond %{HTTP_USER_AGENT} Webclipping [OR]
RewriteCond %{HTTP_USER_AGENT} WebCopier [OR]
RewriteCond %{HTTP_USER_AGENT} WebFetch [OR]
RewriteCond %{HTTP_USER_AGENT} WebGo\ IS [OR]
RewriteCond %{HTTP_USER_AGENT} Webinator [OR]
RewriteCond %{HTTP_USER_AGENT} WebLeacher [OR]
RewriteCond %{HTTP_USER_AGENT} WebReaper [OR]
RewriteCond %{HTTP_USER_AGENT} WebSauger [OR]
RewriteCond %{HTTP_USER_AGENT} Website\ eXtractor [OR]
RewriteCond %{HTTP_USER_AGENT} Website\ Quester [OR]
RewriteCond %{HTTP_USER_AGENT} WebStripper [OR]
RewriteCond %{HTTP_USER_AGENT} WebWhacker [OR]
RewriteCond %{HTTP_USER_AGENT} WebZIP [OR]
RewriteCond %{HTTP_USER_AGENT} Wget [OR]
RewriteCond %{HTTP_USER_AGENT} Widow [OR]
RewriteCond %{HTTP_USER_AGENT} WWWOFFLE [OR]
RewriteCond %{HTTP_USER_AGENT} Xaldon\ WebSpider [OR]
RewriteCond %{HTTP_USER_AGENT} Xenu\ Link\ Sleuth [OR]
RewriteCond %{HTTP_USER_AGENT} Zeus [OR]
RewriteCond %{HTTP_USER_AGENT} ZyBorg
RewriteRule ^.* - [F,L]
I don't see anything wrong with it, other than the fact that all of the user-agent patterns are unanchored. That makes the matching a lot less efficient, but it shouldn't block the visitors you mention.
Is this the only set of RewriteRules you have? I'm wondering whether your visitors can't see your pages at all, or whether they just can't see your images and graphic elements. What error message do they get (if any)?
If you have an image-blocking section, make sure that you allow blank referrers - this is a common cause of visitors not seeing your graphics when using firewalls or security software.
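For illustration only, here is a minimal sketch of a mod_rewrite image-blocking section that lets blank referrers through. The domain example.com is a placeholder and not taken from the original post, so substitute your own hostname:

```apache
RewriteEngine On
# Let the request through when no Referer header was sent
# (firewalls and security software often strip it)
RewriteCond %{HTTP_REFERER} !^$
# Let the request through when it was referred by the site itself
# (example.com is a placeholder hostname)
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/ [NC]
# Anything else asking for an image gets 403 Forbidden
RewriteRule \.(gif|jpe?g)$ - [F,NC]
```

The two conditions are ANDed, so the rule only fires when the referrer is both non-blank and not your own site.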
A much better description of what problem symptoms your visitors see will likely result in a much better answer.
Jim
> A much better description of what problem symptoms your visitors see will likely result in a much better answer.
What do you see in your raw log files when one of these visitors tries to access the site? What, if anything, appears in your error log file?
Do these visitors see anything, such as the HTML of the page but no images? Do you have any other RewriteRules in your .htaccess file?
Are you sure the problems are not being reported by someone who would like to download your site using one of the forbidden user-agents?
If you do not wish to answer these questions for some reason, then I suggest you comment out half of the RewriteConds in your list above, and test it. If the problem goes away, then that half of the list has an error in it. If the problem remains, then uncomment the first half, and comment out the second half. By successive "divide and conquer" iterations, you should be able to narrow down the problem.
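A rough sketch of the first "divide and conquer" pass, using lines from the list above with the middle of each half elided. One thing to watch, as far as I know: the last active RewriteCond must not carry an [OR] flag, otherwise the rule can end up matching every request, so drop the flag from whichever line ends the active half:

```apache
RewriteEngine On
# First half stays active
RewriteCond %{HTTP_USER_AGENT} bdfetch [NC,OR]
RewriteCond %{HTTP_USER_AGENT} BlackWidow [OR]
# (remaining first-half conditions go here, unchanged)
# Last active condition: [OR] removed for this test
RewriteCond %{HTTP_USER_AGENT} Mister\ PiX
# Second half commented out for this test
#RewriteCond %{HTTP_USER_AGENT} Mozilla.*NEWT [OR]
#RewriteCond %{HTTP_USER_AGENT} Navroad [OR]
# (remaining second-half conditions go here, each commented out)
#RewriteCond %{HTTP_USER_AGENT} ZyBorg
RewriteRule ^.* - [F,L]
```

If the affected visitors get in with this version, the offending pattern is in the commented-out half; swap the halves and repeat on the smaller list.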
I also suggest you review the original list you cited in your first post, and put the start anchors ("^" characters) back on the user-agent strings which had them in the original file. You may be matching a partial UA string that I can't spot.
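As a sketch of what anchoring looks like: the three agents below are picked from your list purely as examples, and I am not claiming these are the ones that were anchored in the original file, so check each against the source list:

```apache
RewriteEngine On
# "^" means the pattern must match at the very start of the User-Agent header.
# Without it, "Zeus" or "Widow" would match anywhere inside a longer string.
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
RewriteRule ^.* - [F,L]
```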
HTH,
Jim
The reason I suspect a firewall problem is that I had the same thing going on with preventing image hotlinking until I got some help modifying the anti-hotlinking .htaccess as below.
Note the section about firewall software and not being able to view the images:
##############################################
# image thievery
# Used instead of mod_rewrite because my host does not currently support it
SetEnvIfNoCase Referer "^http://www.mywebsite.com/|^http://mywebsite.com/" local_ref=1
# set the one below if firewall or ad prevention software is a concern. Otherwise they will not be able to see your images
SetEnvIfNoCase Referer "^$" local_ref=1
<FilesMatch "\.(gif|jpe?g|GIF|JPE?G)">
Order Allow,Deny
Allow from env=local_ref
</FilesMatch>
##############################################
The people that called about not being able to see any of the website are running McAfee firewall software that comes with many Dell computers now. I doubt if they represent evil bots but rather are just regular users.
As soon as I remove the portion of .htaccess I listed in the first post they can view the website once again.
You can comment out a line of code by adding a "#" symbol at the beginning of the line.
If you look in your logs, maybe you can find a logged visit from one of these blocked users, note the user-agent string used by McAfee, and figure out which line of your code is blocking it.
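The user-agent only shows up in the access log if the log format includes it. If you (or your host) control the main server configuration, the standard "combined" format records it; this is a sketch, the log path is an assumption, and these directives do not work in .htaccess:

```apache
# Server config or virtual host only, not .htaccess.
# The last quoted field is the User-Agent header sent with each request.
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
# logs/access_log is a placeholder path, relative to ServerRoot
CustomLog logs/access_log combined
```

A blocked visit should then appear as a line with status 403, with the user-agent string as the final quoted field.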
This is precisely how I would proceed. The code is syntactically correct, but one of the user-agent patterns is matching that used by your visitors. A possible cause is that the missing start anchors allow the pattern to match if it appears anywhere in the user-agent string received in the request.
Ref: Introduction to mod_rewrite [webmasterworld.com]
Jim