.htaccess ban list problems

firewalls

         

TheWebographer

5:50 pm on May 22, 2003 (gmt 0)

10+ Year Member



I have been using the following in .htaccess to help ban certain unwanted user-agents. However, some people running McAfee anti-virus and firewall software are not able to view the website when I use it. Any ideas as to why this would be so, and what can be done to solve it?

Thank you in advance.

# source: [webmasterworld.com...]
# NC = any case
RewriteEngine On
RewriteCond %{HTTP_REFERER} q=Guestbook [NC,OR]
RewriteCond %{HTTP_USER_AGENT} bdfetch [NC,OR]
RewriteCond %{HTTP_USER_AGENT} BlackWidow [OR]
RewriteCond %{HTTP_USER_AGENT} Bot\ mailto:craftbot@yahoo.com [OR]
RewriteCond %{HTTP_USER_AGENT} CherryPicker [OR]
RewriteCond %{HTTP_USER_AGENT} ChinaClaw [OR]
RewriteCond %{HTTP_USER_AGENT} Crescent [OR]
RewriteCond %{HTTP_USER_AGENT} christcrawler [OR]
RewriteCond %{HTTP_USER_AGENT} cyberalert [OR]
RewriteCond %{HTTP_USER_AGENT} Custo [OR]
RewriteCond %{HTTP_USER_AGENT} cyberalert [OR]
RewriteCond %{HTTP_USER_AGENT} DISCo [OR]
RewriteCond %{HTTP_USER_AGENT} Download\ Demon [OR]
RewriteCond %{HTTP_USER_AGENT} eCatch [OR]
RewriteCond %{HTTP_USER_AGENT} EirGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} EmailCollector [OR]
RewriteCond %{HTTP_USER_AGENT} EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} Express\ WebPictures [OR]
RewriteCond %{HTTP_USER_AGENT} ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} EyeNetIE [OR]
RewriteCond %{HTTP_USER_AGENT} FlashGet [OR]
RewriteCond %{HTTP_USER_AGENT} GetRight [OR]
RewriteCond %{HTTP_USER_AGENT} GetWeb! [OR]
RewriteCond %{HTTP_USER_AGENT} Go!Zilla [OR]
RewriteCond %{HTTP_USER_AGENT} Go-Ahead-Got-It [OR]
RewriteCond %{HTTP_USER_AGENT} GornKer [OR]
RewriteCond %{HTTP_USER_AGENT} GrabNet [OR]
RewriteCond %{HTTP_USER_AGENT} Grafula [OR]
RewriteCond %{HTTP_USER_AGENT} grub [OR]
RewriteCond %{HTTP_USER_AGENT} HMView [OR]
RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Image\ Stripper [OR]
RewriteCond %{HTTP_USER_AGENT} Image\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR]
RewriteCond %{HTTP_USER_AGENT} InterGET [OR]
RewriteCond %{HTTP_USER_AGENT} Internet\ Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} InternetSeer.com [OR]
RewriteCond %{HTTP_USER_AGENT} Irvine [OR]
RewriteCond %{HTTP_USER_AGENT} JetCar [OR]
RewriteCond %{HTTP_USER_AGENT} JOC\ Web\ Spider [OR]
RewriteCond %{HTTP_USER_AGENT} larbin [NC,OR]
RewriteCond %{HTTP_USER_AGENT} LeechFTP [OR]
RewriteCond %{HTTP_USER_AGENT} Mass\ Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} MSFrontPage [OR]
RewriteCond %{HTTP_USER_AGENT} Microsoft.URL [OR]
RewriteCond %{HTTP_USER_AGENT} MIDown\ tool [OR]
RewriteCond %{HTTP_USER_AGENT} Mister\ PiX [OR]
RewriteCond %{HTTP_USER_AGENT} Mozilla.*NEWT [OR]
RewriteCond %{HTTP_USER_AGENT} Navroad [OR]
RewriteCond %{HTTP_USER_AGENT} NearSite [OR]
RewriteCond %{HTTP_USER_AGENT} NetAnts [OR]
RewriteCond %{HTTP_USER_AGENT} NetSpider [OR]
RewriteCond %{HTTP_USER_AGENT} Net\ Vampire [OR]
RewriteCond %{HTTP_USER_AGENT} NetZIP [OR]
RewriteCond %{HTTP_USER_AGENT} NICErsPRO [OR]
RewriteCond %{HTTP_USER_AGENT} Octopus [OR]
RewriteCond %{HTTP_USER_AGENT} Offline\ Explorer [OR]
RewriteCond %{HTTP_USER_AGENT} Offline\ Navigator [OR]
RewriteCond %{HTTP_USER_AGENT} PageGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} Papa\ Foto [OR]
RewriteCond %{HTTP_USER_AGENT} pavuk [OR]
RewriteCond %{HTTP_USER_AGENT} pcBrowser [OR]
#RewriteCond %{HTTP_USER_AGENT} puf [NC,OR]
#RewriteCond %{HTTP_USER_AGENT} RealDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ReGet [OR]
RewriteCond %{HTTP_USER_AGENT} SearchExpress [OR]
RewriteCond %{HTTP_USER_AGENT} SiteSnagger [OR]
RewriteCond %{HTTP_USER_AGENT} SmartDownload [OR]
RewriteCond %{HTTP_USER_AGENT} SuperBot [OR]
RewriteCond %{HTTP_USER_AGENT} SuperHTTP [OR]
RewriteCond %{HTTP_USER_AGENT} Surfbot [OR]
RewriteCond %{HTTP_USER_AGENT} Siphon [OR]
RewriteCond %{HTTP_USER_AGENT} tAkeOut [OR]
RewriteCond %{HTTP_USER_AGENT} Teleport\ Pro [OR]
RewriteCond %{HTTP_USER_AGENT} VoidEYE [OR]
RewriteCond %{HTTP_USER_AGENT} Web\ Image\ Collector [OR]
RewriteCond %{HTTP_USER_AGENT} Web\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} WebAuto [OR]
RewriteCond %{HTTP_USER_AGENT} WebBandit [OR]
RewriteCond %{HTTP_USER_AGENT} Webclipping [OR]
RewriteCond %{HTTP_USER_AGENT} WebCopier [OR]
RewriteCond %{HTTP_USER_AGENT} WebFetch [OR]
RewriteCond %{HTTP_USER_AGENT} WebGo\ IS [OR]
RewriteCond %{HTTP_USER_AGENT} Webinator [OR]
RewriteCond %{HTTP_USER_AGENT} WebLeacher [OR]
RewriteCond %{HTTP_USER_AGENT} WebReaper [OR]
RewriteCond %{HTTP_USER_AGENT} WebSauger [OR]
RewriteCond %{HTTP_USER_AGENT} Website\ eXtractor [OR]
RewriteCond %{HTTP_USER_AGENT} Website\ Quester [OR]
RewriteCond %{HTTP_USER_AGENT} WebStripper [OR]
RewriteCond %{HTTP_USER_AGENT} WebWhacker [OR]
RewriteCond %{HTTP_USER_AGENT} WebZIP [OR]
RewriteCond %{HTTP_USER_AGENT} Wget [OR]
RewriteCond %{HTTP_USER_AGENT} Widow [OR]
RewriteCond %{HTTP_USER_AGENT} WWWOFFLE [OR]
RewriteCond %{HTTP_USER_AGENT} Xaldon\ WebSpider [OR]
RewriteCond %{HTTP_USER_AGENT} Xenu\ Link\ Sleuth [OR]
RewriteCond %{HTTP_USER_AGENT} Zeus [OR]
RewriteCond %{HTTP_USER_AGENT} ZyBorg

RewriteRule ^.* - [F,L]

TheWebographer

6:25 pm on May 22, 2003 (gmt 0)

10+ Year Member



Any ideas?

oilman

6:28 pm on May 22, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Just to clarify: if you take the .htaccess file down, those other folks can see the site, but if you put it back, it blocks them?

jdMorgan

6:43 pm on May 22, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



TheWebographer,

I don't see anything wrong with it, other than the fact that all of the user-agent patterns are unanchored. That makes the matching a lot less efficient, but it shouldn't block the visitors you mention.

Is this the only set of RewriteRules you have? I'm wondering whether your visitors can't see your page at all, or just can't see your images and graphic elements. What error message do they get (if any)?

If you have an image-blocking section, make sure that you allow blank referrers - this is a common cause of visitors not seeing your graphics when using firewalls or security software.
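
For reference, a mod_rewrite image-blocking section that allows blank referrers might look roughly like the sketch below. The domain example.com and the list of image extensions are placeholders, not details taken from this thread.

```apache
RewriteEngine On
# Only consider blocking when a Referer header is actually present;
# firewalls and privacy software often send none at all
RewriteCond %{HTTP_REFERER} !^$
# ...and when that referrer is not the site itself (example.com is a placeholder)
RewriteCond %{HTTP_REFERER} !^http://(www\.)?example\.com/ [NC]
# If both conditions hold, answer image requests with 403 Forbidden
RewriteRule \.(gif|jpe?g)$ - [F,NC]
```

Without the first condition, a visitor whose security software removes the Referer header would fail the second test and get no images.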

A much better description of what problem symptoms your visitors see will likely result in a much better answer.

Jim

TheWebographer

7:08 pm on May 22, 2003 (gmt 0)

10+ Year Member



Yes. If I remove this portion from the .htaccess file, then they are able to view the website. If I put it back in, they cannot.

I believe it is somehow caused by the site visitors' firewalls. But of course that may not be the case.

TheWebographer

8:22 pm on May 22, 2003 (gmt 0)

10+ Year Member



Any more ideas?

jdMorgan

8:33 pm on May 22, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



TheWebographer,

> A much better description of what problem symptoms your visitors see will likely result in a much better answer.

What do you see in your raw log files when one of these visitors tries to access the site? What, if anything, appears in your error log file?

Do these visitors see anything, such as the HTML of the page but no images? Do you have any other RewriteRules in your .htaccess file?

Are you sure the problems are not being reported by someone who would like to download your site using one of the forbidden user-agents?

If you do not wish to answer these questions for some reason, then I suggest you comment out half of the RewriteConds in your list above, and test it. If the problem goes away, then that half of the list has an error in it. If the problem remains, then uncomment the first half, and comment out the second half. By successive "divide and conquer" iterations, you should be able to narrow down the problem.
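
As a rough sketch of one such test pass, here is a shortened six-entry version of the list with its second half commented out. The entries come from the list in the first post, but the split point is arbitrary and chosen only for illustration.

```apache
RewriteEngine On
# First half: active for this test
RewriteCond %{HTTP_USER_AGENT} BlackWidow [OR]
RewriteCond %{HTTP_USER_AGENT} GetRight [OR]
# The last ACTIVE condition must not carry [OR]
RewriteCond %{HTTP_USER_AGENT} HTTrack [NC]
# Second half: disabled for this test
#RewriteCond %{HTTP_USER_AGENT} Wget [OR]
#RewriteCond %{HTTP_USER_AGENT} Zeus [OR]
#RewriteCond %{HTTP_USER_AGENT} ZyBorg
RewriteRule ^.* - [F,L]
```

The detail to watch is the [OR] flag: whichever condition ends up last among the active ones should have it removed for the duration of the test, since a dangling [OR] changes how the list is evaluated and can make the rule apply to requests it should not.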

I also suggest you review the original list you cited in your first post, and put the start anchors ("^" characters) back on the user-agent strings which had them in the original file. You may be matching a partial UA string that I can't spot.
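
To illustrate the difference, here is a minimal sketch with start anchors on three entries from the list. Which entries actually carried anchors in the cited file is not shown in this thread, so these three are picked purely as examples.

```apache
RewriteEngine On
# "^" ties the pattern to the start of the User-Agent header, so
# "^Zeus" matches a header beginning with "Zeus" but not one that
# merely contains "Zeus" somewhere in the middle
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
RewriteRule .* - [F]
```

Short unanchored patterns such as grub, Custo, Widow, or Zeus are the ones most likely to hit an unrelated user-agent by accident, because they match wherever those letters occur in the header.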

HTH,
Jim

TheWebographer

12:57 pm on May 23, 2003 (gmt 0)

10+ Year Member



Thank you very much for your help.

The reason I suspect a firewall problem is that I had the same thing going on with preventing image hotlinking until I got some help modifying the anti-hotlinking .htaccess as below.

Note the section about firewall software and not being able to view the images:

##############################################
# image thievery
# Used instead of mod_rewrite because my host does not currently support it

SetEnvIfNoCase Referer "^http://www.mywebsite.com/|^http://mywebsite.com/" local_ref=1
# set the one below if firewall or ad prevention software is a concern. Otherwise they will not be able to see your images
SetEnvIfNoCase Referer "^$" local_ref=1
<FilesMatch "\.(gif|jpe?g|GIF|JPE?G)">
Order Allow,Deny
Allow from env=local_ref
</FilesMatch>
##############################################

The people who called about not being able to see any of the website are running the McAfee firewall software that comes with many Dell computers now. I doubt they are evil bots; they are just regular users.

As soon as I remove the portion of .htaccess I listed in the first post they can view the website once again.

jdMorgan

3:03 pm on May 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Again, I suggest you review the original thread you cited, and add the start anchors back in where missing. If that does not work, then divide your code up by commenting out one-half of it at a time, and see which half is causing the trouble. Continue to divide it up, identifying the successively-smaller chunk which doesn't work.

You can comment out a line of code by adding a "#" symbol at the beginning of the line.

If you look in your logs, maybe you can find a logged visit from one of these blocked users, note the user-agent string used by McAfee, and figure out which line of your code is blocking it.

This is precisely how I would proceed. The code is syntactically correct, but one of the user-agent patterns is matching that used by your visitors. A possible cause is that the missing start anchors allow the pattern to match if it appears anywhere in the user-agent string received in the request.

Ref: Introduction to mod_rewrite [webmasterworld.com]

Jim

Maleville

3:40 pm on May 25, 2003 (gmt 0)

10+ Year Member



Hi, TheWebographer

In:
RewriteCond %{HTTP_REFERER} q=Guestbook [NC,OR]
What is the meaning of q=?

Who is banned?

jdMorgan

6:54 pm on May 25, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Maleville,

I'd guess it's an attempt to stop visitors coming from search engines after searching for "guestbook." The use of that search phrase - rather than one related to the site's contents - is a fairly good indication that the visitor is simply looking for guestbooks to exploit.
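
If that guess is right, a slightly tighter form of the same test could be sketched as follows. The [?&] prefix is an addition of mine, not part of the original list; it assumes the search engine passes the search phrase in a query parameter named q.

```apache
RewriteEngine On
# Matches referrers such as http://www.google.com/search?q=guestbook,
# where "q" is its own query parameter and not the tail of a longer
# name such as "faq=guestbook"
RewriteCond %{HTTP_REFERER} [?&]q=guestbook [NC]
RewriteRule .* - [F]
```

Like the original, this still matches any search phrase that merely begins with "guestbook".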

Jim

Maleville

12:10 am on May 26, 2003 (gmt 0)

10+ Year Member



Thank you jdMorgan.
I think that an .htaccess file is more than a script: it's a philosophy.