
Apache Web Server Forum

    
.htaccess deactivates CGI
.htaccess file causes CGI scripts not to work
Scooter24




msg:1503983
 10:39 am on Aug 2, 2002 (gmt 0)

More specifically when I call a cgi script in /cgi-bin I get a 403 error (permission denied). The cgi directory and the cgi scripts are set to 755. If I remove .htaccess cgi executes without problems.

This is my .htaccess file:
---------

<Files .htaccess>
order allow,deny
deny from all
</Files>

order allow,deny
deny from xx.xx.xx.xx
...
...
(several IP numbers and IP number blocks denied, of course not my provider; this rule works nicely)
...
...
allow from all

SetEnvIfNoCase User-Agent ".*Indy Library.*" getout
SetEnvIfNoCase User-Agent ^.*Demon getout
SetEnvIfNoCase User-Agent ^About getout
SetEnvIfNoCase User-Agent ^Active getout
SetEnvIfNoCase User-Agent ^AnswerChase getout
SetEnvIfNoCase User-Agent ^Ants getout
...
...
(several more agents denied here; this works nicely)
...
...

<Limit GET POST>
Order Allow,Deny
Allow from all
Deny from env=getout
</Limit>

Options -Indexes

RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?mydomain.com/.*$ [NC]
RewriteRule \.(jpg|JPG)$ [mydomain.com...] [R,L]

-----------

The last rewrite rule prevents image leeching. If I remove it I get a 500 Internal Server Error when calling a CGI script.

The .htaccess as it is now denies access to several internet providers (based on the IP number), to several agents and bad bots, doesn't show directory indexes and prevents image leeching. But it disables CGI scripts, although I am able to call a CGI script with SSI (a .shtml file containing the line <!--#exec cmd="./cgi-bin/mycgi.cgi" --> ).

What can I do ?

 

idiotgirl




msg:1503984
 11:08 am on Aug 2, 2002 (gmt 0)

Scooter-

I know that'll happen if mod_rewrite is not installed on your server. You can do a lot of SetEnvIfs without mod_rewrite installed. But when you go to RewriteConds without mod_rewrite installed - the thing blows up in your face and gives a 500 error instantly.
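If a missing mod_rewrite is the suspect, one way to test it is to wrap the rewrite lines in an <IfModule> block, so Apache skips them instead of returning a 500 when the module is absent. This is only a sketch: the original redirect target was elided in the post, so the rule here swaps in a plain 403 response ([F]) rather than guessing the URL.

```apache
# Only processed when mod_rewrite is actually loaded.
<IfModule mod_rewrite.c>
RewriteEngine on
# Let requests with an empty referer through.
RewriteCond %{HTTP_REFERER} !^$
# Let requests referred from the site itself through.
RewriteCond %{HTTP_REFERER} !^http://(www\.)?mydomain\.com/ [NC]
# Anything else asking for a .jpg gets 403 Forbidden (substitute for the
# elided redirect target in the original rule).
RewriteRule \.jpg$ - [NC,F]
</IfModule>
```

The trade-off is that the leech protection silently disappears when the module is missing, so this is better for diagnosis than as a permanent setup.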

If mod_rewrite IS installed... well... I can't offer any suggestions as to a problem unless there's a part of your .htaccess (you left out all the IP's, etc.) with a glitch you aren't noticing. The longer those things get, the harder little mistakes are to see. I'm not a server guru so the possibilities for other reasons could be limitless.
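One thing that may be worth ruling out when mod_rewrite is installed: the mod_rewrite documentation says per-directory rewriting needs Options FollowSymLinks (or SymLinksIfOwnerMatch) to be in effect, and requests are refused when it is not. Since "RewriteEngine on" is inherited by subdirectories, a cgi-bin whose Options do not include FollowSymLinks could plausibly answer 403 for that reason. The sketch below is an assumption about this particular server, not a diagnosis; it only works if the host's AllowOverride setting permits Options, and the file location is hypothetical.

```apache
# Hypothetical file: cgi-bin/.htaccess
# Adds FollowSymLinks to whatever Options the server already set for this
# directory, which per-directory mod_rewrite requires.
Options +FollowSymLinks
```

The server error log normally records the reason for a 403 or 500, so checking it first would confirm or rule this out.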

Also, your order of:

order allow,deny
deny from xx.xx.xx.xx
...
...
(several IP numbers and IP number blocks denied, of course not my provider; this rule works nicely)
...
...
allow from all

might be reversed. You list all your deny's - but you have allow as first in the order. Prob'ly still works, but it's just one thing that looked out of place at first glance.
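For reference, the Order directive decides how Allow and Deny are combined, not the order in which the lines appear in the file. With "Order Allow,Deny", the Allow lines are evaluated first, then the Deny lines; a visitor matching both is denied, and a visitor matching neither is denied as well. A minimal sketch of the blacklist pattern, using one address prefix from the thread:

```apache
# Blacklist pattern: everyone is allowed unless a Deny line also matches.
Order Allow,Deny
Allow from all
# Partial address: matches every client whose IP starts with 62.7.
Deny from 62.7
# Also deny any request that an earlier SetEnvIf tagged with "getout".
Deny from env=getout
```

So listing the Deny lines before or after "Allow from all" should make no difference under this Order.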

<added>And welcome to WebmasterWorld!</added>

idiotgirl




msg:1503985
 11:22 am on Aug 2, 2002 (gmt 0)

<!--#exec cmd="./cgi-bin/mycgi.cgi" -->

Where is your cgi-bin? Usually mine are all:

<!--#exec cmd="/cgi-bin/mycgi.cgi" -->

(no . before my first forward slash)

If it isn't already there - you might try adding:

SetEnvIf Referer ^$ local
SetEnvIfNoCase Referer ^(http://www\.|http://)yourdomain\.com local

<Files ~ "\.(cgi|pl)$">
order allow,deny
allow from env=local
deny from env=getout
</Files>

And see if that makes any difference whatsoever - just to test it.

Scooter24




msg:1503986
 3:36 pm on Aug 3, 2002 (gmt 0)

Well, here is the complete .htaccess. I have removed the rewrite part at the end and get an internal server error when trying to execute a CGI script (www.mydomain.com/cgi-bin/mycgi.cgi).
What can I do?

<Files .htaccess>
order allow,deny
deny from all
</Files>

SetEnvIfNoCase User-Agent ".*Indy Library.*" getout
SetEnvIfNoCase User-Agent ^.*Demon getout
SetEnvIfNoCase User-Agent ^About getout
SetEnvIfNoCase User-Agent ^Active getout
SetEnvIfNoCase User-Agent ^AnswerChase getout
SetEnvIfNoCase User-Agent ^Ants getout
SetEnvIfNoCase User-Agent ^Atom getout
SetEnvIfNoCase User-Agent ^attach getout
SetEnvIfNoCase User-Agent ^back getout
SetEnvIfNoCase User-Agent ^BatchFTP getout
SetEnvIfNoCase User-Agent ^bloodhound getout
SetEnvIfNoCase User-Agent ^brain getout
SetEnvIfNoCase User-Agent ^Buddy getout
SetEnvIfNoCase User-Agent ^Cartographer getout
SetEnvIfNoCase User-Agent ^CherryPicker getout
SetEnvIfNoCase User-Agent ^ChinaClaw getout
SetEnvIfNoCase User-Agent ^clickgarden getout
SetEnvIfNoCase User-Agent ^cosmos getout
SetEnvIfNoCase User-Agent ^Crawl_Application getout
SetEnvIfNoCase User-Agent ^Crawler getout
SetEnvIfNoCase User-Agent ^Crescent getout
SetEnvIfNoCase User-Agent ^CSHttpClient getout
SetEnvIfNoCase User-Agent ^curl getout
SetEnvIfNoCase User-Agent ^Custo getout
SetEnvIf User-Agent ^DA getout
SetEnvIfNoCase User-Agent ^DaviesBot getout
SetEnvIfNoCase User-Agent ^DISCo getout
SetEnvIfNoCase User-Agent ^DLExpert getout
SetEnvIfNoCase User-Agent ^dnloadmage getout
SetEnvIfNoCase User-Agent ^Drip getout
SetEnvIfNoCase User-Agent ^eCatch getout
SetEnvIfNoCase User-Agent ^Email getout
SetEnvIfNoCase User-Agent "^Express WebPictures" getout
SetEnvIfNoCase User-Agent ^Extractor getout
SetEnvIfNoCase User-Agent ^EyeNetIE getout
SetEnvIfNoCase User-Agent ^FileHound getout
SetEnvIfNoCase User-Agent ^FlashGet getout
SetEnvIfNoCase User-Agent ^flashsite getout
SetEnvIfNoCase User-Agent ^flunky getout
SetEnvIfNoCase User-Agent Frontpage getout
SetEnvIfNoCase User-Agent ^gazz getout
SetEnvIfNoCase User-Agent ^Genie getout
SetEnvIfNoCase User-Agent ^Get getout
SetEnvIfNoCase User-Agent ^Go!Zilla getout
SetEnvIfNoCase User-Agent ^Go-Ahead-Got-It getout
SetEnvIfNoCase User-Agent ^gotit getout
SetEnvIfNoCase User-Agent ^Grafula getout
SetEnvIfNoCase User-Agent ^gues getout
SetEnvIfNoCase User-Agent ^HMVie getout
SetEnvIfNoCase User-Agent ^htdig getout
SetEnvIfNoCase User-Agent ^ia_archiver getout
SetEnvIfNoCase User-Agent ^IBrowse getout
SetEnvIfNoCase User-Agent ^IncyWincy getout
SetEnvIfNoCase User-Agent ^ineta getout
SetEnvIfNoCase User-Agent ^infoGIST getout
SetEnvIfNoCase User-Agent ^InterGET getout
SetEnvIfNoCase User-Agent "^Internet Ninja" getout
SetEnvIfNoCase User-Agent ^IP?Works getout
SetEnvIfNoCase User-Agent ^Iria getout
SetEnvIfNoCase User-Agent ^iseeker getout
SetEnvIfNoCase User-Agent ^Jack getout
SetEnvIfNoCase User-Agent ^Java getout
SetEnvIfNoCase User-Agent ^JetCar getout
SetEnvIfNoCase User-Agent ^JoBo getout
SetEnvIfNoCase User-Agent ^JOC getout
SetEnvIfNoCase User-Agent ^JustView getout
SetEnvIfNoCase User-Agent ^larbin getout
SetEnvIfNoCase User-Agent ^leech getout
SetEnvIfNoCase User-Agent ^LexiBot getout
SetEnvIfNoCase User-Agent ^lftp getout
SetEnvIfNoCase User-Agent ^libW getout
SetEnvIfNoCase User-Agent ^Lifeboat getout
SetEnvIfNoCase User-Agent ^likse getout
SetEnvIfNoCase User-Agent ^Linkbot getout
SetEnvIfNoCase User-Agent "^links sql" getout
SetEnvIfNoCase User-Agent ^LncSoft* getout
SetEnvIfNoCase User-Agent ^Lockstep getout
SetEnvIfNoCase User-Agent ^lwp getout
SetEnvIfNoCase User-Agent ^Magnet getout
SetEnvIfNoCase User-Agent ^MARS getout
SetEnvIfNoCase User-Agent ^Marvin getout
SetEnvIfNoCase User-Agent ^Mass getout
SetEnvIfNoCase User-Agent ^Mata.*Hari.* getout
SetEnvIfNoCase User-Agent ^Memo getout
SetEnvIfNoCase User-Agent "^Microsoft URL" getout
SetEnvIfNoCase User-Agent ^MIDown getout
SetEnvIfNoCase User-Agent ^MIIxpc getout
SetEnvIfNoCase User-Agent ^MindSpider getout
SetEnvIfNoCase User-Agent ^Mirror getout
SetEnvIfNoCase User-Agent ^Mister getout
SetEnvIfNoCase User-Agent ^MOT-CF getout
SetEnvIfNoCase User-Agent ^Mozzila/4* getout
SetEnvIfNoCase User-Agent ^ms-catapult getout
SetEnvIfNoCase User-Agent ^msproxy getout
SetEnvIfNoCase User-Agent ^nabot getout
SetEnvIfNoCase User-Agent ^Navman getout
SetEnvIfNoCase User-Agent ^navroad getout
SetEnvIfNoCase User-Agent ^NearSite getout
SetEnvIfNoCase User-Agent ^Net getout
SetEnvIfNoCase User-Agent ^NICErsPRO getout
SetEnvIfNoCase User-Agent ^Nitro getout
SetEnvIfNoCase User-Agent ^oBot getout
SetEnvIfNoCase User-Agent ^Octopus getout
SetEnvIfNoCase User-Agent ^Papa getout
SetEnvIfNoCase User-Agent ^pc getout
SetEnvIfNoCase User-Agent ^PingALink getout
SetEnvIfNoCase User-Agent ^Pockey getout
SetEnvIfNoCase User-Agent ^polybo getout
SetEnvIfNoCase User-Agent ^psbot getout
SetEnvIfNoCase User-Agent ^Pump getout
SetEnvIfNoCase User-Agent ^Recorder getout
SetEnvIfNoCase User-Agent ^ReGet getout
SetEnvIfNoCase User-Agent ^RepoMonke getout
SetEnvIfNoCase User-Agent ^RMA getout
SetEnvIfNoCase User-Agent ^RPT-HTTPClient getout
SetEnvIfNoCase User-Agent ^Siphon getout
SetEnvIfNoCase User-Agent ^site getout
SetEnvIfNoCase User-Agent ^SlySearch getout
SetEnvIfNoCase User-Agent ^Smart getout
SetEnvIfNoCase User-Agent ^Snagger getout
SetEnvIfNoCase User-Agent ^Snake getout
SetEnvIfNoCase User-Agent ^SpaceBison getout
SetEnvIfNoCase User-Agent ^Sqworm getout
SetEnvIfNoCase User-Agent ^SuperBot getout
SetEnvIfNoCase User-Agent ^SuperHTTP getout
SetEnvIfNoCase User-Agent ^Surfairy getout
SetEnvIfNoCase User-Agent ^Surfbot getout
SetEnvIfNoCase User-Agent ^suzuran getout
SetEnvIfNoCase User-Agent ^Szukacz getout
SetEnvIfNoCase User-Agent ^tAkeOut getout
SetEnvIfNoCase User-Agent ^Tateji getout
SetEnvIfNoCase User-Agent ^Tcl getout
SetEnvIfNoCase User-Agent ^Telesoft getout
SetEnvIfNoCase User-Agent ^templeton getout
SetEnvIfNoCase User-Agent ^test getout
SetEnvIfNoCase User-Agent ^utopy getout
SetEnvIfNoCase User-Agent ^Vacuum getout
SetEnvIfNoCase User-Agent ^VoidEYE getout
SetEnvIfNoCase User-Agent ^Web getout
SetEnvIfNoCase User-Agent ^Wget getout
SetEnvIfNoCase User-Agent ^Whacker getout
SetEnvIfNoCase User-Agent ^WPF getout
SetEnvIfNoCase User-Agent ^wwwhoosh getout
SetEnvIfNoCase User-Agent ^Xaldon getout
SetEnvIfNoCase User-Agent ^xget getout
SetEnvIfNoCase User-Agent ^ZBot getout
SetEnvIfNoCase User-Agent ^Zeus getout
SetEnvIfNoCase User-Agent Bandit getout
SetEnvIfNoCase User-Agent Collector getout
SetEnvIfNoCase User-Agent Copier getout
SetEnvIfNoCase User-Agent Download getout
SetEnvIfNoCase User-Agent GetRight getout
SetEnvIfNoCase User-Agent grab getout
SetEnvIfNoCase User-Agent htmlgobble getout
SetEnvIfNoCase User-Agent HTTrack getout
SetEnvIf User-Agent iCab getout
SetEnvIfNoCase User-Agent MSIECrawler getout
SetEnvIfNoCase User-Agent naviscope getout
SetEnvIfNoCase User-Agent Ninja getout
SetEnvIfNoCase User-Agent Offline getout
SetEnvIfNoCase User-Agent peakjet getout
SetEnvIfNoCase User-Agent prozilla getout
SetEnvIfNoCase User-Agent rapidcache getout
SetEnvIfNoCase User-Agent realdownload getout
SetEnvIfNoCase User-Agent Reaper getout
SetEnvIfNoCase User-Agent robofox getout
SetEnvIfNoCase User-Agent saver getout
SetEnvIfNoCase User-Agent silentsurf getout
SetEnvIfNoCase User-Agent ^spiderbot getout
SetEnvIfNoCase User-Agent ^stamina getout
SetEnvIfNoCase User-Agent Stripper getout
SetEnvIfNoCase User-Agent Sucker getout
SetEnvIfNoCase User-Agent tarspider getout
SetEnvIfNoCase User-Agent Teleport getout
SetEnvIfNoCase User-Agent thumbnavigator getout
SetEnvIfNoCase User-Agent tivraspider getout
SetEnvIfNoCase User-Agent transsoft getout
SetEnvIfNoCase User-Agent udmsearch getout
SetEnvIfNoCase User-Agent utilmind getout
SetEnvIfNoCase User-Agent w3mir getout
SetEnvIfNoCase User-Agent weazel getout
SetEnvIfNoCase User-Agent Widow getout
SetEnvIfNoCase User-Agent www4mail getout
SetEnvIfNoCase User-Agent WWWOFFLE getout
SetEnvIfNoCase User-Agent voilabot getout

order allow,deny
allow from all
deny from 62.7
deny from 213.120.138
deny from 213.1
deny from 211.95.224
deny from 211.95.225
deny from 211.95.226
deny from 211.95.227
deny from 211.95.228
deny from 211.95.229
deny from 211.95.230
deny from 211.95.231
deny from 211.95.232
deny from 211.95.233
deny from 211.95.234
deny from 211.95.235
deny from 211.95.236
deny from 211.95.237
deny from 211.95.238
deny from 211.95.239
deny from 211.95.240
deny from 211.95.241
deny from 211.95.242
deny from 211.95.243
deny from 211.95.244
deny from 211.95.245
deny from 211.95.246
deny from 211.95.247
deny from 211.95.248
deny from 211.95.249
deny from 211.95.250
deny from 211.95.251
deny from 211.95.252
deny from 211.95.253
deny from 211.95.254
deny from 211.95.255
deny from 200.157.150.2
deny from 216.139.168
deny from 216.139.169
deny from 216.139.170
deny from 200.171.140.171
deny from 195.191
deny from 193.253.40
deny from 195.167.11
deny from 217.128.79
deny from 63.148.99
deny from env=getout

Options -Indexes

idiotgirl




msg:1503987
 8:37 pm on Aug 3, 2002 (gmt 0)

Scooter-

Do you have FrontPage extensions installed, by any chance?

Also, some of your similar IP addresses can be compressed and written as:

SetEnvIf Remote_Addr ^211\.95\.22(0|[2-9]) getout

(escape the periods with a backslash)
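Note that the pattern above illustrates the technique rather than reproducing the posted list: 22(0|[2-9]) matches third octets 220 and 222 through 229, while the deny list covers 211.95.224 through 211.95.255. A sketch of two ways to cover exactly that range, assuming the server's access module accepts CIDR notation (the stock Apache one does):

```apache
# Option 1: regular expression on the client address.
# 22[4-9] -> 224-229, 2[34][0-9] -> 230-249, 25[0-5] -> 250-255
SetEnvIf Remote_Addr ^211\.95\.(22[4-9]|2[34][0-9]|25[0-5])\. getout

# Option 2: a single CIDR block covering 211.95.224.0 through 211.95.255.255,
# placed alongside the other Deny lines.
Deny from 211.95.224.0/19
```

Either form replaces the 32 separate "deny from 211.95.x" lines; using both at once is harmless but redundant.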

For things like:

SetEnvIfNoCase User-Agent ".*Indy Library.*" getout

That can just be:

SetEnvIfNoCase User-Agent "indy library" getout

Which will look for a group of those two words, together, case not sensitive, anywhere in the UA. No asterisks or periods required.
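The same applies to a few other patterns in the posted list, which use * and ? as if they were shell wildcards. In a regular expression they modify the preceding character, so the rules below are rough equivalents or likely intentions. The reading of the IP?Works line is an assumption about which agent was meant.

```apache
# "^LncSoft*": the * applies only to the final "t", so this equals ^LncSof
SetEnvIfNoCase User-Agent ^LncSof getout

# "^.*Demon": a leading ^.* adds nothing, since an unanchored pattern
# already matches anywhere in the string
SetEnvIfNoCase User-Agent Demon getout

# "^IP?Works": the ? makes the "P" optional (matches IPWorks or IWorks).
# If one arbitrary character between "IP" and "Works" was intended
# (assumption), a dot expresses that:
SetEnvIfNoCase User-Agent ^IP.Works getout

# Several prefix rules can also be merged with alternation:
SetEnvIfNoCase User-Agent "^(Wget|curl|lwp)" getout
```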

As you've written them, your deny-from lines don't mark those addresses as getouts, and some servers require you to put all your deny-from IP's on a single line when you list them that way. (Maybe your actual .htaccess file shows them in a different position relative to your block of getouts?) Written as SetEnvIf Remote_Addr lines instead, these would all be grouped with your other SetEnvIf's at the top of the file.

Then, later, you can just add:

<Files ~ "^.*$">
order allow,deny
allow from all
deny from env=getout
</Files>

If <Directory> is used instead, that can also cause blow-ups, since <Directory> sections are not allowed inside .htaccess files.

If you look through the WebmasterWorld forums, there was a really long and good thread about this in the robot identification threads about a month or two ago, covering the proper structure and order of the elements you want to include. If I can find it I'll sticky you the URL.

I know from blowing up enough Perl code (always!) that if you start out simple, then add to what already works, line by line, you can more easily pinpoint exactly what-line-of-what finally caused it to crash. I'd get it as streamlined as possible initially, then begin adding lines while you debug it.
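As a starting point for that kind of line-by-line debugging, a stripped-down file might look like the sketch below. It uses only directives and values already posted in this thread; which single rules to begin with is an arbitrary choice.

```apache
# Step 1: keep the .htaccess file itself from being served.
<Files .htaccess>
Order Allow,Deny
Deny from all
</Files>

# Step 2: one user-agent rule to begin with.
SetEnvIfNoCase User-Agent ^Wget getout

# Step 3: one access block for the whole directory tree.
Order Allow,Deny
Allow from all
Deny from 62.7
Deny from env=getout

# Step 4: once /cgi-bin/ works with the lines above, add
# "Options -Indexes" and the rewrite rules back one at a time,
# testing a CGI request after each change.
```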

Key_Master




msg:1503988
 5:32 pm on Aug 4, 2002 (gmt 0)

Scooter24,

I'm just curious, what criteria do you ban agents based on? Looking over your list, it appears (to me) to be patched together from different lists.

Did you know that:

SetEnvIfNoCase User-Agent ^About getout
bans the About.com link verifier

SetEnvIfNoCase User-Agent ^IBrowse getout
bans Amiga browsers

SetEnvIfNoCase User-Agent ^IncyWincy getout
bans the IncyWincy search engine spider

SetEnvIf User-Agent iCab getout
bans Macintosh browsers

SetEnvIfNoCase User-Agent tivraspider getout
bans the Tivra spider (not even online now)

SetEnvIfNoCase User-Agent voilabot getout
bans the Voila search engine spider

If you have video downloads on your site
SetEnvIf User-Agent ^DA getout
bans a certain type of video player.

There are a lot more I haven't touched upon. Many of those user agents aren't even used anymore.

Don't take this the wrong way, but I think there is a good chance you are banning yourself. Some ISPs (mostly proxies) change the IP and/or agent when the user clicks on a /cgi-bin/ link.

Scooter24




msg:1503989
 9:33 pm on Aug 5, 2002 (gmt 0)

About, ibrowse and tivraspider were in several .htaccess files I found on the Internet, so there must be something wrong with them. There was a thread about incywincy explaining why this is a bad spider. icab can be used to make an offline copy of a site. I caught voilabot doing something not allowed on my site, so I banned it.
I also searched download.com for offline browsing software and banned everything I found.
My site has valuable content, so I ban everything which might even remotely be used to make a copy of it. I also installed several robot traps and ban the providers of users who have tried to download my site.
Before I set up all these things my site (270MB) had been downloaded three or four times in one month. Since then there have been several attempts to download the site, all unsuccessful. Just one, max. two "innocent" users/day can't access my site because of this.
I'm not paranoid, it's just that the site has valuable content. I don't mind if people make copies of individual files, but not of the ENTIRE site.
