White Space Question

deny from...

         

pendanticist

11:41 am on Jan 28, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



deny from 00.000.000.000

I've noticed some of my deny from(s) don't work and I'm thinking it has something to do with whether there should be a trailing white space at the end of the IP Number.

Is that correct? Does there need to be a white space there?

Thanks.

Pendanticist.

Brett_Tabke

7:10 am on Jan 29, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



White space = space, tab, or return. So yes...

pendanticist

8:11 am on Jan 29, 2003 (gmt 0)

"White space = space, tab, or return. So yes..."

  • To me, a 'White Space' is a single empty space created by the 'Space Bar'.

  • The 'Tab Key' either adds some number of White Spaces to a line (as when indenting in a text editor) or, when filling in forms, moves the cursor to the next data entry point.

  • As for a 'Return' (Enter Key), hitting the 'Enter Key' once breaks the line immediately, and it keeps breaking lines for as long as you hold it down. That leaves no white space at all, which is contrary to what one stroke of the 'Space Bar' does.

    If I'm understanding you correctly, you're saying all of the above are the same thing? If so, I'd have to disagree, as each performs an entirely different function.

    So, permit me to ask again, in order for the 'deny from(s)' to function as intended within my .htaccess file:

    Does there need to be a 'White Space' (created by hitting the Space Bar) at the end of each IP Number entry?

    Pendanticist

    Brett_Tabke

    8:17 am on Jan 29, 2003 (gmt 0)

    Sure it isn't a line length issue instead? How long is your line?

    I'm not positive how Apache parses that, or how big the buffer is for each line, but I've had to shorten lines and go multiline at times with "deny"s too.

    pendanticist

    8:20 am on Jan 29, 2003 (gmt 0)

    My line is just as long as I've posted.

    EX:

    deny from 12.42.212.67
    deny from 12.111.68.234
    deny from 12.158.101.121
    deny from 12.164.76.253
    deny from 12.220.132.94
    deny from 12.246.249.48
    deny from 12.246.182.31

    jdMorgan

    8:23 am on Jan 29, 2003 (gmt 0)

    pendanticist,

    I have never had to add white space, and the documentation does not mention it.

    May I suggest posting a "deny from" line and a line from the log showing it being violated? A "picture" is often worth a thousand words (or octets in this case). :)

    Any pattern to the denies that are being violated? Are allow,deny in the right order?

    Jim
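
For reference, here is a minimal sketch of the access-control block being discussed, assuming the Apache 1.3/2.0-era Order/Deny syntax used in this thread. The two addresses are taken from the example above, and nothing here is a confirmed fix. Going by the documented syntax, each directive only needs its own line: a trailing space after the address is not required, a comment cannot share a line with a directive, and the keywords after Order must be separated by a comma with no space.

```apache
# Sketch only: deny individual addresses, allow everyone else.
# With "Order Deny,Allow", Deny lines are evaluated first and any
# matching Allow line overrides them; clients matching neither are allowed.
# Note: "Deny,Allow" must not contain a space after the comma.
<Limit GET POST>
Order Deny,Allow
Deny from 12.42.212.67
Deny from 12.111.68.234
</Limit>
```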

    pendanticist

    8:24 am on Jan 29, 2003 (gmt 0)

    Gimme a few minutes to go dig around, Jim.

    Pendanticist.

    Key_Master

    8:25 am on Jan 29, 2003 (gmt 0)

    Also check and make sure your text editor isn't including hidden characters in the text (e.g. escape characters, etc.).

    pendanticist

    8:32 am on Jan 29, 2003 (gmt 0)

    "Also check and make sure your text editor isn't including hidden characters in the text (e.g. escape characters, etc.)."

    It doesn't.

    Still digging.

    pendanticist

    11:23 am on Jan 29, 2003 (gmt 0)

    Note: While examining my .htaccess file for 'deny from' evidence relating to White Spaces, I think I've discovered the real reason those deny from(s) have not been catching individually banned IP Numbers.

    Let me explain...

    151.204.36.34 - - [28/Jan/2003:07:03:33 -0800] "GET /cgi-bin/formmail.pl HTTP/1.1" 200 - "http*//blahblah.com/" "-"

    Even though I have deny from 151.204.36.34 in place, this request was not answered with a 403. (I ban formmail queries on a 'per instance' basis as I've not learned enough to do it other ways...yet.)

    While looking for those deny from(s) not functioning properly I noticed this:

    looksmart-sv-fw.looksmart.com - - [28/Jan/2003:07:20:31 -0800] "GET /1ABAw.html HTTP/1.1" 403 220 "-" "Mozilla/4.0 (compatible; grub-client-1.0.6; Crawl your own stuff with http*//grub.org)"
    looksmart-sv-fw.looksmart.com - - [28/Jan/2003:07:20:31 -0800] "GET /1ABAw.html HTTP/1.1" 403 220 "-" "Mozilla/4.0 (compatible; grub-client-1.0.6; Crawl your own stuff with http*//grub.org)"

    Thinking LS was still being 'allowed', I didn't notice it was being banned again until I began analysing my .htaccess file line-by-line to address the 'deny from' problem.

    You may recall I posted a query regarding Inktomi's use of grub clients, where I was provided (in part) with this:

    RewriteCond %{HTTP_USER_AGENT} ^Zeus [OR]
    RewriteCond %{REMOTE_HOST} !^looksmart
    RewriteCond %{HTTP_USER_AGENT} grub-client
    RewriteRule !^(403\.html|robots\.txt)$ - [F]

    (A method of allowing Ink's grub while banning all others.)

    Apparently, this has ceased to function which, in turn, appears to have an overall impact on the effectiveness of my .htaccess file...including 'deny from(s)'.

    A few minutes ago EmailSiphon visited (first time in many moons) and was no longer banned either.

    "Any pattern to the denies that are being violated?"

    Your question caused me to examine all 403s on a line-by-line basis, thereby finding that my true problem is a dysfunctional .htaccess file.

    For the most part, all I was seeing, over and over again, were non-403'd requests dealing with formmail queries from repeat offenders. I suspect that tended to somewhat narrow my focus. :o

    "Are allow,deny in the right order?"

    Here's what I have:

    # -FrontPage-

    IndexIgnore .htaccess */.?* *~ *# */HEADER* */README* */_vti*

    <Limit GET POST>
    order deny,allow
    deny from 12.42.212.67
    #Deleted.
    deny from 218.246.33.42
    </Limit>
    <Limit PUT DELETE>
    SetEnvIf User-Agent ^Link keep_out
    order deny,allow
    deny from all
    </Limit>
    #Deleted for privacy.
    #Deleted for privacy.
    #Deleted for privacy.
    ErrorDocument 404 /missing.html
    # Send a permanent redirect from our old file to our new file
    #Deleted for privacy.
    #Deleted for privacy.
    #Deleted for privacy.
    RewriteEngine On
    # RewriteCond %{HTTP_USER_AGENT} ^Mozilla* [OR]
    RewriteCond %{HTTP_USER_AGENT} AaronCarter [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^almaden [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Anarchie [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ASPSeek [OR]
    RewriteCond %{HTTP_USER_AGENT} ^attach [OR]
    RewriteCond %{HTTP_USER_AGENT} ^augurfind [OR]
    RewriteCond %{HTTP_USER_AGENT} ^BackWeb [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Bandit [OR]
    RewriteCond %{HTTP_USER_AGENT} ^BatchFTP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
    RewriteCond %{HTTP_USER_AGENT} ^BMCLIENT [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot@yahoo.com [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Buddy [OR]
    RewriteCond %{HTTP_USER_AGENT} ^bumblebee [OR]
    RewriteCond %{HTTP_USER_AGENT} copier [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^CherryPicker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^CICC [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Collector [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Crescent [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DA [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DISCo\ Pump [OR]
    RewriteCond %{HTTP_USER_AGENT} dloader [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^dloader(NaverRobot)/1.0 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Download\ Wonder [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Downloader [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Drip [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DSurf15a [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DTS\ Agent [OR]
    RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
    RewriteCond %{HTTP_USER_AGENT} EasyDL [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
    RewriteCond %{HTTP_USER_AGENT} EmailSiphon [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
    RewriteCond %{HTTP_USER_AGENT} ^FileHound [OR]
    RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
    RewriteCond %{HTTP_USER_AGENT} FrontPage [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^gazz/2.1 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR]
    RewriteCond %{HTTP_USER_AGENT} ^GetSmart [OR]
    RewriteCond %{HTTP_USER_AGENT} ^gigabaz [OR]
    RewriteCond %{HTTP_USER_AGENT} ^gotit [OR]
    RewriteCond %{HTTP_USER_AGENT} grab [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Grabber [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
    RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
    RewriteCond %{HTTP_USER_AGENT} grub-client [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} .*httrack.* [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} hhjhj@yahoo.com [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} human-guided@lerly.net [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} human-guided@mapfeatures.net [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
    RewriteCond %{HTTP_USER_AGENT} ^httpdown [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR]
    RewriteCond %{HTTP_USER_AGENT} Indy.Library [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^InetURL:/1.0 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
    RewriteCond %{HTTP_USER_AGENT} Internet\ Explore\ 5\.x [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^InternetLinkagent [OR]
    RewriteCond %{HTTP_USER_AGENT} ^InternetSeer.com [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Iria [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Java/1.1 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Java/1.4.1_01 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Java1.4.0_01 [OR]
    RewriteCond %{HTTP_USER_AGENT} Java1\.[0-9] [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^JoBo/1.3 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^JBH*agent [OR]
    RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR]
    RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR]
    RewriteCond %{HTTP_USER_AGENT} ^JustView [OR]
    RewriteCond %{HTTP_USER_AGENT} ^larbin2.6.2@unspecified.mail [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Lachesis [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} larbin [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^LexiBot/1.00 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^lftp [OR]
    RewriteCond %{HTTP_USER_AGENT} ^likse [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Link*Sleuth [OR]
    RewriteCond %{HTTP_USER_AGENT} LinkSweeper [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^LinkScan/11.0 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^LinkWalker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Magnet [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mag-Net [OR]
    RewriteCond %{HTTP_USER_AGENT} ^MFC\_Tear\_Sample [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Memo [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Microsoft.URL [OR]
    RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mirror [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*NEWT [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mozilla*MSIECrawler [OR]
    RewriteCond %{HTTP_USER_AGENT} ^MSIECrawler [OR]
    RewriteCond %{HTTP_USER_AGENT} ^MSProxy [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetMechanic [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NICErsPRO [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Ninja [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NPBot-1/2.0 [OR]
    RewriteCond %{HTTP_USER_AGENT} oBot [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
    RewriteCond %{HTTP_USER_AGENT} offline [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Openfind [OR]
    RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR]
    RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Ping [OR]
    RewriteCond %{HTTP_USER_AGENT} ^PingALink [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Pockey [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Pump [OR]
    RewriteCond %{HTTP_USER_AGENT} ^QRVA [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Reaper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Recorder [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
    RewriteCond %{HTTP_USER_AGENT} spider [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Seeker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SecretBrowser/007 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Siphon [OR]
    RewriteCond %{HTTP_USER_AGENT} ^sitecheck.internetseer.com [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SlySearch [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Snake [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SpaceBison [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SSM\ Agent\ 1.0 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Stripper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Sucker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SURF [OR]
    RewriteCond %{HTTP_USER_AGENT} Szukacz [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Vacuum [OR]
    RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Website [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Webster [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
    RewriteCond %{HTTP_USER_AGENT} ^webcollage/1.93 [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebHook [OR]
    RewriteCond %{HTTP_USER_AGENT} ^[Ww]eb[Bb]andit [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebMiner [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebMirror [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Website [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR]
    RewriteCond %{HTTP_USER_AGENT} Watchfire [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
    RewriteCond*%{HTTP_USER_AGENT}*^Whacker*[OR]*
    RewriteCond*%{HTTP_USER_AGENT}*^Whizbang*[OR]*
    RewriteCond*%{HTTP_USER_AGENT}*^Widow*[OR]*
    RewriteCond*%{HTTP_USER_AGENT}*^Xaldon\ WebSpider*[OR]*
    RewriteCond*%{HTTP_USER_AGENT}*^x-Tractor*[OR]*
    RewriteCond*%{HTTP_USER_AGENT}*^XupiterToolbar*[OR]*
    RewriteCond*%{HTTP_USER_AGENT}*^Zeus [OR]*
    RewriteCond*%{REMOTE_HOST}*!^looksmart*
    RewriteCond*%{HTTP_USER_AGENT}*grub-client*
    RewriteRule*!^(403\.html|robots\.txt)$*-*[F]*
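
A possible explanation for the behaviour described above, offered as a sketch rather than a confirmed diagnosis: in mod_rewrite, conditions joined by [OR] form one group, and that group is then ANDed with any conditions that follow it. Because the ^Zeus line carries [OR], the whole user-agent list is ORed with the looksmart host test and then ANDed with the final grub-client condition, so the rule can only fire for user agents containing grub-client. That would fit both symptoms: the looksmart grub client gets a 403 (it matches the grub-client [NC,OR] line in the list), while EmailSiphon does not. One way to separate the two intentions, assuming the rule's alternation character is a plain | and that a space precedes each !, is to use two rules:

```apache
# Sketch only; the user-agent list is abbreviated to three entries.
# Rule 1: ban listed user agents. The last condition has no [OR],
# and grub-client is left out of this list so that rule 2 decides it.
RewriteCond %{HTTP_USER_AGENT} EmailSiphon [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
RewriteRule !^(403\.html|robots\.txt)$ - [F]

# Rule 2: ban grub-client unless the request comes from a looksmart host.
# REMOTE_HOST only holds a name if the server resolves hostnames,
# which the log excerpt above suggests it does.
RewriteCond %{REMOTE_HOST} !^looksmart
RewriteCond %{HTTP_USER_AGENT} grub-client
RewriteRule !^(403\.html|robots\.txt)$ - [F]
```

The RewriteEngine On line from the listing is still needed once, ahead of both rules.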

    When I tried adding these the other day, I got repeated 500 server errors:

    RewriteCond %{HTTP_USER_AGENT} Web Downloader [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^bdcindexer_2.6.2\ research@bdc [OR]

    I tried to massage them a couple of times finally giving up.
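
A likely cause of those 500 errors, stated as an assumption since the error log is not shown: the space in Web Downloader is not escaped, so mod_rewrite reads Downloader as a separate argument and rejects the directive, and a syntax error in .htaccess is reported as a 500. Here is a sketch of the same two lines with the space escaped, following the backslash-space convention used in the list above (the dots are also escaped so that they match literally):

```apache
# Sketch only: a space inside a pattern must be escaped (or the pattern quoted).
RewriteCond %{HTTP_USER_AGENT} Web\ Downloader [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^bdcindexer_2\.6\.2\ research@bdc [OR]
```

Quoting the whole pattern, as in "Web Downloader", is an equivalent alternative. Both lines end in [OR], so they would need to sit among the other [OR] conditions, ahead of the final one.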

    My apologies for misleading anyone by assuming my problem was solely a White Space 'deny from' issue.

    Although, for the sake of argument, we could expand the White Space issue to include the entire .htaccess file. That is to say, I know there needs to be one White Space wherever a * appears in the line below (each * stands for one White Space):

    RewriteCond*%{HTTP_USER_AGENT}*^XupiterToolbar*[OR]*

    The thread "User-Agent"? [webmasterworld.com] was but one approach I took to better understand the mechanics of both .htaccess and robots.txt, as well as to gain a better understanding of White spaces [webmasterworld.com] and their impact.

    Having said all that, I still have to wonder whether, or how much, White Spaces bear on the overall functionality of .htaccess and robots.txt, aside from any core problems I may have within my own .htaccess file shown above.

    As much as I dislike disjointed conversations inherent in forums (the lack of continuity), I have some appointments in a couple of hours and must get ready for them shortly.

    Thank You :)

    Pendanticist.
