White space = space, tab, or return. So yes...
...breaks the line immediately, and keeps doing so for as long as you hold it down. That would leave no white space at all, which is contrary to what one stroke of the 'Space Bar' does.
If I'm understanding you correctly, you're saying either of the above is the same thing? If that is the case, I'd have to disagree as each performs an entirely different function.
So, permit me to ask again, in order for the 'deny from(s)' to function as intended within my .htaccess file:
Does there need to be a 'White Space' (created by hitting the Space Bar) at the end of each IP Number entry?
Pendanticist
I have never had to add white space, and the documentation does not mention it.
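A minimal sketch of what I mean, using a made-up placeholder address (192.0.2.10 is not one of the IPs discussed here). As far as I know, Apache strips leading and trailing white space from each line when it reads a config file, and treats any run of spaces or tabs between arguments as one separator, so the two deny lines below should behave identically:

```apache
# Hypothetical example; 192.0.2.10 is a placeholder address.
<Limit GET POST>
order deny,allow
# Plain form, nothing after the address:
deny from 192.0.2.10
# Same directive with extra spacing between the words; a trailing
# space at the end of the line should be ignored in the same way:
deny   from   192.0.2.10
</Limit>
```

What matters is that each directive sits on its own line; a trailing space neither helps nor hurts.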
May I suggest posting a "deny from" line and a line from the log showing it being violated? A "picture" is often worth a thousand words (or octets in this case). :)
Any pattern to the denies that are being violated? Are allow,deny in the right order?
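To make the order question concrete, here is a hedged sketch of the two usual blacklist layouts as I understand the mod_access rules. The address is a placeholder, and the two layouts are alternatives, not meant to be used together in the same section:

```apache
# Layout 1: the default is to allow; the listed address is refused.
order deny,allow
deny from 192.0.2.10

# Layout 2: the default is to deny, so "allow from all" is needed;
# the deny line then overrides it for the listed address.
order allow,deny
allow from all
deny from 192.0.2.10
```

In both layouts the keywords after "order" are separated by a comma only; a space after the comma is not accepted.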
Jim
Let me explain...
151.204.36.34 - - [28/Jan/2003:07:03:33 -0800] "GET /cgi-bin/formmail.pl HTTP/1.1" 200 - "http*//blahblah.com/" "-"
Although I have deny from 151.204.36.34 in place, this request was not answered with a 403. (I ban formmail queries on a 'per instance' basis as I've not learned enough to do it other ways...yet.)
While looking for those deny from(s) not functioning properly I noticed this:
looksmart-sv-fw.looksmart.com - - [28/Jan/2003:07:20:31 -0800] "GET /1ABAw.html HTTP/1.1" 403 220 "-" "Mozilla/4.0 (compatible; grub-client-1.0.6; Crawl your own stuff with http*//grub.org)"
looksmart-sv-fw.looksmart.com - - [28/Jan/2003:07:20:31 -0800] "GET /1ABAw.html HTTP/1.1" 403 220 "-" "Mozilla/4.0 (compatible; grub-client-1.0.6; Crawl your own stuff with http*//grub.org)"
RewriteCond %{HTTP_USER_AGENT} ^Zeus [OR]
RewriteCond %{REMOTE_HOST} !^looksmart
RewriteCond %{HTTP_USER_AGENT} grub-client
RewriteRule !^(403\.html|robots\.txt)$ - [F]
(A method of allowing Ink's grub while banning all others.)
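For reference, a hedged sketch of how that intent could be written as a rule of its own. It assumes mod_rewrite's documented behavior that consecutive RewriteCond lines are ANDed unless a line carries [OR]; it is an illustration, not the rule as it appears in the .htaccess file:

```apache
# Sketch only: forbid grub-client unless the request comes
# from a host whose name starts with "looksmart".
RewriteCond %{HTTP_USER_AGENT} grub-client [NC]
RewriteCond %{REMOTE_HOST} !^looksmart [NC]
RewriteRule !^(403\.html|robots\.txt)$ - [F]
```

Note the space before the ! and the solid pipe | in the alternation. As I read the mod_rewrite documentation, if two such conditions are appended to a chain of [OR] lines instead, the final condition without [OR] is ANDed with the whole chain, so the combined rule matches only when that last condition is also true.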
Apparently, this has ceased to function, which in turn appears to affect the overall effectiveness of my .htaccess file, including the 'deny from(s)'.
A few minutes ago EmailSiphon visited (first time in many moons) and was no longer banned either.
Any pattern to the denies that are being violated?
Your question caused me to examine all 403s on a line-by-line basis, thereby finding that my true problem is a dysfunctional .htaccess file.
For the most part, all I was seeing, over and over again, were non-403'd requests dealing with formmail queries from repeat offenders. I suspect that tended to somewhat narrow my focus. :o
Are allow,deny in the right order?
Here's what I have:
# -FrontPage-
IndexIgnore .htaccess */.?* *~ *# */HEADER* */README* */_vti*
<Limit GET POST>
order deny,allow
deny from 12.42.212.67
#Deleted.
deny from 218.246.33.42
</Limit>
<Limit PUT DELETE>
SetEnvIf User-Agent ^Link keep_out
order deny,allow
deny from all
</Limit>
#Deleted for privacy.
#Deleted for privacy.
#Deleted for privacy.
ErrorDocument 404 /missing.html
# Send a permanent redirect from our old file to our new file
#Deleted for privacy.
#Deleted for privacy.
#Deleted for privacy.
RewriteEngine On
# RewriteCond %{HTTP_USER_AGENT} ^Mozilla* [OR]
RewriteCond %{HTTP_USER_AGENT} AaronCarter [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^almaden [OR]
RewriteCond %{HTTP_USER_AGENT} ^Anarchie [OR]
RewriteCond %{HTTP_USER_AGENT} ^ASPSeek [OR]
RewriteCond %{HTTP_USER_AGENT} ^attach [OR]
RewriteCond %{HTTP_USER_AGENT} ^augurfind [OR]
RewriteCond %{HTTP_USER_AGENT} ^BackWeb [OR]
RewriteCond %{HTTP_USER_AGENT} ^Bandit [OR]
RewriteCond %{HTTP_USER_AGENT} ^BatchFTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
RewriteCond %{HTTP_USER_AGENT} ^BMCLIENT [OR]
RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot@yahoo.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^Buddy [OR]
RewriteCond %{HTTP_USER_AGENT} ^bumblebee [OR]
RewriteCond %{HTTP_USER_AGENT} copier [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^CherryPicker [OR]
RewriteCond %{HTTP_USER_AGENT} ^CICC [OR]
RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
RewriteCond %{HTTP_USER_AGENT} ^Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^Crescent [OR]
RewriteCond %{HTTP_USER_AGENT} ^DA [OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo\ Pump [OR]
RewriteCond %{HTTP_USER_AGENT} dloader [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^dloader(NaverRobot)/1.0 [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Wonder [OR]
RewriteCond %{HTTP_USER_AGENT} ^Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
RewriteCond %{HTTP_USER_AGENT} ^Drip [OR]
RewriteCond %{HTTP_USER_AGENT} ^DSurf15a [OR]
RewriteCond %{HTTP_USER_AGENT} ^DTS\ Agent [OR]
RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
RewriteCond %{HTTP_USER_AGENT} EasyDL [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} EmailSiphon [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
RewriteCond %{HTTP_USER_AGENT} ^FileHound [OR]
RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
RewriteCond %{HTTP_USER_AGENT} FrontPage [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^gazz/2.1 [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetSmart [OR]
RewriteCond %{HTTP_USER_AGENT} ^gigabaz [OR]
RewriteCond %{HTTP_USER_AGENT} ^gotit [OR]
RewriteCond %{HTTP_USER_AGENT} grab [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Grabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
RewriteCond %{HTTP_USER_AGENT} grub-client [NC,OR]
RewriteCond %{HTTP_USER_AGENT} .*httrack.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} hhjhj@yahoo.com [NC,OR]
RewriteCond %{HTTP_USER_AGENT} human-guided@lerly.net [NC,OR]
RewriteCond %{HTTP_USER_AGENT} human-guided@mapfeatures.net [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
RewriteCond %{HTTP_USER_AGENT} ^httpdown [OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} Indy.Library [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^InetURL:/1.0 [OR]
RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
RewriteCond %{HTTP_USER_AGENT} Internet\ Explore\ 5\.x [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^InternetLinkagent [OR]
RewriteCond %{HTTP_USER_AGENT} ^InternetSeer.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} ^Iria [OR]
RewriteCond %{HTTP_USER_AGENT} ^Java/1.1 [OR]
RewriteCond %{HTTP_USER_AGENT} ^Java/1.4.1_01 [OR]
RewriteCond %{HTTP_USER_AGENT} ^Java1.4.0_01 [OR]
RewriteCond %{HTTP_USER_AGENT} Java1\.[0-9] [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^JoBo/1.3 [OR]
RewriteCond %{HTTP_USER_AGENT} ^JBH*agent [OR]
RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR]
RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR]
RewriteCond %{HTTP_USER_AGENT} ^JustView [OR]
RewriteCond %{HTTP_USER_AGENT} ^larbin2.6.2@unspecified.mail [OR]
RewriteCond %{HTTP_USER_AGENT} ^Lachesis [NC,OR]
RewriteCond %{HTTP_USER_AGENT} larbin [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^LexiBot/1.00 [OR]
RewriteCond %{HTTP_USER_AGENT} ^lftp [OR]
RewriteCond %{HTTP_USER_AGENT} ^likse [OR]
RewriteCond %{HTTP_USER_AGENT} ^Link*Sleuth [OR]
RewriteCond %{HTTP_USER_AGENT} LinkSweeper [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^LinkScan/11.0 [OR]
RewriteCond %{HTTP_USER_AGENT} ^LinkWalker [OR]
RewriteCond %{HTTP_USER_AGENT} ^Magnet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mag-Net [OR]
RewriteCond %{HTTP_USER_AGENT} ^MFC\_Tear\_Sample [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^Memo [OR]
RewriteCond %{HTTP_USER_AGENT} ^Microsoft.URL [OR]
RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mirror [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*NEWT [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla*MSIECrawler [OR]
RewriteCond %{HTTP_USER_AGENT} ^MSIECrawler [OR]
RewriteCond %{HTTP_USER_AGENT} ^MSProxy [OR]
RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetMechanic [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^NICErsPRO [OR]
RewriteCond %{HTTP_USER_AGENT} ^Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} ^NPBot-1/2.0 [OR]
RewriteCond %{HTTP_USER_AGENT} oBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
RewriteCond %{HTTP_USER_AGENT} offline [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR]
RewriteCond %{HTTP_USER_AGENT} ^Openfind [OR]
RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR]
RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
RewriteCond %{HTTP_USER_AGENT} ^Ping [OR]
RewriteCond %{HTTP_USER_AGENT} ^PingALink [OR]
RewriteCond %{HTTP_USER_AGENT} ^Pockey [OR]
RewriteCond %{HTTP_USER_AGENT} ^Pump [OR]
RewriteCond %{HTTP_USER_AGENT} ^QRVA [OR]
RewriteCond %{HTTP_USER_AGENT} ^Reaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^Recorder [OR]
RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
RewriteCond %{HTTP_USER_AGENT} spider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Seeker [OR]
RewriteCond %{HTTP_USER_AGENT} ^SecretBrowser/007 [OR]
RewriteCond %{HTTP_USER_AGENT} ^Siphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^sitecheck.internetseer.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
RewriteCond %{HTTP_USER_AGENT} ^SlySearch [OR]
RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^Snake [OR]
RewriteCond %{HTTP_USER_AGENT} ^SpaceBison [OR]
RewriteCond %{HTTP_USER_AGENT} ^SSM\ Agent\ 1.0 [OR]
RewriteCond %{HTTP_USER_AGENT} ^Stripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^SURF [OR]
RewriteCond %{HTTP_USER_AGENT} Szukacz [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR]
RewriteCond %{HTTP_USER_AGENT} ^Vacuum [OR]
RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website [OR]
RewriteCond %{HTTP_USER_AGENT} ^Webster [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
RewriteCond %{HTTP_USER_AGENT} ^webcollage/1.93 [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebHook [OR]
RewriteCond %{HTTP_USER_AGENT} ^[Ww]eb[Bb]andit [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebMiner [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebMirror [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR]
RewriteCond %{HTTP_USER_AGENT} Watchfire [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond*%{HTTP_USER_AGENT}*^Whacker*[OR]*
RewriteCond*%{HTTP_USER_AGENT}*^Whizbang*[OR]*
RewriteCond*%{HTTP_USER_AGENT}*^Widow*[OR]*
RewriteCond*%{HTTP_USER_AGENT}*^Xaldon\ WebSpider*[OR]*
RewriteCond*%{HTTP_USER_AGENT}*^x-Tractor*[OR]*
RewriteCond*%{HTTP_USER_AGENT}*^XupiterToolbar*[OR]*
RewriteCond*%{HTTP_USER_AGENT}*^Zeus [OR]*
RewriteCond*%{REMOTE_HOST}*!^looksmart*
RewriteCond*%{HTTP_USER_AGENT}*grub-client*
RewriteRule*!^(403\.html|robots\.txt)$*-*[F]*
When I tried adding these the other day, I got repeated 500 server errors:
RewriteCond %{HTTP_USER_AGENT} Web Downloader [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^bdcindexer_2.6.2\ research@bdc [OR]
I tried to massage them a couple of times before finally giving up.
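One possible cause, offered as a guess and not a diagnosis: RewriteCond accepts a test string, a pattern, and an optional flags argument, and the arguments are split on white space. An unescaped space inside a pattern therefore produces an extra argument, which Apache rejects when it parses the file, and a parse error in .htaccess shows up as a 500. A sketch of the first line with the space kept inside a single argument (use one form or the other, not both):

```apache
# Fragment meant to sit inside the longer [OR] list.
# Option 1: escape the space with a backslash, like the other entries.
RewriteCond %{HTTP_USER_AGENT} Web\ Downloader [NC,OR]
# Option 2: quote the whole pattern so it stays one argument.
RewriteCond %{HTTP_USER_AGENT} "Web Downloader" [NC,OR]
```

The second line already escapes its space, so this would not explain a failure there.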
My apologies for misleading anyone by assuming my problem was solely a White Space 'deny from' issue.
Although, for the sake of argument, we could expand the White Space issue to include the entire .htaccess file. That is to say, I know there needs to be one White Space at each * (where the * stands for one White Space):
RewriteCond*%{HTTP_USER_AGENT}*^XupiterToolbar*[OR]*
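Spelled out with real spaces, and assuming (as I understand Apache's parser) that any run of spaces or tabs counts as a single separator, the line would look like the first entry below. The one place where a stray space does hurt is inside the flags brackets:

```apache
# Fragment meant to sit inside a longer [OR] list.
# One space between each argument:
RewriteCond %{HTTP_USER_AGENT} ^XupiterToolbar [OR]
# Extra spaces between arguments should be equivalent:
RewriteCond   %{HTTP_USER_AGENT}   ^XupiterToolbar   [OR]
# Not valid: the space after the comma splits the flags into two arguments.
# RewriteCond %{HTTP_USER_AGENT} ^XupiterToolbar [NC, OR]
```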
"User-Agent"? [webmasterworld.com] was but one critical thinking approach I applied to better understand the schematics of both .htaccess and robots.txt as well as gaining a better understanding of White spaces [webmasterworld.com] and their impact.
Having said all that, I still have to wonder whether, and how much, White Spaces bear on the overall functionality of .htaccess and robots.txt, aside from any core problems I may have within my own .htaccess file shown above.
As much as I dislike disjointed conversations inherent in forums (the lack of continuity), I have some appointments in a couple of hours and must get ready for them shortly.
Thank You :)
Pendanticist.