Thank you all in advance for providing such valuable help, and please excuse any inconvenience. I appreciate any help :)
Kind Regards,
Lumi
P.S. I know that it's not OK to post a full .htaccess, and mine is big, so I took out some IPs (but I have lots of them).
Another thing I'd like help with is setting up my own custom 403/404 error pages, and also avoiding the confusion of my custom 403 page being requested by banned people.
Note: as I don't know how to work with PHP, I only have HTML files on my site, so my error pages will always be HTML.
Code (sorry for it being so big):
RewriteEngine on
# -FrontPage-
IndexIgnore .htaccess */.?* *~ *# */HEADER* */README* */_vti*
<Limit GET POST>
#The next line modified by DenyIP
order allow,deny
#The next line modified by DenyIP
#deny from all
allow from all
</Limit>
<Limit PUT DELETE>
order deny,allow
deny from all
</Limit>
AuthName simcredibledesigns.com
AuthUserFile /home/simc/public_html/_vti_pvt/service.pwd
AuthGroupFile /home/simc/public_html/_vti_pvt/service.grp
<Files 403.shtml>
order allow,deny
allow from all
</Files>
deny from 82.128.214.104
deny from 210.153.217.137
Options All -Indexes
deny from 81.214.175.0
deny from 32.179.10.128
(more and more banned IPs here...)
Options +FollowSymlinks
RewriteCond %{HTTP_REFERER} !^http://([-a-z0-9]+\.)?simcredibledesigns\.com(/.*)?$ [NC]
RewriteRule ^(.*)\.(gif|jpe?g|png|ico|rar|zip)$ /file.jpg?$1.$2 [NC]
# Forbid if blank (or "-") Referer *and* UA
RewriteCond %{HTTP_REFERER} ^-?$
RewriteCond %{HTTP_USER_AGENT} ^-?$
RewriteRule .* - [F]
# Address harvesters
RewriteCond %{HTTP_USER_AGENT} ^(autoemailspider|ExtractorPro) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^E?Mail.?(Collect|Harvest|Magnet|Reaper|Siphon|Sweeper|Wolf) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (DTS.?Agent|Email.?Extrac) [NC,OR]
RewriteCond %{HTTP_REFERER} iaea\.org [NC,OR]
# Download managers
RewriteCond %{HTTP_USER_AGENT} ^(Alligator|DA.?[0-9]|DC\-Sakura|Download.?(Demon|Express|Master|Wonder)|FileHound) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(Flash|Leech)Get [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(Fresh|Lightning|Mass|Real|Smart|Speed|Star).?Download(er)? [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(Gamespy|Go!Zilla|iGetter|JetCar|Net(Ants|Pumper)|SiteSnagger|Teleport.?Pro|WebReaper) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(My)?GetRight [NC,OR]
# Image-grabbers
RewriteCond %{HTTP_USER_AGENT} ^(AcoiRobot|FlickBot|webcollage) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(Express|Mister|Web).?(Web|Pix|Image).?(Pictures|Collector)? [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Image.?(fetch|Stripper|Sucker) [NC,OR]
# "Gray-hats"
RewriteCond %{HTTP_USER_AGENT} ^(Atomz|BlackWidow|BlogBot|EasyDL|Marketwave|Sqworm|SurveyBot|Webclipping\.com) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (girafa\.com|gossamer\-threads\.com|grub\-client|Netcraft|Nutch) [NC,OR]
# Site-grabbers
RewriteCond %{HTTP_USER_AGENT} ^(eCatch|(Get|Super)Bot|Kapere|HTTrack|JOC|Offline|UtilMind|Xaldon) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Web.?(Auto|Cop|dup|Fetch|Filter|Gather|Go|Leach|Mine|Mirror|Pix|QL|RACE|Sauger) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Web.?(site.?(eXtractor|Quester)|Snake|ster|Strip|Suck|vac|walk|Whacker|ZIP) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} WebCapture [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo\ Pump [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^NetZIP [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} ^Baiduspider [OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Twiceler* [OR]
RewriteCond %{HTTP_USER_AGENT} ^.*NewsGatorOnline* [OR]
RewriteCond %{HTTP_USER_AGENT} Ask.Jeeves [OR]
RewriteCond %{HTTP_USER_AGENT} ^FAST-WebCrawl [OR]
RewriteCond %{HTTP_USER_AGENT} ^ia\_archiver [OR]
RewriteCond %{HTTP_USER_AGENT} ^.*wget* [OR]
RewriteCond %{HTTP_USER_AGENT} ^Hatena\ Antenna [OR]
RewriteCond %{HTTP_USER_AGENT} InfoSeek [OR]
RewriteCond %{HTTP_USER_AGENT} ^Scooter [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teoma [OR]
RewriteCond %{HTTP_USER_AGENT} VoilaBot [OR]
RewriteCond %{HTTP_USER_AGENT} Inktomi [NC,OR]
RewriteCond %{HTTP_USER_AGENT} !zyborg [NC,OR]
RewriteCond %{HTTP_USER_AGENT} !webcrawler [NC,OR]
RewriteCond %{HTTP_USER_AGENT} !^Gigabot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} !scrubby [NC,OR]
RewriteCond %{HTTP_USER_AGENT} !ImageScape [NC,OR]
RewriteCond %{HTTP_USER_AGENT} !^Mozilla\ 3\.01 [NC,OR]
RewriteCond %{HTTP_USER_AGENT} !CydralSpider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} !^Mozilla\ 3\.01 [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*foxtorrent* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^MEGAUPLOAD [OR]
RewriteCond %{HTTP_USER_AGENT} Msnbot|Slurp [NC,OR]
# Tools
RewriteCond %{HTTP_USER_AGENT} ^(curl|Dart.?Communications|Enfish|htdig|Java|larbin) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (FrontPage|Indy.?Library|RPT\-HTTPClient) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(libwww|lwp|PHP|Python|www\.thatrobotsite\.com|webbandit|Wget|Zeus) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(Microsoft|MFC).(Data|Internet|URL|WebDAV|Foundation).(Access|Explorer|Control|MiniRedir|Class) [NC,OR]
# Unknown
RewriteCond %{HTTP_USER_AGENT} ^(Crawl_Application|Lachesis|Nutscrape) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^[CDEFPRS](Browse|Eval|Surf) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(Demo|Full.?Web|Lite|Production|Franklin|Missauga|Missigua).?(Bot|Locat) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (efp@gmx\.net|hhjhj@yahoo\.com|lerly\.net|mapfeatures\.net|metacarta\.com) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(Industry|Internet|IUFW|Lincoln|Missouri|Program).?(Program|Explore|Web|State|College|Shareware) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(Mac|Ram|Educate|WEP).?(Finder|Search) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^(Moz+illa|MSIE).?[0-9]?.?[0-9]?[0-9]?$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/[0-9]\.[0-9][0-9]?.\(compatible[\)\ ] [NC,OR]
RewriteCond %{HTTP_USER_AGENT} NaverRobot [NC]
RewriteCond %{REQUEST_URI} !(/$|^$)
RewriteRule .* - [F,L]
IMO, and considering your newfound understanding of htaccess:
You're simply attempting to accomplish too much too soon.
Copying and pasting multiple excerpts from various examples is not always good practice. Many of the UAs you're using are not even applicable today.
In addition, hopefully you understand the basic difference and/or applied theory of: "begins with"; "contains"; "ends with"?
In your RewriteCond lines.
I would also encourage you NOT to focus on exact individual IP addresses (the "Class D" level, i.e. the last octet):
EX:
#yours denies only that one exact address
deny from 81.214.175.0
#ask yourself if traffic from Turkey is beneficial to your site and whether the entire provider range may be denied. (a bad bot may simply return on 81.214.175.1 or alternatives.)
#the provider's entire range
deny from 81.214.0.0/16
deny from 81.214.128.0/17
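As a rough sketch of the difference between denying one address and denying a range, the forms below are all accepted by the classic "deny from" directive. The addresses simply reuse the ones quoted above; which block really belongs to the provider is an assumption that should be checked with a whois lookup before anything is denied.

```apache
# Allow everyone first, then subtract the deny lines below
order allow,deny
allow from all
# One exact address only; 81.214.175.1 is still allowed
deny from 81.214.175.0
# Partial address: every host from 81.214.175.0 to 81.214.175.255
deny from 81.214.175
# The same 256 addresses written in CIDR notation
deny from 81.214.175.0/24
# Wider CIDR block: 81.214.128.0 to 81.214.255.255
deny from 81.214.128.0/17
```

Because 81.214.128.0/17 lies entirely inside 81.214.0.0/16, listing both is redundant; either line on its own already covers 81.214.175.0.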
I would encourage you to make small additions to your htaccess, verifying your website's functionality after each change, and to proceed over weeks and months as you come to understand not only what others are doing, but also what is beneficial or detrimental to your own site.
quote:
"In addition, hopefully you understand the basic difference and/or applied theory of: "begins with"; "contains"; "ends with"?
In your RewriteCond lines"
---*shame*... I think I don't understand it very well...
I read the mod_rewrite forum, all of the intro part, but I was still confused. I use translators most of the time and sometimes it gets messy. I have been trying to read the instructions carefully when copying these codes, because I really don't understand them correctly.
Sometimes I think I may ban innocent users, but I am so tired of paying extra for bandwidth; I have been abused on this for so long... At least it seems that those automated downloads have stopped. I used to have 13 GB in a day and things are calmer now.
All I dream of is to allow people to navigate and get downloads only from within my site and not through automated abusive programs. If a visitor uses such programs, he/she is of no interest to my site anymore. That's not elegant, I confess... but it is the truth. I feel so tired of paying for being abused...
Is there anything I can do to fix the outdated parts of my code?
Thank you for your patience :)
Take each user-agent string (or sub-string) specified in those RewriteConds, and search your monthly raw server access log file for that user-agent. If that user-agent is not abusing your site, then remove the RewriteCond from your .htaccess file.
Basically, get rid of any lines that are not benefitting your site.
Jim
And about the Turkey IP: yes... if it is not a bot, that traffic still interests my site.
The IP range I provided will not ban the entire country of Turkey from your site.
Instead, it was meant to offer you the possibility of expanding the ranges you deny.
In the long run, this kind of action results in less maintenance and updating, especially given that all the harvester needs to do is disconnect and reconnect on a new IP address, then resume his harvesting/downloading.
Jim's suggestion that you check this outdated list against your past months' visitor User Agents (modifying it in the process) is where I suggest you begin your learning process.
Review the effect and make adjustments as you go along.
Currently you're just overwhelmed by attempting to accomplish too much, too soon.
For the list of "bad user agents" like getright and the others, delete the ones that never visit your site, and keep the ones that you always see trying to waste your bandwidth. Some of those user agents are very old and are never used any more. It is not that the bad people are gone now, though. It is just that they have improved their scripts to use "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1..." In other words the old bad user agents now identify themselves as normal browsers to bypass old scripts like the ones you copied. :(
When dealing with user-agent strings, you have several options. You can specify complete (exact) user agent strings or partial user-agent strings. When specifying partial user-agent strings, you can specify that you want to match only the beginning, only the end, or neither.
So:
. ^bad-bot$ matches only a user agent string that is exactly "bad-bot"
. ^bad-bot matches any user-agent string that starts with "bad-bot"
. bad-bot$ matches any user-agent string that ends with "bad-bot"
and
. bad-bot matches any user-agent string that contains "bad-bot"
In this way, you can make your patterns more precise (when needed) to avoid blocking 'good' user-agents, and you can make them less precise in order to block more than one bad-bot with the same line of code.
For example,
RewriteCond %{HTTP_USER_AGENT} download [NC,OR]
RewriteCond %{HTTP_USER_AGENT} e-?mail [NC,OR]
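To show how the four anchoring forms fit into one complete rule, here is a minimal sketch. The user-agent names ("bad-bot", "grabber", "sucker/1.0", "harvest") are placeholders invented for illustration, not agents taken from anyone's logs. Note that the last RewriteCond in the list carries no [OR] flag, so the list ends cleanly before the RewriteRule.

```apache
RewriteEngine on
# Exact: the whole user-agent string is "bad-bot"
RewriteCond %{HTTP_USER_AGENT} ^bad-bot$ [NC,OR]
# Begins with "grabber"
RewriteCond %{HTTP_USER_AGENT} ^grabber [NC,OR]
# Ends with "sucker/1.0" (the dot is escaped so it matches a literal dot)
RewriteCond %{HTTP_USER_AGENT} sucker/1\.0$ [NC,OR]
# Contains "harvest" anywhere; last condition, so no OR flag
RewriteCond %{HTTP_USER_AGENT} harvest [NC]
# Any request matching one of the conditions gets 403 Forbidden
RewriteRule .* - [F]
```

The unanchored "contains" form is the broadest of the four, so it is worth checking the access log for legitimate agents that happen to include the same substring before using it.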
For more information on regular expressions pattern matching, see the tutorial cited in our Apache Forum Charter.
Jim
There was (and possibly still is) a method of using "exactly as" with either the "begins", "ends" or "contains" options.
I used the procedure successfully for years; however, with the various Apache updates, it works on some servers and NOT on others.
When the phrase was enclosed in quotes, it was unnecessary to escape either spaces or characters.
EX:
"the red house 1.0"
which we would normally express as:
the\ red\ house\ 1\.0
What makes this controversial (besides working or not depending on the server) is that some Apache tutorials suggest that all Rewrite lines should, or even must, be enclosed in "quotes", and nothing could be further from the truth.
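For comparison, the two spellings described above might look like this inside a rule. This is only a sketch: "the red house 1.0" is the made-up user-agent from this post, and since quoted patterns reportedly have not behaved the same on every server, test before relying on them. The dot is escaped in both lines on purpose: quoting keeps the config parser from splitting the pattern at the spaces, but a bare dot is still a regular-expression wildcard whether or not the pattern is quoted.

```apache
# Quoted: the spaces need no backslash
RewriteCond %{HTTP_USER_AGENT} "^the red house 1\.0" [NC,OR]
# Unquoted: each space must be escaped
RewriteCond %{HTTP_USER_AGENT} ^the\ red\ house\ 1\.0 [NC]
RewriteRule .* - [F]
```

The two conditions are meant to be equivalent and are shown together only for comparison; a real file would keep just one of them.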