
Apache Web Server Forum

Deny Access to a website
www.baidu.com Abuse !
mel the snowbird




msg:4527847
 6:02 pm on Dec 13, 2012 (gmt 0)

Hi:
I would like to 'Deny Access' to anyone coming to my home-based Apache 2.2.22 server via www.baidu.com

How do I do this please ?

(btw, I am having a continuous series of DoS attacks from a person who uses a wide variety of IP addresses with this attack method. Currently, I 'Deny Access' to any IP address I note associated with this 'baidu' attack -- but the attacker uses *many* different IP addresses, and *each* attack has 'http://www.baidu.com' in the access.log file.)

Thank you.

Mel Smith
Mesa, Arizona

 

wilderness




msg:4527856
 6:32 pm on Dec 13, 2012 (gmt 0)

I just answered this 90 minutes ago in the thread immediately following yours.

Also gave you an example back in September.

mel the snowbird




msg:4527892
 8:20 pm on Dec 13, 2012 (gmt 0)

Hi Wilderness:

I understand very little of regex and nothing of htaccess, and I 'ran away' from your solution a few months ago.

What I *did do* was 'Deny All' to my affected web site, then Allow aaa.bbb.ccc.ddd to a selected few (about 20) people.

Now, this person(s) is using the 'baidu' method of attacking me. I've inserted a few lines of this below.

I've blocked *this* url but he keeps on coming, then starts again on a completely different IP address -- which I have to 'Deny' in addition because his line doesn't specify my *actual* web site.

Of course I have no file on my site that matches his attack URL (?)


This goes on day after day :(

I'm still frightened of using htaccess in the httpd.conf file.

But, thanks for your help anyway.

-Mel Smith
Mesa, Arizona

*************************
54.242.122.129 - - [13/Dec/2012:11:44:02 -0700] "GET / HTTP/1.1" 403 202 "http://www.baidu.com/s?wd=jasper%2 etc.

54.242.122.129 - - [13/Dec/2012:11:44:08 -0700] "GET / HTTP/1.1" 403 202 "http://www.baidu.com/s?wd=jasper%2 etc.

54.242.122.129 - - [13/Dec/2012:11:44:14 -0700] "GET / HTTP/1.1" 403 202 "http://www.baidu.com/s?wd=jasper%2 etc

54.242.122.129 - - [13/Dec/2012:11:44:20 -0700] "GET / HTTP/1.1" 403 202 "http://www.baidu.com/s?wd=jasper%2 etc

... and the last of the example lines in full:

54.242.122.129 - - [13/Dec/2012:11:44:29 -0700] "GET / HTTP/1.1" 403 202 "http://www.baidu.com/s?wd=jasper%20golf%20hotel&pn=60&ie=utf-8&usm=1&rsv_page=1" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; HPNTDF)"

mel the snowbird




msg:4527917
 8:57 pm on Dec 13, 2012 (gmt 0)

Hi Wilderness:

I meant the Rewrite Rule stuff not htaccess :(

Oh well ...

-Mel

lucy24




msg:4527949
 10:24 pm on Dec 13, 2012 (gmt 0)

If you're starting with Deny from all, with Allow listed individually, how is anyone getting in without permission?

afaik, you could block the entire 54 sector without harm. Unless you're selling pharmaceuticals, maybe.

wilderness




msg:4527967
 12:11 am on Dec 14, 2012 (gmt 0)

I'm still frightened of using htaccess in the httpd.conf file.


The procedures within a httpd.conf file are different than for a standard htaccess file.
Others will need to advise you of those differences, as I know nothing of httpd.conf.

A solution is to use the SetEnvIfNoCase Referer (mod_setenvif; Apache)

In the log example you provided I see four potential words:
1) baidu
2) jasper
3) golf
4) hotel

You'll also need to pick an environment variable name, add it onto the end of the SetEnvIfNoCase Referer line, and then refer to it with env= in your Deny section.

FWIW, if your site has a vulnerability that allows access around your configuration via baidu referrals, then the same vulnerability exists for other search engines as well.
Thus you need to determine a solution that works effectively.

lucy24




msg:4527983
 1:40 am on Dec 14, 2012 (gmt 0)

I'm still frightened of using htaccess in the httpd.conf file

Good, because it can't be done :) If you have your own server, you don't need htaccess at all. Anything that applies to the whole site goes in the config file. Anything that applies to specific directories also goes in the config file, but this time inside <Directory> envelopes. You must have a basic idea how this works, or you would never have got the server running in the first place.

Possibly when you say "htaccess" you really mean access restrictions, via whatever mod or combination of mods you currently use. The Allow and Deny directives are probably mod_authz_something-or-other. (If they're mod_access, you urgently need to upgrade your server.) This can be combined with mod_setenvif, which is what wilderness was referring to.

If you're afraid of mod_rewrite, mod_setenvif variables may be less intimidating. They often work in conjunction with your access module-- whichever one it happens to be-- so you can say things like

BrowserMatch Baidu go_away
BrowserMatch IckyNastyRobot go_away

and then add a line to your Deny section that says

Deny from env=go_away

... Except that something is still a little fishy, because if you've got a "Deny from all" in place, with only specific overrides, then you shouldn't need to do any of this.
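
For the Referer case shown in the logs earlier in this thread, the same idea might look roughly like the sketch below. It assumes Apache 2.2 with mod_setenvif and mod_authz_host loaded. The variable name go_away and the directory path are placeholders, not taken from anyone's real config. BrowserMatch tests the User-Agent header, so a Referer test needs SetEnvIf or SetEnvIfNoCase instead.

```apache
# Sketch only, Apache 2.2 syntax. "go_away" is an arbitrary variable name.
# Flag requests whose Referer header contains "example" (case-insensitive).
SetEnvIfNoCase Referer example go_away

# Placeholder path; substitute the directory being protected.
<Directory "C:/Apache/htdocs/mysite">
    Order Allow,Deny
    Allow from all
    # Under Order Allow,Deny, a matching Deny wins over the Allow.
    Deny from env=go_away
</Directory>
```

The name after the pattern on the SetEnvIfNoCase line and the name after env= must be spelled identically, or the Deny line never fires.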

mel the snowbird




msg:4527985
 1:57 am on Dec 14, 2012 (gmt 0)

Hi Lucy:

I have *four* active web sites. Each web site is 'operated' by a virtual host, and has *different* Deny/Allow commands.

My DoS attacker *used to* attack just one of the web sites incessantly by starting 50 or more massive downloads simultaneously trying to bring my server to its knees.

After I started the Deny All (then Allow 'some') to thwart him/her, he was foiled -- until recently:

Then he started this 'baidu-attack', and for the last week, I have been looking often at my log, and notice that he has at his disposal a whole bunch of IP addresses he uses to attack me. But in all the recent cases, I note the www.baidu.com stuff in my access log.

So, I look at the access.log and see only the *size* of the download, and see that somehow Apache has selected a *different* one of my sites to 'serve' a page from. So, every few seconds, I get an attack to this 'other' web site. But first I had to identify *which* of my sites is being attacked. I have done this now by 'contextual' examination of the log (i.e., noted the golf-related stuff in the log).

Since, I have Allow All in this newly attacked site, then I have to Deny From aaa.bbb.ccc.ddd *again* for each of his attempts.

So, my question then becomes: How do I Deny www.baidu.com -- which is apparently a search engine based in 'downtown China' ?

I'll look again at the responses you and wilderness have given, and hope I can understand and try one of them

It would be nice to have a command like:

Deny from www.baidu.com

But I'll look at your suggested solutions, and if I feel comfortable that I understand it, I'll (tentatively) implement it.

Thanks again.

-Mel
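
A note on the wished-for command: Apache 2.2 does accept a host name after Deny from, but per the mod_authz_host documentation that name is compared against the *client's* own host name (found by a reverse DNS lookup on the connecting IP), not against the Referer header. A minimal sketch, using example.com as the forum charter asks and a placeholder directory path:

```apache
# Placeholder path; substitute the directory being protected.
<Directory "C:/Apache/htdocs/mysite">
    Order Allow,Deny
    Allow from all
    # Blocks clients whose own reverse-DNS host name is example.com or
    # ends in .example.com (for instance, that company's own crawlers).
    # A visitor from elsewhere who merely sends an example.com Referer
    # is NOT affected by this line.
    Deny from example.com
</Directory>
```

So this form would not stop the requests shown in the log, which come from unrelated IPs and only carry the search engine's URL as Referer. Acting on the Referer needs the mod_setenvif approach already suggested in this thread.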

wilderness




msg:4527987
 1:59 am on Dec 14, 2012 (gmt 0)

For the record, all the examples you provided were 403's and denied access.

If you are confused by a 403 and wish NOT to view the 403s, then refer to my duplicate explanation in the thread previous to yours.

mel the snowbird




msg:4527995
 3:41 am on Dec 14, 2012 (gmt 0)

Hi wilderness:

Yes, the examples I posted above showed how I denied access *after* I got a whole series of attacks with successful 200 result code and many 'Gets' of a big page (approx 960KB).

Even tonight, he's hitting with another 30 or so attacks on the IPs which I have already denied.

I assume he's doing this programmatically, and hasn't yet realized that I have denied that set of IPs.

When he *does* realize that, he starts up a whole *new* set of IPs and keeps pounding me with them until I Deny each and every one of them.

So, yes, the examples show me Denying them, but tomorrow, I'll have to look again and re-deny ...

btw, the page that I 'serve', and that he 'downloads' to his client browser, is (as I said above) more than 900k bytes. So, even though it's not an actual 'download', it does stress my Apache server.

My actual downloads in the previous site where I foiled him are all greater than 14 Megs.

Thanks,

-Mel

wilderness




msg:4527999
 4:18 am on Dec 14, 2012 (gmt 0)

As lucy said earlier, if you have to keep adding IPs when you have "Deny ALL" by default, then you have a misconfiguration.

wilderness




msg:4528002
 5:02 am on Dec 14, 2012 (gmt 0)

Since, I have Allow All in this newly attacked site, then I have to Deny From aaa.bbb.ccc.ddd *again* for each of his attempts.


Denying a specific Class D IP, rather than the host's entire IP range, is bad practice.

A solution is to use the SetEnvIfNoCase Referer (mod_setenvif; Apache)

In the log example you provided I see four potential words:
1) baidu
2) jasper
3) golf
4) hotel

You'll also need to pick an environment variable name, add it onto the end of the SetEnvIfNoCase Referer line, and then refer to it with env= in your Deny section.

mel the snowbird




msg:4528104
 3:43 pm on Dec 14, 2012 (gmt 0)

Hi Wilderness:

Here is my Virtual Host directive for the site that is now being attacked.

What I would like is if you placed some code inside this VH that would stop *any* access from www.baidu.com.

I gave a small sample of the attacks that I undergo. However, the attacker uses different arguments in his attack. The only consistent one is that URL.

Also I note that many of the IPs he uses are part of the Amazon set of IPs. This confuses me. How does he *do* this ?

Anyway, here is the VH in question:

#****************************
<VirtualHost *:4297>
ServerName ww2.frostdelay.com:4297
ServerAlias frostdelay.com:4297
DocumentRoot "C:/Apache/cgi-bin/fdy"
<Directory "C:/Apache/cgi-bin/fdy">
Options ExecCGI Indexes FollowSymLinks
AllowOverride None
Order allow,deny
Allow from all
Deny from 74.93.230.92 91. 46.17. 77.234. 1.202. 31.214. 110.75. 109.230. 117.22. 193.151. 208.80.194.
Deny from 107.20. 107.22. 175.180. 175.181. 175.182.
Deny from 119.63. 123.125. 204.236. 204.56. 23. 50. 54. 67.202 184.73. 184.72. 107.21. 220.181.
Deny from 203.184.
AddOutputFilterByType DEFLATE text/html text/plain text/xml
AddOutputFilterByType DEFLATE text/css text/javascript
AddOutputFilterByType DEFLATE application/x-javascript
</Directory>
<IfModule alias_module>
ScriptAlias /cgi-bin/fdy/ "C:/Apache/cgi-bin/fdy/"
AddHandler cgi-script .exe
</IfModule>

DirectoryIndex fdyinit.exe index.html
</Virtualhost>

#*************************************

Bewenched




msg:4528129
 4:48 pm on Dec 14, 2012 (gmt 0)

I feel your pain, Baidu rarely gives us qualified leads but sure likes to eat up content. I've started blocking their China ranges, but have left their Japan ranges open... so far.

wilderness




msg:4528173
 6:05 pm on Dec 14, 2012 (gmt 0)

What I would like is if you placed some code inside this VH that would stop *any* access from www.baidu.com.


That's not how this forum works.

It's against forum charter to use domain names in syntax (please use example.com)

Search the Webmaster World archive (near top of every page) for SetEnvIfNoCase Referer

mel the snowbird




msg:4528234
 8:36 pm on Dec 14, 2012 (gmt 0)

Hi Wilderness:

O.K., I implemented your suggestion as follows:

If this works, I should see no more satisfied page requests from my 'badguy'

Thanks for the help !

-Mel Smith
Mesa, Arizona

******************************
SetEnvIfNoCase Referer "^http://www\.\example\.com/" badguy
<Directory "C:/Apache/cgi-bin/yyy">
Options ExecCGI Indexes FollowSymLinks
AllowOverride None
Order allow,deny
Allow from all
# this next is from advice of wilderness from webmasterworld
Deny from env=badguy
</Directory>

*********************************

wilderness




msg:4528240
 8:46 pm on Dec 14, 2012 (gmt 0)

Order allow,deny
SetEnvIfNoCase Referer baidu badguy
Allow from all
# this next is from advice of wilderness from webmasterworld
Deny from env=badguy


FWIW, You need to learn to keep your "deny from" IP's in some organized fashion (else you'll never locate anything).
Your IPs in ascending order:
(Note there are two or three of these sets you may combine into one CIDR, e.g. 107.20. and 107.21.)
1.202. 107.20. 107.21. 107.22. 109.230. 110.75. 117.22. 119.63. 123.125.
175.180. 175.181. 175.182.
184.72. 184.73. 193.151.
203.184. 204.236. 204.56. 208.80.194. 220.181. 23.
31.214. 46.17. 50. 54. 67.202 74.93.230.92 77.234. 91.

mel the snowbird




msg:4528246
 9:01 pm on Dec 14, 2012 (gmt 0)

Hi Wilderness:

My 'organization' of nasty IPs is really as they occur. I know from their position where the latest badguy IPs are coming from, and I watch how/when they occur

Also, I don't know (yet) how to combine 107.20, and 107.21 into one CIDR ?

btw, you should know that I do my web site work as a public service, and do not want (or receive) any revenue from anyone for my work -- which consumes most of my day.

Thanks for the note and for pushing me to use the 'SetEnvIfNoCase' directive. I'm old and frightened and stubborn :)

-Mel Smith

wilderness




msg:4528247
 9:04 pm on Dec 14, 2012 (gmt 0)

Also, I don't know (yet) how to combine 107.20, and 107.21 into one CIDR ?


There are many FREE CIDR tools.
Google's your friend.

lucy24




msg:4528267
 11:16 pm on Dec 14, 2012 (gmt 0)

I don't know (yet) how to combine 107.20, and 107.21 into one CIDR

Start practicing. You'll internalize it very fast. 107.20.0.0/15 without even stopping to think ;)
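
To spell out the arithmetic: 20 is 00010100 in binary and 21 is 00010101, so the two second octets differ only in their last bit. The 8 bits of the first octet plus the 7 shared bits of the second give a 15-bit prefix, hence /15. Deny from in Apache 2.2 accepts CIDR notation, so a sketch of the combined lines (to sit in the existing <Directory> block, after Order and Allow) could be:

```apache
# 107.20.0.0/15 covers 107.20.0.0 through 107.21.255.255, replacing the
# two partial addresses "107.20." and "107.21." from the posted list.
Deny from 107.20.0.0/15

# "107.22." on its own is a /16. Widening to 107.20.0.0/14 would also
# take in 107.23., which is not in the posted list, so it stays separate.
Deny from 107.22.0.0/16
```

The same reasoning applies to other adjacent pairs in the list, such as 184.72. and 184.73., which share the prefix 184.72.0.0/15.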

wilderness




msg:4528269
 11:25 pm on Dec 14, 2012 (gmt 0)

You'll internalize it very fast.


Some may; however, I'm waiting for hell to freeze over before I grasp it ;)

mel the snowbird




msg:4530252
 8:09 pm on Dec 21, 2012 (gmt 0)

Hi Lucy24 and Wilderness:

I'm still undergoing the same attacks as noted in an earlier post. They have now intensified.

The IP is *always* from the range owned by Amazon, and the Referer is *always* from baidu.com

I understand that there was a set of serious attacks months ago that placed some big companies offline for awhile. Maybe *my* attacker learned from that and decided to practice on me ?

Anyway, I'll just have to keep Denying access to the nasties (probably one guy) and soldier on :((

btw, that 'SetEnvIfNoCase' directive is not working, because the attacker came thru yesterday with a new IP (again from Amazon) that I had not previously denied, and *it* got thru to my web site. This new attack was identical to all the others in Referer structure, and *should have* been denied by the env=badguy variable -- but was not ?

So, my directive:
Deny from env=badguy

was *not* effective

So, I placed a Deny from aaa.bbb.ccc.ddd in my httpd.conf file to account for this new attack IP

-Mel Smith

lucy24




msg:4530270
 8:48 pm on Dec 21, 2012 (gmt 0)

SetEnvIfNoCase Referer "^http://www\.\example\.com/" badguy

Is that a typo or do you really have a \e in there? In some RegEx dialects, this is a specific character, so be careful. In mod_setenvif I like to strip things down to a minimum; the complicated stuff goes in mod_rewrite. In the case of a referer, all you need is

SetEnvIfNoCase Referer example badguy

Otherwise you'd be letting in anyone who gave the referer as, for example,
https://example.com
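
A side-by-side sketch of the patterns under discussion, assuming the PCRE syntax that Apache 2.x uses for regular expressions. The variable name badguy comes from the posts above, and example stands in for the real domain as the charter requires. Only the last line is active; the first two are commented out for comparison.

```apache
# As posted. "\e" is not a literal "e": in PCRE it stands for the escape
# control character (0x1B), so an ordinary referer never matches.
#SetEnvIfNoCase Referer "^http://www\.\example\.com/" badguy

# Stray backslash removed. Matches only referers that begin with exactly
# "http://www.example.com/"; an https:// or bare example.com referer
# slips past.
#SetEnvIfNoCase Referer "^http://www\.example\.com/" badguy

# Unanchored substring match: any referer containing "example" is flagged.
SetEnvIfNoCase Referer example badguy
```

If the real config has the backslash in front of a different letter, the effect depends on that letter. In PCRE, \b for instance is a word-boundary assertion rather than a literal b, which would likewise keep the pattern from matching.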

wilderness




msg:4530285
 10:10 pm on Dec 21, 2012 (gmt 0)

I'm still undergoing the same attacks as noted in an earlier post. They have now intensified.


btw, that 'SetEnvIfNoCase' directive is not working, because the attacker came thru yesterday with a new IP (again from Amazon) that I had not previously denied, and *it* got thru to my web site.


54.242.122.129 - - [13/Dec/2012:11:44:02 -0700] "GET / HTTP/1.1" 403 202 "http://www.baidu.com/s?wd=jasper%2 etc.


Is it possible that you're confusing the recording of the access attempt in your raw logs with actual access?

In your previous examples, this portion shows that access was NOT allowed, rather denied:

HTTP/1.1" 403

Or is the so-called attacker using a different referring page and/or keyword than the one you have previously specified?
Or even a blank referer?

Did you make the last modification I previously provided?
Changing example to baidu?

The best solution for these types of issues is to provide an example of the new log line; however, you'll need to obscure your file/domain name and the referring domain name.
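
One hedged way to check whether the variable is being set at all is conditional logging. CustomLog in Apache 2.2 takes an optional env= clause, so requests flagged by SetEnvIfNoCase can be copied to their own file. The file name below is made up, and the combined format nickname is assumed to be defined by a LogFormat line, as it is in the stock httpd.conf.

```apache
# Inside the <VirtualHost> block, alongside any existing CustomLog.
SetEnvIfNoCase Referer example badguy

# Only requests for which "badguy" is set are written to this file.
# "logs/badguy.log" is a hypothetical path, relative to ServerRoot.
CustomLog "logs/badguy.log" combined env=badguy
```

If an attack line appears in access.log with a 200 but never shows up in this second file, the pattern did not match that request's Referer, and the regex is the place to look.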

mel the snowbird




msg:4530317
 11:40 pm on Dec 21, 2012 (gmt 0)

Hi Wilderness & Lucy:

Here is the actual 'log' access where I've obscured the badguy with 'example':

72.44.53.2 - - [20/Dec/2012:11:46:42 -0700] "GET / HTTP/1.1" 200 962933 "http://www.example.com/s?wd=resort%20in%20charlevoix&pn=100&ie=utf-8&usm=1&rsv_page=1" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; Sky Broadband; GTB6.6; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C; .NET4.0E)"

Altho he got thru to me with this attempt, you'll see below that I Denied him further access with a quick mod to my httpd.conf file below.

Here is my Virtual host definition where I've used 'mysite' instead of my actual site, and 'mydir' for the actual sub-directory of my CGI executable 'myscript.exe':

I'm not sure of the '^' hat character at the start of the regex string. Please take a look at the set up and the use of the badguy vrbl. I hope I've implemented it correctly.

Because I use a port number along with ww2. to access my site, I changed the Port Number to 9999

I note (with a *shock*) that I used a backslash immediately in front of the '\example.com' below. Why did I do this ?

Will look at it and correct it, unless I made a typo here :(


Thanks to y'all
-Mel Smith
Mesa, Arizona


********* fragment of httpd.conf file below********
<VirtualHost *:9999>
ServerName ww2.mysite.com:9999
ServerAlias mysite.com:9999
DocumentRoot "C:/Apache/cgi-bin/mydir"
SetEnvIfNoCase Referer "^http://www\.\example\.com" badguy
<Directory "C:/Apache/cgi-bin/mydir">
Options ExecCGI Indexes FollowSymLinks
AllowOverride None
Order allow,deny
Allow from all
Deny from 74.93.230.92 91. 46.17. 77.234. 1.202. 31.214. 110.75. 109.230. 117.22. 193.151. 208.80.194.
Deny from 107.20. 107.22. 175.180. 175.181. 175.182.
Deny from 119.63. 123.125. 204.236. 204.56. 23. 50. 54. 67.202 184.73. 184.72. 107.21. 220.181.
Deny from 203.184.
# this next is from advice of webmasterworld and others. Hope it works
Deny from env=badguy
# This next one (IP: 72.44.53.2) sneaked by the badguy env vbl above
Deny from 72.44.
# This next IP is from the Russian Yandex Bot (95.108.151.244) which troubles me:
Deny from 95.108.
AddOutputFilterByType DEFLATE text/html text/plain text/xml
AddOutputFilterByType DEFLATE text/css text/javascript
AddOutputFilterByType DEFLATE application/x-javascript
</Directory>
<IfModule alias_module>
ScriptAlias /cgi-bin/mydir/ "C:/Apache/cgi-bin/mydir/"
AddHandler cgi-script .exe
</IfModule>

DirectoryIndex myscript.exe index.html
</Virtualhost>

guyinoz




msg:4532095
 11:54 pm on Dec 30, 2012 (gmt 0)

I've started blocking their China ranges, but have left their Japan ranges open... so far.


I've seen some USA IPs being used by baidu in the last few months, maybe to get around sites that have blocked access from China. Either way it's a shocking crawler which doesn't obey robots.txt and keeps sucking huge amounts of bandwidth multiple times a day for no good reason.

mel the snowbird




msg:4532169
 3:47 pm on Dec 31, 2012 (gmt 0)

Hi guyinoz:
My attacker is not a robot or a crawler, it is simply some guy who attacks *me* in particular (after many months of subterfuge). For *now* he has ceased his attacks for the holidays. I'll wait and see what happens after the New Year.

-Mel Smith

mel the snowbird




msg:4532172
 4:06 pm on Dec 31, 2012 (gmt 0)

Hi Guy from Oz:

I spoke too soon :((

I just checked my Apache Log (access.log), and there he was again (my baidu devil):

78 successive 'Gets' starting at 02:17am my time in Arizona thru 02:42am. He used seven different IP addresses. My Apache httpd.conf denied them all.

FYI, here are the IP addresses in order of occurrence:

54.242.46.147
54.245.152.187
23.20.44.157
54.234.2.16
54.245.31.190
204.56.96.115
50.16.45.23

Oh Well ...

-Mel Smith
Mesa, Arizona

wilderness




msg:4532177
 4:18 pm on Dec 31, 2012 (gmt 0)

54.242.46.147
54.245.152.187
23.20.44.157
54.234.2.16
54.245.31.190
50.16.45.23


All of these are Amazon AWS ranges. See this thread [webmasterworld.com].

204.56.96.115


This one is a spammer.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved