To: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: DoS protection strategies
From: Roberto Nibali <ratz@xxxxxxxxxxxx>
Date: Wed, 19 Apr 2006 10:42:09 +0200
Hello,

> The mod_python and mod_php applications currently under my care are at
> 38-44MB resident on 64-bit.  On a minimal 64-bit box, I'm seeing 6MB
> resident.

These seem to be my findings as well (contrary to what I stated
earlier), after logging into various high-volume web servers of our
customers. In fact, I quickly set up an apache2 instance with a few
modules, and this is the result:

vmware-test:~# chroot /var/apache2 /apache2/sbin/apachectl -l
Compiled in modules:
  core.c
  mod_deflate.c
  mod_ssl.c
  prefork.c
  http_core.c
  mod_so.c
vmware-test:~# grep ^LoadModule /var/apache2/etc/apache2/apache2.conf
LoadModule php5_module modules/libphp5.so
LoadModule access_module modules/mod_access.so
LoadModule dir_module modules/mod_dir.so
LoadModule fastcgi_module modules/mod_fastcgi.so
LoadModule log_config_module modules/mod_log_config.so
LoadModule mime_module modules/mod_mime.so
LoadModule perl_module modules/mod_perl.so
LoadModule rewrite_module modules/mod_rewrite.so
LoadModule setenvif_module modules/mod_setenvif.so
vmware-test:~# ps -U wwwrun -u wwwrun -o pid,user,args,rss,size,vsz
  PID USER     COMMAND                       RSS    SZ    VSZ
 8761 wwwrun   /apache2/sbin/fcgi-pm -f /e 11660  2964  17144
 8762 wwwrun   /apache2/sbin/apache2 -f /e 11764  3096  17332
 8763 wwwrun   /apache2/sbin/apache2 -f /e 11760  3096  17332
 8764 wwwrun   /apache2/sbin/apache2 -f /e 11760  3096  17332
 8765 wwwrun   /apache2/sbin/apache2 -f /e 11760  3096  17332
 8766 wwwrun   /apache2/sbin/apache2 -f /e 11760  3096  17332

If I disable everything non-essential except mod_php, I get the following:

vmware-test:~# ps -U wwwrun -u wwwrun -o pid,user,args,rss,size,vsz
  PID USER     COMMAND                       RSS    SZ    VSZ
 9088 wwwrun   /apache2/sbin/fcgi-pm -f /e  9768  2304  15004
 9089 wwwrun   /apache2/sbin/apache2 -f /e  9856  2304  15060
 9090 wwwrun   /apache2/sbin/apache2 -f /e  9852  2304  15060
 9091 wwwrun   /apache2/sbin/apache2 -f /e  9852  2304  15060
 9092 wwwrun   /apache2/sbin/apache2 -f /e  9852  2304  15060
 9093 wwwrun   /apache2/sbin/apache2 -f /e  9852  2304  15060

A bare apache2 which only serves static content (not stripped or
optimized) yields:

vmware-test:~# ps -U wwwrun -u wwwrun -o pid,user,args,rss,size,vsz
  PID USER     COMMAND                       RSS    SZ    VSZ
 9191 wwwrun   /apache2/sbin/apache2 -f /e  2588  1364   5376
 9192 wwwrun   /apache2/sbin/apache2 -f /e  2584  1364   5376
 9193 wwwrun   /apache2/sbin/apache2 -f /e  2584  1364   5376
 9194 wwwrun   /apache2/sbin/apache2 -f /e  2584  1364   5376
 9195 wwwrun   /apache2/sbin/apache2 -f /e  2584  1364   5376

However, COW page duplication does not occur for carefully designed
application logic with shared data, so normally even 40MB RSS does not
hurt you. Stripping both Perl and Python down to a minimal set of
functionality helps further.

>  I've honestly never seen an application, either CGI- or
> mod-based, use less than 2MB on 32-bit including the CGI, and most in
> the 15-45MB range.

I checked various customers' CMS installations based on CGIs and they
range between 1.8MB and 11MB RSS. Again, this does not hurt as long as
the threaded model is enabled. However, one has to be careful with the
thread pool settings of the application handler (Perl, Python, ...)
within Apache's threading model, or else resource starvation or locking
issues bite you in the butt. For Perl I believe the thread-related
settings are:

   PerlInterpStart    <ThreadLimit/4>
   PerlInterpMax      <ThreadLimit/3*2>
   PerlInterpMaxSpare <ThreadLimit/2>
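
For illustration only, a minimal sketch assuming the worker MPM with a
ThreadLimit of 64; the absolute numbers are placeholders, not a
recommendation:

   <IfModule worker.c>
       ThreadLimit      64
       ThreadsPerChild  64
   </IfModule>
   # ThreadLimit/4, ThreadLimit/3*2 and ThreadLimit/2 respectively:
   PerlInterpStart      16
   PerlInterpMax        42
   PerlInterpMaxSpare   32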

These settings, however, interfere heavily with the underlying Apache
threading model. If you only have LWPs (pre-2.6 kernel era), they are
better left unused, or you get COW behaviour in the Perl thread pool.
For NPTL-based setups this yields much reduced memory pressure. I cannot
post customer data for obvious reasons.

>  As you say, I think the application is a huge
> variable, but therein lies the weakness of the process model.

I believe three simple design techniques help reduce this weakness:

1. Design your web service with shared resources in mind
2. Use caches and RAM disks for your storage
3. Optimise your OS (most people overlook this)

The last point sounds trivial, but I've seen people running web servers
on SuSE or Red Hat with a preemptive kernel, NAPI, runlevel 5 with KDE
or GNOME, and d/i-notify and power management switched on!

Basic debugging with valgrind, vmstat and slabtop could have shown them
immediately why there was I/O starvation, memory pressure and heavy
network latency.
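
To give an idea of the kind of quick checks I mean (chkconfig being the
SuSE/Red Hat way of listing boot-time services; nothing fancy):

   vmstat 1 5                      # run queue, swapping, I/O wait
   slabtop -o | head -20           # which kernel slab caches eat your RAM
   chkconfig --list | grep ':on'   # services enabled that nobody asked for
   runlevel                        # 5 usually means X is running on a server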

>> Or you set your timeouts correctly, or you implement proper state
>> mapping using a reverse proxy and a cookie engine.
> 
> Timeouts certainly help, but that's somewhat akin to saying that if you
> set your synrecv timeout low enough, the DoS won't hurt you. :)

I didn't actually mean TCP timeouts, but rather KeepAlive timeouts, for
example.
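
In Apache terms something along these lines; the values are placeholders
and depend entirely on your content and client mix:

   KeepAlive            On
   MaxKeepAliveRequests 100
   # keep this short so idle clients don't pin a process or thread for long
   KeepAliveTimeout     2
   Timeout              30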

> KeepAlives by their nature will increase the simultanous connection
> count, but I apologize if I came across as advocating turning them off
> as a knee-jerk fix to connection-count problems.

Generally yes and no; and no, you didn't come across that way.

> Whether they're beneficial or not (for scalability reasons) depends on
> whether you bottleneck on CPU or RAM first, and whether you're willing
> to scale wider to keep the behavior change due to keepalives.

I don't buy the CPU bottleneck for web service applications. Yes, I have
seen 36-CPU nodes brought to their knees by simply invoking a Servlet
directly, but after fixing that application and moving to a multi-tier
load-balanced environment, things smoothed out quite a bit. My
experience is that CPU constraints for web services are the result of
bad software engineering. Excellent technology and frameworks are
available; people sometimes just don't know how to use them ;).

For RAM, I'd have to agree that this is always a weak spot in the
system, but I reckon that any sane IT budget for implementing and
mapping your business onto an e-business web service already allows for
sufficiently high hardware expenses, including enough (fast and
reliable) RAM.

>>> Netfilter has various matching modules that can limit connections
>>> from and/or to specific IPs, for example:
>>> iptables --table filter --append VIP_ACCEPT --match dstlimit
>>> --dstlimit 666 --dstlimit-mode srcipdstip-dstport --dstlimit-name
>>> VIP_LIMIT --jump ACCEPT
>>
>> No wonder you have no memory left on your box :).
> 
> :)  The rule was just an example, and 666 is my numeric "foo".

Fair enough :).

Joking aside, one should be careful when giving advice about installing
iptables/netfilter on high-volume networked machines. At the very least
make sure you do not load the ip_conntrack module, or you will run out
of memory in no time. I've seen badly configured web servers with the
ip_conntrack module loaded (collecting every incoming connection tuple
and keeping it with a timeout of a couple of hours) run out of memory
within hours. Before this was fixed, the customer rebooted his box three
times a day from a cronjob ... go figure.
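
If in doubt, a quick check looks roughly like this (paths as I remember
them from 2.4-ish kernels; the rmmod only works if no rule still
references connection tracking or NAT):

   lsmod | grep ip_conntrack                 # is it loaded at all?
   wc -l /proc/net/ip_conntrack              # how many tuples it currently tracks
   cat /proc/sys/net/ipv4/ip_conntrack_max   # the table limit
   rmmod iptable_nat ip_conntrack            # get rid of it if nothing needs it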

>> I would not call dropping a certain amount of illegitimate _and_
>> legitimate connections to be too useful when you're running on a
>> strict SLA. QoS approaches based on window scaling help a bit more.
> 
> Agreed, I was just mentioning them, not advocating them.  drop_packet
> and secure_tcp, set to 1, seem decent choices.  If LVS is out of RAM, I
> think your SLA is doomed, only to be perhaps aided by these settings. 
> Having them on all the time is indeed Bad.

Honestly, in my 8+ years of LVS development and usage, I have never seen
an LVS box run out of memory. I'd very much like to see such a site :).
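
For completeness, and only as an illustration: the defence strategies
mentioned above are toggled via proc entries under
/proc/sys/net/ipv4/vs/ (a value of 1 means "arm automatically once
amemthresh is crossed", 3 means "always on"):

   cat /proc/sys/net/ipv4/vs/amemthresh       # free-memory threshold
   echo 1 > /proc/sys/net/ipv4/vs/drop_packet
   echo 1 > /proc/sys/net/ipv4/vs/secure_tcp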

>> Regarding throttling, I reckon you use the TC framework available in
>> the Linux kernel since long before netfilter hit the tables.
> 
> Good point, I had forgotten about TC, though I'm not sure it can
> throttle *connections* vs *throughput*.

With the (not very well documented) action classifier and the u32
filter, plus a few classes, you should get there.
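
Only as a rough sketch from memory, with 192.0.2.10 standing in for the
VIP: policing inbound traffic with an ingress qdisc and a u32 filter
looks roughly like this. Throttling the number of connections rather
than the bandwidth would additionally require matching on SYN segments
or using the action framework, which I won't reproduce from memory here:

   tc qdisc add dev eth0 ingress
   tc filter add dev eth0 parent ffff: protocol ip prio 1 u32 \
       match ip dst 192.0.2.10/32 match ip dport 80 0xffff \
       police rate 1mbit burst 10k drop flowid :1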

>>> , plus a more closely integrated and maintained health checking system.
>>
>> How would you improve it? Suggestions are always welcome.
> 
> I had to completely rewrite the LVS alert module for mon, in addition to
> tweaking several of the other mon modules.  Now, this was on a
> so-last-year 2.4 distro -- I haven't worked with LVS under 2.6 yet or
> more modern mon installs. I also wrote a simple CLI interface wrapper
> to ipvsadm, since editing the ipvsadm rules file isn't terribly
> operator-friendly and prone to error for simple host/service disables.

Hmm, what does this CLI interface wrapper look like? And what is an
ipvsadm rules file, and what does it look like? Did you try the
available alternatives like keepalived or ldirectord? Maybe you did, but
I couldn't tell from the passage above.

> I think all the parts are there for a Unix admin to complete an
> install.  But for a health-checking, stateful-failover,
> user-friendly-interface setup, it's pretty piecemeal.

Agreed, effort could be put into simplifying the user-space part. Since
there are already many solutions to this issue, I wonder where best to
begin?

>  And there's no L7
> to my knowledge. 

ktcpvs is a start, though I believe it has not been tested much in the
wild. OTOH, putting my load-balancer consultancy hat on, I rarely see L7
load-balancing needs, except maybe cookie-based load balancing. I would
very much like to see a simple, working and proper VRRP implementation
or integration in LVS. That is what gives hardware load balancers their
USPs.

> There are some commercial alternatives (that will
> appeal to some admins for these reasons) that are likely inferior
> overall to LVS.  I think the work lies most in integration, both of the
> documentation and testing, and perhaps patch integration.

100% agreed!

> Just my soapbox -- I'm using LVS under heavy load right now so I'm
> certainly not complaining.

Never thought so :).

>>> And source-routing support. ;)
>>
>> Which you did :).
> 
> Eh, not really -- I just accidentally reiterated the same problem and
> same patch that someone posted a year or two ago. :)

Solved is solved, that's what counts.

> The primary parts of the systems I alluded to are the attack
> fingerprints and flood detection that intelligently blocks bad traffic,
> not good traffic.  Nothing is 100%, but in terms of intelligent,
> low-false-positive malicious request / flood blocking, they do extremely
> well at blocking the bad stuff and passing the good stuff.

Absolutely. We spend a considerable amount of time doing consulting work
in banking and government environments (after all, what else is there in
Switzerland :)), and there is a tendency towards zero tolerance for
false positives in blocking. Trying to explain why they happen
nevertheless is sort of difficult at times.

> Is it worth the bank that they charge, or the added points of failure? 
> Depends on how big your company is I suppose.  But I know of no direct
> OSS alternative -- or I'd use it!

Fair enough.

Best regards and thanks for the interesting conversation,
Roberto Nibali, ratz
-- 
-------------------------------------------------------------
addr://Kasinostrasse 30, CH-5001 Aarau tel://++41 62 823 9355
http://www.terreactive.com             fax://++41 62 823 9356
-------------------------------------------------------------
10 years of competence in IT security.            1996 - 2006
We secure your success.                        terreActive AG
-------------------------------------------------------------
