On Fri, 8 Jun 2001, Roberto Nibali wrote:
> LVS can use up all memory but I would be very surprised if it would crash
> the kernel.
Time to go test it. I'll try with 16M of memory and run Julian's test_lvs
against it. I won't be able to do it for a few days though.
> users != connections
>
> The persistency timeout defines the amount of time a template may reside in
> the masquerading table for lookups. The higher the timeout the longer we have
> to keep the incoming connections in the table the more will be in the table
> until the first templates expire.
Agreed, but the number of connections we can handle (for a fixed memory
size) is the same whether they have a timeout of 60 secs (TIME_WAIT) or
600 secs for persistent connections. In the latter case we can only accept
new clients at 1/10th the rate, that's all.
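
A rough sketch of that argument, assuming a table that holds a fixed number
of entries and that every entry lives for exactly one timeout period. The
capacity figure and the function name are only illustrative, not anything
taken from the LVS code:

```python
def max_sustained_rate(table_entries, timeout_secs):
    """New connections/sec the table can absorb forever without overflowing,
    assuming each entry occupies a slot for exactly timeout_secs."""
    return table_entries / timeout_secs

TABLE_ENTRIES = 1_000_000  # assumed capacity, for illustration only

rate_time_wait = max_sustained_rate(TABLE_ENTRIES, 60)    # TIME_WAIT entries
rate_persistent = max_sustained_rate(TABLE_ENTRIES, 600)  # persistence templates

print(f"60 sec timeout : {rate_time_wait:8.0f} new conns/sec")
print(f"600 sec timeout: {rate_persistent:8.0f} new conns/sec")
print(f"ratio          : {rate_persistent / rate_time_wait:.2f}")
```

The table holds the same number of entries either way; only the rate at
which new entries can be admitted changes.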
> > what's the security problem?
>
> SYN/RST flood.
OK
> My patch will set the weight of the realserver to 0 in case the
> upper threshold is reached. But I do not test if the requesting traffic is
> malicious or not, so in case of SYN-flood it may be 99% of the packets causing
> the server to be taken out of service. In the end we have set all server to
> weight 0 and the load balancer is non-functional either. But you don't have
> the memory problem :)
It's non-functional, but it will recover when the flood stops. You won't
have people trying to figure out why the director crashed.
Some back of the envelope calculations to see how much of a problem we
have. Memory is cheap, let's say we have 128M spare for the hash table
(say the machine has 192M memory - it's reasonable to assume that in a
dedicated director that user space processes are going to be much less
than 64M).
Let's say we're on a 100Mbps link. Packets (SYN flood or valid) can arrive
at 8k packets/sec (let's say 10k/sec for round numbers). With a 128M hash
table (== 10^6 conns), it will take 100 secs to fill the hash table. In
this case taking 1 sec snapshots seems to be often enough to track the
memory usage. If the timeout for TIME_WAIT is <100 secs, we can survive
the flood without any special action (e.g. the DoS defense strategies). If
we have TIME_WAIT=60 secs, we can't fill the 128M with conn entries, no
matter what happens. If we have persistence set to 600 secs, we'll be OOM
in 100 secs.
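
The same arithmetic as a small script, in case anyone wants to plug in
their own figures. It is only a sketch: the 128 bytes per entry is an
assumption implied by 128M == 10^6 conns, not a measured size, and the
function names are made up for this example.

```python
MIB = 1024 * 1024

def table_capacity(mem_bytes, entry_bytes):
    """Number of connection entries that fit in the given memory."""
    return mem_bytes // entry_bytes

def seconds_to_fill(capacity, pkts_per_sec):
    """Time to fill an empty table if every packet creates a new entry."""
    return capacity / pkts_per_sec

def peak_entries(pkts_per_sec, timeout_secs):
    """Steady-state table size: arrivals that have not yet timed out."""
    return pkts_per_sec * timeout_secs

capacity = table_capacity(128 * MIB, 128)  # ASSUMED 128 bytes/entry, ~10^6 conns
rate = 10_000                              # packets/sec, rounded up from 8k
fill = seconds_to_fill(capacity, rate)     # roughly 100 secs

for timeout in (60, 600):                  # TIME_WAIT vs. persistence
    peak = peak_entries(rate, timeout)
    if peak <= capacity:
        verdict = "table never fills"
    else:
        verdict = f"OOM after about {fill:.0f} secs"
    print(f"timeout={timeout:3d} secs  peak={peak:>9,d} entries  {verdict}")
```

With these inputs the 60 sec case peaks at 600,000 entries, under the
roughly 10^6 that fit, while the 600 sec case would need 6,000,000.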
Do these numbers look OK?
Joe
--
Joseph Mack mack@xxxxxxxxxxx