Re: SYN floods and LVS-NAT CPU Load

To: Fabrice <fabrice@xxxxxxxxxx>
Subject: Re: SYN floods and LVS-NAT CPU Load
Cc: <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
From: Julian Anastasov <ja@xxxxxx>
Date: Tue, 11 Dec 2001 22:09:42 +0000 (GMT)

On Tue, 11 Dec 2001, Fabrice wrote:

> Hello,
> I found why my PIII 500 client couldn't send more than 2000 SYN/s :)
> It was because that client is also providing Internet connection to my
> LVS box (gateway) and used NAT for that. Some modules slowed the
> output rate down (one of the connection tracking modules). I removed
> them and finally reached 60K SYN/s. With a mean of about 54K.
> That time the load on the LVS-NAT box was a lot higher (always 100%
> system usage, and a swap between the ttys takes about 3-5 seconds).
> That poor box couldn't handle the load and wasn't able to send back
> packets (maybe only 10 per second). This means that the DoS was
> successful, but it only works during the flood; it won't break any
> services (thanks to Syn_Cookies).

        To measure the maximal possible rate, use -srcnum 10 or
another small number so that you do not thrash the routing cache in
the director. To test the defense strategies, you need a large value
for -srcnum. The default is too small for that; it is kept low to
avoid errors.

> I think the only way to prevent the DoS in this case is to upgrade the
> LVS box hardware :)

        Not always. LVS does not protect the real servers. During a
DoS attack the output pipe can end up saturated by the replies.
You should try some ingress rate limiting, independent of LVS.
Of course, your hardware should not be brought to a halt by such
attacks; you need a faster MB+CPU if you care about such problems.
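
As an illustration of what ingress rate limiting does, here is a minimal
token-bucket sketch. It is not a kernel or LVS configuration; the names
TokenBucket and simulate and the rate and burst values are assumptions
made up for this example. It only shows how a policer admits packets up
to a configured rate and drops the rest of a flood.

```python
class TokenBucket:
    """Admit at most `rate` packets per second, with bursts up to `burst`."""

    def __init__(self, rate, burst):
        self.rate = float(rate)      # tokens added per second
        self.burst = float(burst)    # bucket capacity
        self.tokens = float(burst)   # start with a full bucket
        self.last = None             # timestamp of the previous packet

    def allow(self, now):
        """Return True if a packet arriving at time `now` is admitted."""
        if self.last is not None:
            elapsed = now - self.last
            # Refill in proportion to elapsed time, never above capacity.
            self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def simulate(offered_pps, seconds, rate, burst):
    """Count admitted packets for an evenly spaced stream (hypothetical load)."""
    bucket = TokenBucket(rate, burst)
    total = offered_pps * seconds
    return sum(bucket.allow(i / offered_pps) for i in range(total))


# Hypothetical limits: 1000 packets/s with a burst of 100.
flood_admitted = simulate(54000, 1, rate=1000, burst=100)
normal_admitted = simulate(500, 1, rate=1000, burst=100)
print(flood_admitted, normal_admitted)
```

Under the flood only roughly the configured rate plus the initial burst
gets through, while traffic below the limit passes untouched.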

> I looked with the vmstat 1 and 10, as Julian recommended.
> Shouldn't the values of the number of interrupts with "vmstat 10" be
> 10 times more than "vmstat 1"'s?

        No, they should be equal: vmstat reports per-second averages
over each interval. Differences of up to 5% are fine; they show
that process scheduling really works. If you are under attack
and cannot handle it, the snapshots from vmstat 1 are delayed
and their results differ too much from those taken over a
longer interval.
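
The comparison can be sketched as a small check. The function names are
made up for this example, and measuring the gap relative to the larger
of the two rates is an assumption; the mail only says that up to 5% is
fine.

```python
def relative_difference(rate_short, rate_long):
    """Gap between two per-second rates, as a fraction of the larger one."""
    larger = max(rate_short, rate_long)
    if larger == 0:
        return 0.0
    return abs(rate_short - rate_long) / larger


def samples_agree(rate_short, rate_long, tolerance=0.05):
    """True if the vmstat 1 and vmstat 10 rates are within the tolerance."""
    return relative_difference(rate_short, rate_long) <= tolerance


# Interrupt rates reported in this thread: ca. 400'000 with vmstat 1
# and ca. 60'000 with vmstat 10.
print(samples_agree(400_000, 60_000))
```

A large gap like this one suggests that the short-interval snapshots were
delayed, so the longer interval is the more trustworthy figure.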

> I got with vmstat 1: interrupts = ca. 400'000,  cpu sys = 100
> and with vmstat 10: interrupts = ca. 60'000, cpu sys = 100

        Your director has reached its limits. You should try to flood
it with slower client(s). Keep slowing the attack down until the input
packet rate equals the rate of successfully forwarded packets (those
received on the real server). At that point you have reached the
maximal packet rate that can be delivered to the real servers. With
NAT you should also consider the replies, which are not measured by
the testlvs tests. They may need about the same CPU power.
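
The search for that rate can be sketched as follows. The function name,
the 1% loss tolerance and all the measurement figures are hypothetical;
in practice the pairs would come from the sender's packet rate and from
the count of packets seen on the real server.

```python
def max_deliverable_rate(measurements, tolerance=0.01):
    """Return the highest sent rate whose loss stays within the tolerance.

    measurements: iterable of (sent_pps, received_pps) pairs, one per test run.
    """
    best = 0
    for sent, received in measurements:
        if sent <= 0:
            continue
        loss = (sent - received) / sent
        if loss <= tolerance and sent > best:
            best = sent
    return best


# Hypothetical test runs, from the fastest attack down to slower ones.
runs = [
    (60000, 10),      # director overloaded, almost nothing forwarded
    (30000, 12000),   # still losing most packets
    (20000, 19900),   # forwarded rate matches the input within 1%
    (15000, 15000),   # no loss, but below the limit
]
print(max_deliverable_rate(runs))
```

As noted above, for NAT this figure covers only the request direction;
the replies add their own load on the director.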

> Regards,
> Fabrice Bucher


Julian Anastasov <ja@xxxxxx>
