Re: [lvs-users] 2.3.36 performance

To: "Howard M. Kash (Civ, ARL/CISD)" <howard.kash@xxxxxxxxxxx>
Subject: Re: [lvs-users] 2.3.36 performance
Cc: lvs-users@xxxxxxxxxxxxxxxxxxxxxx, Julian Anastasov <ja@xxxxxx>
From: Simon Horman <horms@xxxxxxxxxxxx>
Date: Sat, 30 Oct 2010 11:32:12 +0900
On Fri, Oct 29, 2010 at 02:57:57PM -0400, Howard M. Kash (Civ, ARL/CISD) wrote:
> After upgrading from (with OPS patch) to 2.6.36, ksoftirqd
> process (eight of them) went from using 5-15% CPU each to using 15-40%
> CPU each.  Load average went from around 0.6 to around 2.  The server is
> round-robin load balancing about 19,000 UDP and 90 TCP DNS connection per
> second.  UDP uses OPS.  Broadcom NIC cards are using MSI.  With MSI
> disabled, load average is 0.11 and only one or two ksoftirqd process use
> <5% CPU.
> Could the nf_conntrack changes have caused this?  There were also many
> MSI and bnx2 updates in 2.6.36, so not sure if it's LVS or not.

Hi Howard,

Yes, it is very likely that the problem you are seeing
is a regression caused by the introduction of full-NAT.

There is a fix for this, which will be included in 2.6.37-rc1
but unfortunately it was too invasive to include in 2.6.36 as
the problem was noticed fairly late in the release cycle.

As I understand it, the fix was made by the three patches
listed below.

These patches appear to apply cleanly on top of 2.6.36.
The v2.6.36-nfct branch of
is 2.6.36 plus these three patches.

I believe that even with these patches applied, you still need to set
/proc/sys/net/ipv4/vs/snat_reroute to 0 to avoid the performance penalty.
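
A minimal sketch of that setting, assuming a kernel that carries the
snat_reroute patch and a process running as root. The helper name
write_ipvs_sysctl and its base_dir parameter are made up for illustration;
only the /proc/sys/net/ipv4/vs/snat_reroute path comes from the text above.

```python
import os

# Directory holding the IPVS sysctls, as named in the message above.
IPVS_SYSCTL_DIR = "/proc/sys/net/ipv4/vs"


def write_ipvs_sysctl(name, value, base_dir=IPVS_SYSCTL_DIR):
    """Write an integer to an IPVS sysctl file and return the value read back.

    base_dir is a hypothetical parameter so the helper can be pointed at a
    scratch directory instead of the live /proc tree.
    """
    path = os.path.join(base_dir, name)
    with open(path, "w") as f:
        f.write(f"{value}\n")
    # Read the file back so the caller can confirm the setting took effect.
    with open(path) as f:
        return int(f.read().strip())
```

On a patched director, write_ipvs_sysctl("snat_reroute", 0) would be the
equivalent of echoing 0 into the file by hand. The file does not exist on
kernels without the patch, so the open() call raises an error there.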

commit 8a8030407f55a6aaedb51167c1a2383311fcd707
Author: Julian Anastasov <ja@xxxxxx>
Date:   Tue Sep 21 17:38:57 2010 +0200

    ipvs: make rerouting optional with snat_reroute

    Add new sysctl flag "snat_reroute". Recent kernels use
    ip_route_me_harder() to route LVS-NAT responses properly by
    VIP when there are multiple paths to client. But setups
    that do not have alternative default routes can skip this
    routing lookup by using snat_reroute=0.

    Signed-off-by: Julian Anastasov <ja@xxxxxx>
    Signed-off-by: Patrick McHardy <kaber@xxxxxxxxx>

commit f4bc17cdd205ebaa3807c2aa973719bb5ce6a5b2
Author: Julian Anastasov <ja@xxxxxx>
Date:   Tue Sep 21 17:35:41 2010 +0200

    ipvs: netfilter connection tracking changes

    Add more code to IPVS to work with Netfilter connection
    tracking and fix some problems.

    - Allow IPVS to be compiled without connection tracking as in
    2.6.35 and before. This can avoid keeping conntracks for all
    IPVS connections because this costs memory. ip_vs_ftp still
    depends on connection tracking and NAT as implemented for 2.6.36.
    - Add sysctl var "conntrack" to enable connection tracking for
    all IPVS connections. For loaded IPVS directors it needs
    tuning of nf_conntrack_max limit.
    - Add IP_VS_CONN_F_NFCT connection flag to request the connection
    to use connection tracking. This allows user space to provide this
    flag, for example, in dest->conn_flags. This can be useful to
    request connection tracking per real server instead of forcing it
    for all connections with the "conntrack" sysctl. This flag is
    set currently only by ip_vs_ftp and of course by "conntrack" sysctl.
    - Add ip_vs_nfct.c file to hold all connection tracking code,
    by this way main code should not depend of netfilter conntrack
    - Return back the ip_vs_post_routing handler as in 2.6.35 and use
    skb->ipvs_property=1 to allow IPVS to work without connection
    tracking

    Connection tracking:
    - most of the code is already in 2.6.36-rc
    - alter conntrack reply tuple for LVS-NAT connections when first packet
    from client is forwarded and conntrack state is NEW or RELATED.
    Additionally, alter reply for RELATED connections from real server,
    again for packet in original direction.
    - add IP_VS_XMIT_TUNNEL to confirm conntrack (without altering
    reply) for LVS-TUN early because we want to call nf_reset. It is
    needed because we add IPIP header and the original conntrack
    should be preserved, not destroyed. The transmitted IPIP packets
    can reuse same conntrack, so we do not set skb->ipvs_property.
    - try to destroy conntrack when the IPVS connection is destroyed.
    It is not fatal if conntrack disappears before that, it depends
    on the used timers.
    Fix problems from long time:
    - add skb->ip_summed = CHECKSUM_NONE for the LVS-TUN transmitters
    Signed-off-by: Julian Anastasov <ja@xxxxxx>
    Signed-off-by: Patrick McHardy <kaber@xxxxxxxxx>

commit 3575792e005dc9994f15ae72c1c6f401d134177d
Author: Julian Anastasov <ja@xxxxxx>
Date:   Fri Sep 17 14:18:16 2010 +0200

    ipvs: extend connection flags to 32 bits

    - the sync protocol supports 16 bits only, so bits 0..15 should be
    used only for flags that should go to backup server, bits 16 and
    above should be allocated for flags not sent to backup.
    - use IP_VS_CONN_F_DEST_MASK as mask of connection flags in
    destination that can be changed by user space
    - allow IP_VS_CONN_F_ONE_PACKET to be set in destination
    Signed-off-by: Julian Anastasov <ja@xxxxxx>
    Signed-off-by: Patrick McHardy <kaber@xxxxxxxxx>
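
To illustrate the flag layout described in the last commit, here is a rough
sketch of splitting a 32-bit flag word into the part carried by the sync
protocol (bits 0..15) and the part that stays local (bits 16 and above).
The two flag values are assumptions used for illustration and should be
checked against include/linux/ip_vs.h in your tree. split_conn_flags is a
hypothetical helper, not a kernel function.

```python
# The sync protocol carries only the low 16 bits of the connection flags.
SYNC_FLAG_BITS = 16
SYNC_FLAG_MASK = (1 << SYNC_FLAG_BITS) - 1  # bits 0..15

# Assumed values for illustration; verify against include/linux/ip_vs.h.
IP_VS_CONN_F_ONE_PACKET = 0x2000  # low bit, so it would reach the backup
IP_VS_CONN_F_NFCT = 1 << 16       # high bit, so it would stay local


def split_conn_flags(flags):
    """Return (synced, local_only) parts of a 32-bit connection flag word."""
    flags &= 0xFFFFFFFF
    synced = flags & SYNC_FLAG_MASK
    local_only = flags & ~SYNC_FLAG_MASK & 0xFFFFFFFF
    return synced, local_only
```

With these assumed values, a connection carrying both flags would sync only
the one-packet bit to the backup, while the conntrack request bit would not
leave the director.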
