Re: [lvs-users] Kernel 2.6.35 and 100% S.I. CPU Time

To: " users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>, Julian Anastasov <ja@xxxxxx>, jslvs@xxxxxxxxxxxxxxxxxxxx
Subject: Re: [lvs-users] Kernel 2.6.35 and 100% S.I. CPU Time
From: JL <lvs@xxxxxxxx>
Date: Mon, 27 Sep 2010 13:08:57 +0100

The fwmarks are:
iptables -t mangle -A PREROUTING -i eth1 -d -p tcp -j MARK --set-mark 101
iptables -t mangle -A PREROUTING -i eth1 -d -p udp -j MARK --set-mark 102

  eth1 is the "external" (Internet-side) interface
  eth2 is where we receive redirected packets from the master
  the -d address (elided above) is the load-balanced IP
  The same rules are on the current master

As for the rest of your paragraph, about the LOCAL_IN and OUTPUT hooks,
I'm afraid we have gone past my (and also John's) knowledge of the
netfilter code :(

We will give the patch a try.


Email from Julian Anastasov:


On Fri, 24 Sep 2010, John Sullivan wrote:

> I've been testing under single-CPU kvm instances, though we see exactly
> the same thing running on real multi-core hardware. There are two servers
> with essentially identical configurations. We're using keepalived to
> control load-balancing between the two, but again we do see the same
> problem if we configure each manually with "ip addr add" and "ipvsadm"
> directly. Each server is running 4 HTTP(S) servers: 2 Apache 2.2 HTTP,
> 1 Apache 2.2 HTTPS and a custom HTTP server, and we can see in the
> logfiles the regular keepalived probes against them all.

        It is interesting to know what exactly is marked
for the fwmark virtual service on the backup server. Could the
problem be that some packet is marked to hit the fwmark service
and is then scheduled to a real server with a local IP address?
Maybe such a local IP address is added after the real server
is added? If that is true, we should add a check to stop traffic
in the transmitter methods if they pass packets via the OUTPUT hook
that are destined to a local IP address. Maybe such packets
come back to LOCAL_IN and create a loop? It is possible because
the check for the loopback device was removed.

> The first machine to boot up becomes the keepalived master, the second
> the backup. If no further action is taken things appear to function
> normally. ipvsadm reports normal running weights of both servers of
> around 250.

        I can explain why, after removing the ipvsadm rules on
the backup server, you still get the loop: the packets for existing
connections are not stopped; they know their real server
even after it is removed from the rules. But I cannot explain
why long sync packets trigger the problem but short sync
packets do not. Appended is a patch that drops traffic
to local addresses. Let me know if it changes something on
the backup server.

Signed-off-by: Julian Anastasov <ja@xxxxxx>

diff -urp v2.6.35/linux/net/netfilter/ipvs/ip_vs_ctl.c
--- v2.6.35/linux/net/netfilter/ipvs/ip_vs_ctl.c        2010-05-17 10:49:01.000000000 +0300
+++ linux/net/netfilter/ipvs/ip_vs_ctl.c        2010-09-25 14:18:47.638354901 
@@ -942,6 +942,10 @@ ip_vs_add_dest(struct ip_vs_service *svc
                IP_VS_WAIT_WHILE(atomic_read(&svc->usecnt) > 1);

+               spin_lock(&dest->dst_lock);
+               ip_vs_dst_reset(dest);
+               spin_unlock(&dest->dst_lock);
                list_add(&dest->n_list, &svc->destinations);

@@ -1030,6 +1034,10 @@ ip_vs_edit_dest(struct ip_vs_service *sv
        /* Wait until all other svc users go away */
        IP_VS_WAIT_WHILE(atomic_read(&svc->usecnt) > 1);

+       spin_lock(&dest->dst_lock);
+       ip_vs_dst_reset(dest);
+       spin_unlock(&dest->dst_lock);
        /* call the update_service, because server weight may be changed */
        if (svc->scheduler->update_service)
diff -urp v2.6.35/linux/net/netfilter/ipvs/ip_vs_xmit.c
--- v2.6.35/linux/net/netfilter/ipvs/ip_vs_xmit.c       2010-08-02 09:37:49.000000000 +0300
+++ linux/net/netfilter/ipvs/ip_vs_xmit.c       2010-09-25 14:10:41.386369694 
@@ -113,6 +113,13 @@ __ip_vs_get_out_rt(struct ip_vs_conn *cp

+       if (rt && rt->rt_flags & RTCF_LOCAL) {
+               IP_VS_DBG_RL("Stopping traffic to local address, dest: %pI4\n",
+                            &rt->rt_dst);
+               ip_rt_put(rt);
+               return NULL;
+       }
        return rt;

Jarrod Lowe
