Re: ipvs_syncmaster brings cpu to 100%

To: Nishanth Aravamudan <nacc@xxxxxxxxxx>
Subject: Re: ipvs_syncmaster brings cpu to 100%
Cc: Wensong Zhang <wensong@xxxxxxxxxxxx>
Cc: Julian Anastasov <ja@xxxxxx>
Cc: netdev@xxxxxxxxxxx
Cc: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Cc: Dave Miller <davem@xxxxxxxxxxxxx>
From: Horms <horms@xxxxxxxxxxxx>
Date: Mon, 26 Sep 2005 17:05:10 +0900

On Sun, Sep 25, 2005 at 09:34:00PM -0700, Nishanth Aravamudan wrote:
> On 26.09.2005 [12:28:08 +0900], Horms wrote:
> > On Fri, Sep 23, 2005 at 11:15:31AM -0400, Roger Tsang wrote:
> > > As I've said before in this thread, you might want to try changing all the
> > > ssleep() calls to schedule_timeout().
> > > 
> > > Roger
> > > 
> > > 
> > > On 9/22/05, Luca Maranzano <liuk001@xxxxxxxxx> wrote:
> > > >
> > > > Hello all,
> > > >
> > > > here again trying to discover the reason for the CPU hog in
> > > > ipvs_sync{master,backup}.
> > > >
> > > > I've dug into the sources for ip_vs_sync.c and the main difference
> > > > between kernel 2.6.8 and 2.6.12 is the use of ssleep() instead of
> > > > schedule_timeout().
> > > >
> > > > The oddity I've seen is that in the header of both files, the version
> > > > is always like this:
> > > >
> > > > * Version: $Id: ip_vs_sync.c,v 1.13 2003/06/08 09:31:19 wensong Exp $
> > > > *
> > > > * Authors: Wensong Zhang <wensong@xxxxxxxxxxxxxxxxxxxxxx>
> > > >
> > > > Is Wensong still the maintainer for this code?
> > 
> > Yes, although he is kind of quiet.
> > 
> > > > Furthermore, if I run an "rgrep" over the source tree of kernel 2.6.12,
> > > > schedule_timeout() is used far more often than ssleep() (517
> > > > occurrences vs. 43), so why was this change made in ip_vs_sync.c?
> > > >
> > > > The other oddity is that Horms reported on this list that on non-Xeon
> > > > CPUs the same kernel version as mine does not show the problem.
> > > >
> > > > I'm getting crazy :-)
> > 
> > I've prepared a patch, which reverts the change which was introduced
> > by Nishanth Aravamudan in February.
> 
> Was the 100% cpu utilization only occurring on Xeon processors?

That seems to be the only case where this problem has been
observed. I don't have such a processor myself, so I haven't actually
been able to produce the problem locally.

One reason I posted this issue to netdev was to get some more
eyes on the problem as it is puzzling to say the least.

> Care to try to use msleep_interruptible() instead of ssleep(), as
> opposed to schedule_timeout()?

I will send a version that does that shortly. Luca, can
you please check that too?

> In your patch, you do not need to set the state back to TASK_RUNNING,
> btw.

Thanks, updated patch below.

-- 
Horms

Use schedule_timeout() instead of ssleep() in ip_vs_sync daemon,
as the latter seems to cause 100% CPU utilisation on HT Xeons.

Discussion:
http://archive.linuxvirtualserver.org/html/lvs-users/2005-09/msg00031.html

Reverts:
http://www.kernel.org/git/?p=linux/kernel/git/tglx/history.git;a=commit;h=f8afb60c7537130448cc479d6d8dc9bf4ee06027

Signed-off-by: Horms <horms@xxxxxxxxxxxx>

diff --git a/net/ipv4/ipvs/ip_vs_sync.c b/net/ipv4/ipvs/ip_vs_sync.c
--- a/net/ipv4/ipvs/ip_vs_sync.c
+++ b/net/ipv4/ipvs/ip_vs_sync.c
@@ -655,7 +655,8 @@ static void sync_master_loop(void)
                if (stop_master_sync)
                        break;
 
-               ssleep(1);
+               __set_current_state(TASK_INTERRUPTIBLE);
+               schedule_timeout(HZ);
        }
 
        /* clean up the sync_buff queue */
@@ -712,7 +713,8 @@ static void sync_backup_loop(void)
                if (stop_backup_sync)
                        break;
 
-               ssleep(1);
+               __set_current_state(TASK_INTERRUPTIBLE);
+               schedule_timeout(HZ);
        }
 
        /* release the sending multicast socket */
@@ -824,7 +826,8 @@ static int fork_sync_thread(void *startu
        if ((pid = kernel_thread(sync_thread, startup, 0)) < 0) {
                IP_VS_ERR("could not create sync_thread due to %d... "
                          "retrying.\n", pid);
-               ssleep(1);
+               __set_current_state(TASK_INTERRUPTIBLE);
+               schedule_timeout(HZ);
                goto repeat;
        }
 
@@ -858,7 +861,8 @@ int start_sync_thread(int state, char *m
        if ((pid = kernel_thread(fork_sync_thread, &startup, 0)) < 0) {
                IP_VS_ERR("could not create fork_sync_thread due to %d... "
                          "retrying.\n", pid);
-               ssleep(1);
+               __set_current_state(TASK_INTERRUPTIBLE);
+               schedule_timeout(HZ);
                goto repeat;
        }
 

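For comparison with the msleep_interruptible() suggestion above: ssleep(1)
sleeps in TASK_UNINTERRUPTIBLE, while msleep_interruptible(1000) sleeps in
TASK_INTERRUPTIBLE and returns the number of milliseconds left if it is woken
early. The following is only a userspace sketch of that behaviour, not kernel
code and not part of the patch. It assumes POSIX nanosleep() as a stand-in,
and the names sleep_interruptible_ms() and sync_loop_sketch() are hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <errno.h>
#include <time.h>

/*
 * Hypothetical userspace stand-in: sleep for up to `ms` milliseconds, but
 * return early if a signal arrives. Like the kernel's msleep_interruptible(),
 * it returns the time left in milliseconds, or 0 if the full interval elapsed.
 */
static unsigned long sleep_interruptible_ms(unsigned long ms)
{
	struct timespec req, rem = {0, 0};

	req.tv_sec = (time_t)(ms / 1000);
	req.tv_nsec = (long)(ms % 1000) * 1000000L;

	if (nanosleep(&req, &rem) == 0)
		return 0;	/* slept the whole interval */
	if (errno != EINTR)
		return 0;	/* invalid request: nothing left to wait for */

	/* Interrupted by a signal: report what is left, rounded up to 1 ms. */
	return (unsigned long)rem.tv_sec * 1000UL +
	       (unsigned long)((rem.tv_nsec + 999999L) / 1000000L);
}

/*
 * Hypothetical model of the daemon loop: check a stop flag, then sleep one
 * interval, the way sync_master_loop() checks stop_master_sync. Returns the
 * number of sleeps performed before the flag was seen or max_iters was hit.
 */
static int sync_loop_sketch(const volatile int *stop,
			    unsigned long interval_ms, int max_iters)
{
	int sleeps = 0;

	while (sleeps < max_iters) {
		if (*stop)
			break;
		sleep_interruptible_ms(interval_ms);
		sleeps++;
	}
	return sleeps;
}
```

In the kernel itself the equivalent change would presumably be a one-line
replacement of ssleep(1) with msleep_interruptible(1000) at each of the four
call sites touched by the patch.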