Re: Ipvs 0.9.3 : panic on heavy load.

To: Julian Anastasov <ja@xxxxxx>
Subject: Re: Ipvs 0.9.3 : panic on heavy load.
Cc: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
From: Lionel Bringuier <lb@xxxxxxxxxxxxxxxxx>
Date: Fri, 30 Nov 2001 15:31:16 +0100
On Fri, Nov 30, 2001 at 02:35:04 +0200, Julian Anastasov wrote:
> > 1. On a single-CPU machine, with a kernel compiled with SMP support, I get a
> > kernel freeze in mod_sltimer (ip_vs_timer.c). It locks up on a concurrent
> > write_lock/write_unlock(&__ip_vs_sltimerlist_lock) access in mod_sltimer.
> > The problem disappears if I disable CONFIG_SMP (on a single-CPU machine).
> > Note that I could not reproduce it on a dual-CPU machine.
>       Can you reproduce it with 0.9.7? It seems it will need a fresh
> kernel.
I have not tried yet. The validation process started some time ago and was
based on 2.4.5 (which, all in all, worked quite well), and I was very
suspicious of all the buzz about VM and stability problems in recent kernels.
I'll give 2.4.16 a try, as it seems to be usable again.

> BTW, how did you find that it is in mod_sltimer?
With an old technique: I added a (dirty) function that prints characters
directly into video memory (because I suspected that printk was not as
accurate as I expected):

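#include <asm/io.h>
#define OFFSMAX (80*20) /* 20 lines of chars */
int ncxoffset;
static inline void printncx (const char c, char color) {
    int j;
    char * video_mem_v = phys_to_virt(0xb8000); /* VGA text buffer */
    video_mem_v += 2*(ncxoffset++); /* 2 bytes per cell: char, attribute */
    if (ncxoffset >= OFFSMAX) ncxoffset = 0;
    for (j=0; j<4; j++) *(video_mem_v+j) = '#'; /* where we are */
    /* assumed: the rest of the body is missing here; store the char and its color */
    video_mem_v[0] = c;
    video_mem_v[1] = color;
}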

Then, in ip_vs_timer.c:
void mod_sltimer(struct timer_list *timer, unsigned long expires)
{
    int ret;
    printncx('l','B'); /* B : green on red */
    timer->expires = expires;
    ret = detach_sltimer(timer);
    /* ... */
}

And I could see l L u U l L u U l L u U l u (lock). Again, this happens
only on a UP machine with a kernel configured for SMP.

> Can you find which ip_vs_conn_put call causes this problem?
No... how can I do that (easily)?

>       I don't remember any problems with mod_sltimer being fixed after 0.9.3.
> We have to find the problem with your help. Can you tell us the proto used
> (UDP?), the forwarding method?
Proto: UDP. Scheduling method: WRR (preferred) or WLC. I don't use any others.

       Lionel Bringuier -
      Team Leader - Linux Applications Development
Phone : +33 (0)2 31 46 35 70 -
