Re: Ipvs 0.9.3 : panic on heavy load.

To: Julian Anastasov <ja@xxxxxx>
Subject: Re: Ipvs 0.9.3 : panic on heavy load.
Cc: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
From: Lionel Bringuier <lb@xxxxxxxxxxxxxxxxx>
Date: Fri, 30 Nov 2001 15:31:16 +0100
On Fri, Nov 30, 2001 at 02:35:04 +0200, Julian Anastasov wrote:
> > 1. On a single CPU machine, with a kernel compiled with SMP support, I get a
> > kernel freeze in mod_sltimer (ip_vs_timer.c). I get locked on a concurrent
> > write_lock/write_unlock(&__ip_vs_sltimerlist_lock) access in mod_sltimer.
> > That problem disappears if I disable CONFIG_SMP (on a single CPU machine).
> > Notice that I did not reproduce that with a bi-CPU machine.
> 
>       Can you reproduce it with 0.9.7. It seems it will need fresh
> kernel.
I have not tried yet. The validation process started some time ago and was
based on 2.4.5 (which all in all worked quite well), and I was very
suspicious about all the buzz about VM and stability in recent kernels. I'll
give 2.4.16 a try, as it seems to be usable again.

> BTW, how you found that it is in mod_sltimer? 
With an old technique: I added a (dirty) function that prints characters
directly into video memory (because I suspected that printk was not as
accurate as I expected):

#include <asm/io.h>
#define OFFSMAX (80*20) /* 20 lines of 80 chars */
int ncxoffset;          /* next character cell to write */
static inline void printncx (const char c, char color) {
    int j;
    char * video_mem_v = phys_to_virt(0xb8000); /* text-mode video memory */
    video_mem_v += 2*(ncxoffset++); /* 2 bytes per cell: char, attribute */
    if (ncxoffset >= OFFSMAX) ncxoffset = 0;
    *(video_mem_v++)=c;
    *(video_mem_v++)=color;
    for (j=0; j<4; j++) *(video_mem_v+j) = '#'; /* where we are */
}
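
For anyone who wants to check the offset arithmetic without touching a
kernel, here is a minimal user-space sketch of the same idea. It is only an
illustration: fake_vram, sim_offset, sim_printncx and sim_char_at are
hypothetical names, and an ordinary array stands in for the memory at
0xb8000.

```c
#include <assert.h>

#define COLS 80
#define ROWS 25
#define OFFSMAX (80 * 20)   /* same 20-line window as the function above */

/* Hypothetical stand-in for text-mode video memory: each cell is a
 * character byte followed by an attribute byte. */
static unsigned char fake_vram[COLS * ROWS * 2];
static int sim_offset;      /* next cell to write, counted in cells */

/* Write one character cell and advance, wrapping after OFFSMAX cells. */
static void sim_printncx(char c, unsigned char color)
{
    unsigned char *cell = fake_vram + 2 * sim_offset;

    assert(sim_offset >= 0 && sim_offset < OFFSMAX);
    cell[0] = (unsigned char)c; /* character byte */
    cell[1] = color;            /* attribute: background in the high
                                   bits, foreground in the low nibble */
    if (++sim_offset >= OFFSMAX)
        sim_offset = 0;         /* wrap back to the first cell */
}

/* Read back the character stored in a given cell. */
static char sim_char_at(int cell_index)
{
    return (char)fake_vram[2 * cell_index];
}
```

With the attribute 'B' (0x42), the high nibble 4 selects a red background
and the low nibble 2 a green foreground, which is why the markers show up
green on red.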

Then in ip_vs_timer.c :
void mod_sltimer(struct timer_list *timer, unsigned long expires)
{
    int ret;
  printncx('l','B'); /* B : green on red */
    write_lock(&__ip_vs_sltimerlist_lock);
  printncx('L','B');
    timer->expires = expires;
    ret = detach_sltimer(timer);
    internal_add_sltimer(timer);
  printncx('u','B');
    write_unlock(&__ip_vs_sltimerlist_lock);
  printncx('U','B');
}

And I could see: l L u U l L u U l L u U l u (lock). Again, this happens
only on a UP machine with a kernel configured for SMP.
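
A trace like this can also be checked mechanically. The following is a
rough sketch, not part of the debugging code above: first_out_of_order is a
hypothetical helper that assumes one undisturbed pass through mod_sltimer
prints exactly l, L, u, U in that order, and reports the first marker that
breaks the cycle.

```c
/* Expected marker cycle for one pass through mod_sltimer:
 * before lock, after lock, before unlock, after unlock. */
static const char cycle[] = { 'l', 'L', 'u', 'U' };

/* Return the 0-based index of the first marker that breaks the
 * l L u U cycle, or -1 if the whole trace follows it.
 * Spaces are skipped and not counted. */
static int first_out_of_order(const char *trace)
{
    int seen = 0;   /* markers consumed so far */

    for (; *trace != '\0'; trace++) {
        if (*trace == ' ')
            continue;
        if (*trace != cycle[seen % 4])
            return seen;
        seen++;
    }
    return -1;
}
```

For the sequence quoted above it returns 13: the last marker is a 'u' where
an 'L' would be expected after the preceding 'l'. It only locates the
deviation; it says nothing about which context printed that marker.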

> Can you find which ip_vs_conn_put call causes this problem?
No... how can I do that (easily)?

>       I don't remember for problems with mod_sltimer fixed after 0.9.3.
> We have to find the problem with your help. Can you tell us the proto used
> (UDP?), the forwarding method?
Proto: UDP; scheduling method: WRR (preferred) or WLC. I don't use others.


-- 
=========================================================
       Lionel Bringuier - lb_at_fr.netcentrex.net
      Team Leader - Linux Applications Development
Phone : +33 (0)2 31 46 35 70 - http://www.netcentrex.net

