
Re: Ipvs 0.9.3 : panic on heavy load.

To: Lionel Bringuier <lb@xxxxxxxxxxxxxxxxx>
Subject: Re: Ipvs 0.9.3 : panic on heavy load.
Cc: <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
From: Julian Anastasov <ja@xxxxxx>
Date: Fri, 30 Nov 2001 14:35:04 +0200 (EET)

        Hello,

On Fri, 30 Nov 2001, Lionel Bringuier wrote:

> Hello.
>
> I am using ipvs 0.9.3 over a kernel 2.4.5 (on Intel Red Hat 7.1). The
> following issues are experienced on a VoIP H323 load-balancing system, under
> heavy load (40+ non-RAS H323 calls/sec, 50 simultaneous, for those who can
> understand this jargon ;).
>
> I have noticed two problems, the second of which I cannot solve.
>
> 1. On a single CPU machine, with a kernel compiled with SMP support, I get a
> kernel freeze in mod_sltimer (ip_vs_timer.c). I get locked on a concurrent
> write_lock/write_unlock(&__ip_vs_sltimerlist_lock) access in mod_sltimer.
> That problem disappears if I disable CONFIG_SMP (on a single CPU machine).
> Notice that I did not reproduce that with a bi-CPU machine.

        Can you reproduce it with 0.9.7? It seems it will need a fresh
kernel. BTW, how did you find that it is in mod_sltimer? Can you find
which ip_vs_conn_put call causes this problem?
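
A note on why problem 1 can show up only with CONFIG_SMP: as far as I know, on 2.4 kernels built without SMP the spinlock and rwlock operations compile away to nothing, so taking a lock twice on the same CPU goes unnoticed, while an SMP build really spins. The sketch below is a user-space toy that assumes (it is not a diagnosis of this report) that the lock gets requested a second time on the same CPU while it is already held. The names toy_lock, toy_write_trylock and handler_would_deadlock are made up for the illustration and are not part of ipvs or the kernel.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for a non-recursive kernel write lock.
 * Hypothetical type: this is NOT the kernel's rwlock_t. */
typedef struct {
    atomic_flag held;
} toy_lock;

/* Try to take the lock once. Returns true on success.
 * A real write_lock() would keep spinning where this returns false. */
static bool toy_write_trylock(toy_lock *l)
{
    return !atomic_flag_test_and_set(&l->held);
}

static void toy_write_unlock(toy_lock *l)
{
    atomic_flag_clear(&l->held);
}

/* Assumed scenario: some code holds the lock and, before it unlocks,
 * another path on the same CPU asks for the same lock.
 * Returns true when that second request could never succeed,
 * which with a spinning lock means the CPU hangs. */
static bool handler_would_deadlock(void)
{
    toy_lock l = { ATOMIC_FLAG_INIT };
    bool outer = toy_write_trylock(&l);  /* first holder gets the lock */
    bool inner = toy_write_trylock(&l);  /* nested request on the same CPU */

    toy_write_unlock(&l);
    return outer && !inner;
}
```

If handler_would_deadlock() returns true, the nested request found the lock already taken. With only one CPU nobody else can release it, so a spinning write_lock would never return, which would look like the freeze described above.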

> 2. I have some systematic kernel panics after 10000 to 80000 successful
> calls. After some investigation, it appeared that the crash occurs due to
> something in the sltimer_handler (ip_vs_timer.c).  I can say that the crash
> is appearing in the call for "fn(data)" in run_sltimer_list, which seems to
> be standing for a call for 'ip_vs_conn_expire'.  The workaround I used is to
> comment out the entire body of the sltimer_handler function: "void
> sltimer_handler(unsigned long data) { }". I can get millions of calls
> succeeding this way. However, I then have no garbage collection of lost ip_vs
> connections, which is 'a little' embarrassing ;)
>
> Does anyone have any clue about that? I've been working on this for weeks
> now, and I feel desperately lost.

        I don't remember any problems with mod_sltimer being fixed after
0.9.3. We have to find the problem with your help. Can you tell us the
protocol used (UDP?) and the forwarding method?

> (After a coarse ksyms/System.map analysis, my 'fine' debugging was made with
> traces on screen directly using the video RAM [b8000] to have them as
> quickly as possible).
>
> Regards,
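
For readers unfamiliar with the video RAM trick mentioned above: in VGA color text mode the screen is an 80x25 array of 16-bit cells starting at physical address 0xB8000, with the character in the low byte and the color attribute in the high byte. A minimal sketch of such a trace helper follows. The name vga_trace and its parameters are hypothetical and this is not the code actually used in the report. The buffer is passed in as a pointer, so the sketch does not cover how kernel code would obtain a mapping of 0xB8000.

```c
#include <stddef.h>
#include <stdint.h>

enum { VGA_COLS = 80, VGA_ROWS = 25 };

/* Write msg at (row, col) into a text-mode buffer of 16-bit cells.
 * Each cell is (attribute << 8) | character.
 * Output is clipped at the end of the row; returns the cells written. */
static size_t vga_trace(volatile uint16_t *buf, unsigned row, unsigned col,
                        const char *msg, uint8_t attr)
{
    size_t n = 0;

    if (row >= VGA_ROWS)
        return 0;

    while (msg[n] != '\0' && col + n < VGA_COLS) {
        buf[row * VGA_COLS + col + n] =
            (uint16_t)(((uint16_t)attr << 8) | (uint8_t)msg[n]);
        n++;
    }
    return n;
}
```

An attribute of 0x07 gives light grey on black. The appeal for crash debugging is that a store to the buffer is visible immediately, without going through printk or the console layer.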

Regards

--
Julian Anastasov <ja@xxxxxx>


