Hi,
We've been using ipvs successfully for about a year now. We have several
instances of it, ranging from about 3k packets/s to about 50k packets/s at
peak.
Recently, for one of the director pairs, we needed to enable persistence.
This worked fine until a backend server was removed. Anything that caused
a backend server to be removed (machine not responsive, ldirectord.conf
edited, server manually removed with ipvsadm, etc.) caused a kernel panic
on the director.
Unfortunately, this was repeatable: it happened every time a backend
machine became unresponsive, so we had quite a few failures.
The persistence timeout was set to 8 hours (28800 seconds) because of the
needs of the backend application.
This particular instance handles about 18-20k packets/s. It is a pair of
directors running kernel 2.4.25, both PIII 1.3GHz machines, with
ldirectord 1.69 on Debian stable, using the DR method. Traffic peaks at
about 9Mbit/s in and out of the directors. There are 15 backend machines,
with a peak of around 115Mbit/s aggregate outgoing.
All I was able to capture of the panic output is below. If there's any
more information I can provide, please let me know.
Thanks!
Moses
*pde = 00000000
Oops: 0002
CPU: 0
EIP: 0010:[<c026bd66>] Not tainted
EFLAGS: 00010246
eax: 00000000 ebx: f4dc49a0 ecx: 00000000 edx: 00000000
esi: 00000000 edi: f4dc49a0 ebp: 96e65a18 esp: c031fd98
ds: 0018 es: 0018 ss: 0018
Process swapper (pid: 0, stackpage=c031f000)
Stack: f4dc49a0 c026d0a4 f4dc49a0 f70e38c0 c026db92 f4dc49a0 f70e95a0 00000000
       f70e38c0 e9ba7800 f70e38d4 c026de64 f70e95a0 f70e38c0 f70e95a0 00000000
       f70e38c0 c026ef27 f70e95a0 f70e38c0 c031fe58 c0391b48 c0229090 00000000
Call Trace: [<c026d0a4>] [<c026db92>] [<c026de64>] [<c026ef27>] [<c0229090>]
  [<c0220f4c>] [<c0229090>] [<c0229090>] [<c022129f>] [<c0229090>] [<c02291c0>]
  [<c0228cd5>] [<c0229090>] [<c02291c0>] [<c022935a>] [<c02291c0>] [<c02212d8>]
  [<c0229046>] [<c02291c0>] [<c021937b>] [<c0219429>] [<c021955e>] [<c011e88d>]
  [<c010881d>] [<c01052b0>] [<c01052b0>] [<c010acc8>] [<c01052b0>] [<c01052b0>]
  [<c01052dc>] [<c0105342>] [<c0105000>] [<c0105050>]
Code: 89 50 04 89 02 0f b7 43 40 c7 03 00 00 00 00 c7 43 04 00 00
<0>Kernel panic: Aiee, killing interrupt handler!
In interrupt handler - not syncing
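For anyone reading the dump above, the `Oops: 0002` value is the i386 page-fault error code. Below is a minimal sketch of how its low three bits decode, assuming the usual i386 meaning of those bits. The function name `decode_oops` is made up for illustration and is not part of any kernel tool.

```python
def decode_oops(code_hex: str) -> dict:
    """Decode the low three bits of an i386 page-fault error code.

    Assumed bit layout (standard i386 page-fault error code):
      bit 0: 0 = page not present, 1 = protection fault
      bit 1: 0 = read access,      1 = write access
      bit 2: 0 = kernel mode,      1 = user mode
    """
    code = int(code_hex, 16)
    return {
        "cause": "protection fault" if code & 0x1 else "page not present",
        "access": "write" if code & 0x2 else "read",
        "mode": "user" if code & 0x4 else "kernel",
    }


if __name__ == "__main__":
    # The value reported in the dump above.
    print(decode_oops("0002"))
```

For the value in this dump, that reads as a kernel-mode write to a page that is not present, which fits the `*pde = 00000000` line.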