I am seeing strange behaviour with my LVS load balancer and SIP
connections. Everything works fine as long as I don't unplug one of my
two SIP proxies.
ldirectord correctly determines that RIP1 does not respond and removes the
entry from the LVS table. From then on, every request to the VIP should be
routed internally to RIP2. But this is not the case for all clients: some
clients that had a connection with RIP1 before still get directed to
RIP1.
Most probably due to the template invalidation bug, which got fixed
recently.
The ipvs table and the connection table look fine to me:
lvs:~# ipvsadm -l -n
TCP VIP:5060 wlc persistent 120 mask 255.255.255.0
-> RIP2:5060 Masq 1 0 0
lvs:~# ipvsadm -l -n -c
IPVS connection entries
pro expire state  source        virtual   destination
UDP 04:22  UDP    Client1:5060  VIP:5060  RIP2:5060
UDP 00:47  UDP    Client2:5060  VIP:5060  RIP2:5060
UDP 00:36  NONE   Client3:0     VIP:5060  RIP2:5060
Everything should go to RIP2 only, but a tcpdump shows that it does not :(
Give that tcpdump a smack on the head, then :).
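For anyone who wants to check the connection table automatically instead of by eye, here is a minimal Python sketch. It only parses text in the format shown by `ipvsadm -l -n -c`, assuming each entry sits on a single line, as ipvsadm itself prints it. The function names `parse_connections` and `stale_entries` and the `SAMPLE` data are made up for illustration; the placeholders (Client1, VIP, RIP2) are the ones used in this thread.

```python
# Sketch: find IPVS connection entries that point at a real server
# which is no longer in the virtual service table.
# SAMPLE mirrors the `ipvsadm -l -n -c` output quoted in this thread,
# with the thread's placeholder names instead of real addresses.
SAMPLE = """\
IPVS connection entries
pro expire state source virtual destination
UDP 04:22 UDP Client1:5060 VIP:5060 RIP2:5060
UDP 00:47 UDP Client2:5060 VIP:5060 RIP2:5060
UDP 00:36 NONE Client3:0 VIP:5060 RIP2:5060
"""


def parse_connections(text):
    """Turn connection-table text into a list of dicts, one per entry."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly six columns; this skips the banner
        # line and the column header (which starts with "pro").
        if len(fields) != 6 or fields[0] == "pro":
            continue
        pro, expire, state, source, virtual, destination = fields
        entries.append({
            "pro": pro,
            "expire": expire,
            "state": state,
            "source": source,
            "virtual": virtual,
            "destination": destination,
        })
    return entries


def stale_entries(entries, active_real_servers):
    """Return entries whose destination is not an active real server."""
    return [e for e in entries if e["destination"] not in active_real_servers]


if __name__ == "__main__":
    conns = parse_connections(SAMPLE)
    for entry in stale_entries(conns, {"RIP2:5060"}):
        print("stale:", entry["source"], "->", entry["destination"])
```

On a real director the text would come from the command's output rather than from `SAMPLE`. For the table quoted above the check finds nothing stale, which matches the observation that the IPVS tables themselves look fine.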
So I searched a long time for a reason.
I already set the following flags:
net.ipv4.vs.expire_nodest_conn = 1
Since you have this flag set and still observe the problem described
above, I'm quite convinced that you've hit the template bug fixed in
recent kernels.
net.ipv4.vs.secure_tcp = 3
net.ipv4.vs.timeout_finwait = 2
This could expose another bug, which Horms found recently but is not
related to yours, I reckon.
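To confirm that the flags listed above are really in effect on the running kernel, the values can be read back from /proc. The following is a small Python sketch, not part of any IPVS tooling; `sysctl_path` and `read_sysctl` are hypothetical helper names, and it assumes the usual mapping of dotted sysctl names onto files below /proc/sys.

```python
import os

# The three settings mentioned in this thread.
VS_SYSCTLS = [
    "net.ipv4.vs.expire_nodest_conn",
    "net.ipv4.vs.secure_tcp",
    "net.ipv4.vs.timeout_finwait",
]


def sysctl_path(name, proc_root="/proc/sys"):
    """Map a dotted sysctl name to its file below proc_root."""
    return os.path.join(proc_root, *name.split("."))


def read_sysctl(name, proc_root="/proc/sys"):
    """Return the current value as a string, or None if it is unreadable
    (for example because the ip_vs module is not loaded)."""
    try:
        with open(sysctl_path(name, proc_root)) as handle:
            return handle.read().strip()
    except OSError:
        return None


if __name__ == "__main__":
    for name in VS_SYSCTLS:
        print(name, "=", read_sysctl(name))
```

A value of `None` for all three would suggest the IPVS sysctl entries are not present at all, which is worth ruling out before chasing the template bug.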
"udp 17 28 src=DIP_Internal_IP dst=RIP1 sport=35356 dport=5060
[UNREPLIED] src=RIP1 dst=DIP_Internal_IP sport=5060 dport=35356 use=1"
"udp 17 164 src=RIP1 dst=Client sport=5060 dport=12500 src=Client
dst=VIP sport=12500 dport=5060 [ASSURED] use=1"
^^ If one of the above entries exists, then the traffic for this client
still gets routed to RIP1. The entries are removed by a timeout after a
while. From then on the packets are routed correctly to RIP2.
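A quick way to see whether such entries are still around is to scan the conntrack table for lines that mention the removed real server. The sketch below is only an illustration: `conntrack_entries_for` is a made-up helper, the first two sample lines are the entries quoted above, and the third is a hypothetical healthy entry added for contrast. On a 2.4 kernel with ip_conntrack loaded the lines would typically be read from /proc/net/ip_conntrack; that path is an assumption about the setup, not something shown in this thread.

```python
# Sketch: list conntrack entries that still reference a given address.
SAMPLE_LINES = [
    "udp 17 28 src=DIP_Internal_IP dst=RIP1 sport=35356 dport=5060 "
    "[UNREPLIED] src=RIP1 dst=DIP_Internal_IP sport=5060 dport=35356 use=1",
    "udp 17 164 src=RIP1 dst=Client sport=5060 dport=12500 src=Client "
    "dst=VIP sport=12500 dport=5060 [ASSURED] use=1",
    # Hypothetical entry for a client already served by RIP2.
    "udp 17 90 src=Client dst=VIP sport=12500 dport=5060 src=RIP2 "
    "dst=Client sport=5060 dport=12500 [ASSURED] use=1",
]


def conntrack_entries_for(lines, address):
    """Return the lines that have `address` in a src= or dst= field."""
    wanted = {"src=" + address, "dst=" + address}
    hits = []
    for line in lines:
        # Compare whole tokens so that RIP1 does not match RIP10.
        if wanted.intersection(line.split()):
            hits.append(line)
    return hits


if __name__ == "__main__":
    for line in conntrack_entries_for(SAMPLE_LINES, "RIP1"):
        print("still tracked:", line)
```

Running this periodically after pulling RIP1 would show when the timeout has cleared the last entry, which by the description above is the point at which traffic starts reaching RIP2.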
Hmm, maybe you should not use additional patches; there are enough
states and timers in IPVS already.
Well, I don't really have a clue right now. I patched my kernel with the
ipvs_nfct patch a while ago. It does not work for me and I also do not
really need it.
Drop it then ;).
Could it be that this patch is somehow responsible for the
entries in the ip_conntrack table? Does ipvs normally use the ip_conntrack
table?
Nope. It maintains its own internal structures.
Has anyone seen similar behaviour and possibly knows a solution?
Upgrade your kernel to version 2.4.32 and try again. If it still
happens, come back and we'll happily debug it.
ipvsadm v1.21
IPVS v1.0.12
Vanilla Kernel 2.4.31 with ipvs_nfct Patch
Yes, 2.4.31 has this bug, I'm afraid.
HTH and best regards,
Roberto Nibali, ratz
--
echo '[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq'|dc