Hey Julian,
Long time no talk ;).
I got access to Matthias' box tonight and we ran some on-site tests
while I tried to debug this weird behaviour.
lvs2:~# ipvsadm -Lcn
IPVS connection entries
pro expire state source virtual destination
15 minutes for the TCP connection:
TCP 14:55 ESTABLISHED 10.0.1.62:3558 10.0.1.232:80 10.0.1.30:80
One hour for template (-p 3600):
Exactly.
IP 59:53 ERR! 10.0.1.62:0 0.0.0.4:0 10.0.1.30:0
The fwmark-based templates have vaddr set to the fwmark value,
that is why we see 0.0.0.4:0, it is for "IP" (not checked, we can
forward any protocol by using this template) and state is "ERR!"
because we don't maintain state for templates. Maybe 2.6 is different,
it tries to display per-protocol state and the templates don't
have protocol.
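
To illustrate why a mark of 4 shows up as 0.0.0.4, here is a minimal
Python sketch. It assumes the mark is kept as a 32-bit value in network
byte order in the address field and then printed like any IPv4 address;
the helper name fwmark_as_vaddr is made up for this example and is not
part of IPVS or ipvsadm.

```python
import socket
import struct


def fwmark_as_vaddr(fwmark: int) -> str:
    """Render a 32-bit fwmark the way a dotted-quad address printer would.

    Assumption: the mark is stored big-endian (network byte order) in the
    slot that normally holds the virtual IPv4 address.
    """
    packed = struct.pack("!I", fwmark)  # 4 bytes, network byte order
    return socket.inet_ntoa(packed)     # format those bytes as a.b.c.d


print(fwmark_as_vaddr(4))  # 0.0.0.4
```

So the 0.0.0.4:0 column is just the mark printed with the wrong
formatter, not a corrupted address.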
D'oh, of course! But this is misleading. IMHO it should display
something like fwmark 4 but not ERR! and 0.0.0.4:0, since the whole
thing is actually working besides this entry which got Matthias and me
carried away. I haven't used fwmark in ages ...
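
A rough sketch of the kind of display I have in mind, written as a
Python post-processor for the connection listing rather than as a patch
to ipvsadm. The function name describe_entry, the "TEMPLATE" label and
the heuristic (protocol "IP" plus virtual port 0 means a fwmark
template) are assumptions made for illustration only, based on the two
entries shown in this mail.

```python
import ipaddress


def describe_entry(line: str) -> str:
    """Rewrite a fwmark template entry into a more readable form.

    Expects one six-column line: pro expire state source virtual destination.
    Lines that do not look like fwmark templates are returned unchanged.
    """
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 columns, got {len(fields)}: {line!r}")
    pro, expire, _state, source, virtual, dest = fields
    vaddr, vport = virtual.rsplit(":", 1)
    # Heuristic (assumption): fwmark templates show up as pro "IP", vport 0.
    if pro == "IP" and vport == "0":
        mark = int(ipaddress.IPv4Address(vaddr))  # 0.0.0.4 -> 4
        return f"{pro} {expire} TEMPLATE {source} fwmark {mark} {dest}"
    return line.strip()


print(describe_entry("IP 59:53 ERR! 10.0.1.62:0 0.0.0.4:0 10.0.1.30:0"))
print(describe_entry("TCP 14:55 ESTABLISHED 10.0.1.62:3558 10.0.1.232:80 10.0.1.30:80"))
```

The first line would come out as
"IP 59:53 TEMPLATE 10.0.1.62:0 fwmark 4 10.0.1.30:0", while the regular
TCP connection entry is passed through untouched.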
10.0.1.232/32 --dport 80 -j MARK --set-mark 4
ipvsadm -A -f 4 -s wrr -p 3600
When the first packet arrives from source 10.0.1.70, we get
this entry:
IPVS connection entries
pro expire state source virtual destination
IP 59:59 ERR! 10.0.1.70:0 0.0.0.4:0 10.0.1.30:0
It kind of works, however the packet got munged. It seems to only
happen if fwmark is involved. It's like the packet is read backwards or
we're missing some BE/LE conversion. As you can see the SIP and RIP are
correctly displayed. The corresponding debug entries are:
Any problem with traffic or just the template looks ugly?
It's just the template :). The rest is working perfectly, besides some
weird timeouts from time to time which I will investigate personally
together with Matthias.
Funny enough, the connection timers of the connections belonging to a
template go crazy and drop from 15min (EST) to 3secs when going to
inactive state; no log entry.
This needs to be investigated, is there a 3-second timeout?
I'll dump traffic and ipvsadm stats again and let you know.
I'll talk to/phone Matthias tomorrow (EU time) personally to figure out
some more about his actual network setup. Something is fishy, also
given that the very same setup worked OK with a 2.4 kernel
according to him. Ohh, here is the machine information:
Hm, what is not working? Only the template listing?
Yes, sorry for not being specific enough. Template listing is the only
thing that is ugly, IPVS otherwise is working. I simply wasn't aware
anymore that there is no template entry for fwmark'd services.
Thanks for the heads-up and have a nice weekend, Julian,
Roberto Nibali, ratz
--
echo '[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq' | dc