Good morning, Horms,
I got access to Matthias' box tonight and we ran some on-site tests
while I tried to debug this weird behaviour.
lvs2:~# ipvsadm -Lcn
IPVS connection entries
pro expire state       source          virtual         destination
TCP 14:55  ESTABLISHED 10.0.1.62:3558  10.0.1.232:80   10.0.1.30:80
IP  59:53  ERR!        10.0.1.62:0     0.0.0.4:0       10.0.1.30:0
His setup is actually pretty straightforward and can be summed up with
the following scriptlet:
------------------
echo "0" > /proc/sys/net/ipv4/ip_forward
echo "0" > /proc/sys/net/ipv4/conf/all/send_redirects
echo "0" > /proc/sys/net/ipv4/conf/eth0/rp_filter
/sbin/iptables -F -t mangle
/sbin/iptables -t mangle -A PREROUTING -p tcp -s 10.0.1.70/32 -d 10.0.1.232/32 --dport 80 -j MARK --set-mark 4
ifconfig eth0:0 10.0.1.232 broadcast 10.0.1.232 netmask 255.255.255.255
ipvsadm -C
rmmod ip_vs_wrr
rmmod ip_vs
ipvsadm -A -f 4 -s wrr -p 3600
ipvsadm -a -f 4 -r 10.0.1.33 -g -w 100
ipvsadm -a -f 4 -r 10.0.1.30 -g -w 100
echo 666 > /proc/sys/net/ipv4/vs/debug_level
------------------
This yields:
lvs2:~# iptables -t mangle -L PREROUTING -n
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination
MARK       tcp  --  10.0.1.70            10.0.1.232          tcp dpt:80 MARK set 0x4
lvs2:~# ipvsadm -L -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
FWM  4 wrr persistent 3600
  -> 10.0.1.30:0                  Route   100    0          0
  -> 10.0.1.33:0                  Route   100    0          0
Note: the request is made from within the collision domain of the LVS
cluster (10.0.1.0/24), which is normally a no-go; however, I still
wouldn't expect LVS to fail like that.
When the first packet from source 10.0.1.70 comes in, we get this
entry:
IPVS connection entries
pro expire state       source          virtual         destination
IP  59:59  ERR!        10.0.1.70:0     0.0.0.4:0       10.0.1.30:0
It kind of works, however the packet got munged. It seems to happen
only when fwmark is involved. It's as if the packet were read backwards
or we were missing some BE/LE conversion. As you can see, the SIP and
RIP are displayed correctly. The corresponding debug entries are:
Mar 9 23:13:54 lvs2 kernel: IPVS: lookup/in TCP 10.0.1.70:1487->10.0.1.232:80 not hit
Mar 9 23:13:54 lvs2 kernel: IPVS: lookup service: fwm 4 TCP 10.0.1.232:80 hit
Mar 9 23:13:54 lvs2 kernel: IPVS: p-schedule: src 10.0.1.70:1487 dest 10.0.1.232:80 mnet 10.0.1.70
Mar 9 23:13:54 lvs2 kernel: IPVS: template lookup/in IP 10.0.1.70:0->0.0.0.4:0 not hit
Mar 9 23:13:54 lvs2 kernel: IPVS: ip_vs_wrr_schedule(): Scheduling...
Mar 9 23:13:54 lvs2 kernel: IPVS: WRR: server 10.0.1.30:0 activeconns 0 refcnt 1 weight 100
Mar 9 23:13:54 lvs2 kernel: IPVS: ADDing control for: cp.dst=10.0.1.70:1487 ctl_cp.dst=10.0.1.70:0
Mar 9 23:13:54 lvs2 kernel: IPVS: lookup/in TCP 10.0.1.70:1487->10.0.1.232:80 hit
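A side note on the 0.0.0.4 in the template entry: it is the same number
as the fwmark (4) used for the service. One possible reading, which is
an assumption and not verified against the kernel source here, is that
the virtual address field of the template simply carries the mark as a
32-bit value in network byte order. The sketch below (the helper names
fwmark_to_quad and fwmark_to_quad_swapped are made up for illustration)
prints what a mark would look like under that assumption, and what it
would look like with the bytes swapped:

```sh
#!/bin/sh
# Hypothetical helpers, not part of ipvsadm or the kernel.

# Render a 32-bit mark as a dotted quad, most significant byte first
# (as it would appear if stored in network byte order).
fwmark_to_quad() {
    mark=$1
    printf '%d.%d.%d.%d\n' \
        $(( (mark >> 24) & 255 )) $(( (mark >> 16) & 255 )) \
        $(( (mark >> 8) & 255 )) $(( mark & 255 ))
}

# Same value with the byte order reversed, for comparison.
fwmark_to_quad_swapped() {
    mark=$1
    printf '%d.%d.%d.%d\n' \
        $(( mark & 255 )) $(( (mark >> 8) & 255 )) \
        $(( (mark >> 16) & 255 )) $(( (mark >> 24) & 255 ))
}

fwmark_to_quad 4          # prints 0.0.0.4
fwmark_to_quad_swapped 4  # prints 4.0.0.0
```

If that reading holds, the 0.0.0.4 by itself would not point to a
byte-order problem, since a swapped mark would show up as 4.0.0.0. It
does not explain the ERR! state, though.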
Roughly 60 minutes later (the persistence timeout), the connection
template entry gets deleted correctly:
Mar 10 00:14:50 lvs2 kernel: IPVS: Unbind-dest IP c:10.0.1.70:0 v:0.0.0.4:0 d:10.0.1.30:0 fwd:R s:0 flg:1183 cnt:1 destcnt:2
Funnily enough, the timers of the connections belonging to a template
go crazy and drop from 15 min (ESTABLISHED) to 3 secs when going to the
inactive state; there is no log entry.
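To make such timer jumps easier to spot, the expire column can be
turned into plain seconds. This is only a rough sketch: the function
name expire_seconds is made up, and it assumes the expire column is
always in mm:ss form as in the listings in this mail.

```sh
#!/bin/sh
# Hypothetical helper: reads `ipvsadm -Lcn` style output on stdin and
# prints protocol, remaining lifetime in seconds, state and source for
# each connection entry. Header lines are skipped because their second
# field does not look like mm:ss.
expire_seconds() {
    awk '$2 ~ /^[0-9]+:[0-9]+$/ {
        split($2, t, ":")
        printf "%s %d %s %s\n", $1, t[1] * 60 + t[2], $3, $4
    }'
}

# Sample taken from the listing at the top of this mail.
expire_seconds <<'EOF'
IPVS connection entries
pro expire state source virtual destination
TCP 14:55 ESTABLISHED 10.0.1.62:3558 10.0.1.232:80 10.0.1.30:80
IP 59:53 ERR! 10.0.1.62:0 0.0.0.4:0 10.0.1.30:0
EOF
```

On the box itself this would be fed with "ipvsadm -Lcn | expire_seconds",
run repeatedly while the test connection goes from ESTABLISHED to the
inactive state.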
I suspect that either the fwmark together with the strange test case
mucks up the karma of the packets when they enter the IPVS stack, or
we're missing something really obvious.
I'll talk to/phone Matthias personally tomorrow (EU time) to find out
more about his actual network setup. Something is fishy, also given
that, according to him, the very same setup worked OK with a 2.4
kernel. Oh, here is the machine information:
lvs2:~# cat /etc/debian_version
3.1
lvs2:~# uname -a
Linux lvs2 2.6.15.4 #1 PREEMPT Thu Mar 9 17:56:18 EST 2006 i686 GNU/Linux
lvs2:~# dpkg -l | egrep "kernel|ipvsadm"
ii iptables       1.2.11-10      Linux kernel 2.4+ iptables administration to
ii ipvsadm        1.24+1.21-1.1  Linux Virtual Server support programs
ii kernel-image-2 101            Linux kernel image for version 2.6 on PPro/C
ii kernel-image-2 Custom.Lvs.1   Linux kernel binary image for version 2.6.15
ii kernel-image-2 Custom.1       Linux kernel binary image for version 2.6.8.
ii kernel-image-2 2.6.8-16sarge1 Linux kernel image for version 2.6.8 on PPro
ii kernel-package 8.135          A utility for building Linux kernel related
ii kernel-source- 2.6.8-16sarge1 Linux kernel source for version 2.6.8 with D
ii linux-kernel-h 2.5.999-test7- Linux Kernel Headers for development
ii module-init-to 3.2.2-2        tools for managing Linux kernel modules
If you have any input on what else I could check before I throw myself
at the kernel source, let me know.
Cheers mate,
Roberto Nibali, ratz
--
echo '[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq' | dc