
Re: HA-LVS DR ip_finish_output: bad unowned skb

To: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: HA-LVS DR ip_finish_output: bad unowned skb
Cc: "Aneesh Kumar K.V" <aneesh.kumar@xxxxxx>
Cc: OpenSSI Developers <ssic-linux-devel@xxxxxxxxxxxxxxxxxxxxx>
From: Roger Tsang <roger.tsang@xxxxxxxxx>
Date: Sun, 4 Sep 2005 22:13:16 -0400
Hey guys,

I'm running a streamed inline LVS-DR setup with the "sed" scheduler, where the 
directors are also realservers themselves. Incoming traffic goes only to the 
director that holds the VIP; the other director is passive (for failover). 
This worked wonderfully in kernel-2.4. However, with kernel-2.6's new IPVS 
code, I see that the passive director also tries to LVS-DR route 
already-loadbalanced packets it receives on its internal (eth1) interface.
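For reference, the setup looks roughly like this (the VIP, realserver 
addresses, and weights here are illustrative, not my exact config):

```shell
# Sketch of the two-node inline LVS-DR setup; addresses/weights are placeholders.
VIP=10.0.0.211

# Virtual HTTP service using the shortest-expected-delay scheduler
ipvsadm -A -t $VIP:80 -s sed

# Both directors are also realservers, reached via direct routing (-g)
ipvsadm -a -t $VIP:80 -r 10.117.0.1:80 -g -w 2   # active director
ipvsadm -a -t $VIP:80 -r 10.117.0.2:80 -g -w 1   # passive director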

This happens whenever its own weight is lower than another realserver's. In 
the passive director's stack trace (below), you can see it loadbalancing 
packets "back to" the active director, which has the higher weight.

Depending on how packets are assigned by weight, some of them then bounce 
back and forth between the active and passive directors in a loop, 
essentially causing a DoS of the entire cluster.

A very brief code comparison between kernel-2.6.13's and kernel-2.4.22's 
ip_vs_in() function shows you guys removed the !sysctl_ip_vs_loadbalance 
if-test that immediately accepts packets. How would streamed inline 
loadbalancing work without that test?

/*
 * Accept the packet if /proc/sys/net/ipv4/vs/loadbalancing
 * is 0, i.e. this node is not doing the loadbalancing.
 */
if (!sysctl_ip_vs_loadbalance) {
        return NF_ACCEPT;
}
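To make the bounce concrete, here is a toy sketch (not IPVS code; the node 
names, weights, and connection counts are invented) of what happens when the 
passive director runs the scheduler on an already-balanced packet instead of 
accepting it:

```python
# Toy model of the bounce, NOT actual IPVS code.  "sed" picks the
# server that minimises (active_conns + 1) / weight.
def sed_pick(weights, active):
    return min(weights, key=lambda s: (active[s] + 1) / weights[s])

weights = {"active-dir": 2, "passive-dir": 1}   # hypothetical weights
active = {"active-dir": 0, "passive-dir": 0}    # active connection counts

# A packet has already been balanced onto the passive director.  Without
# the early-accept test, its ip_vs_in() runs the scheduler again instead
# of simply delivering the packet locally.
hops = ["passive-dir"]
while True:
    target = sed_pick(weights, active)
    if target == hops[-1]:
        break                    # scheduled to itself: delivered at last
    active[target] += 1          # forwarding creates a connection entry
    hops.append(target)

print(" -> ".join(hops))
```

With these weights the packet takes one extra hop back to the active 
director; with other weight/connection-count combinations the pick can keep 
alternating between nodes, which is the ping-pong described above.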


Roger


---------- Forwarded message ----------
From: Roger Tsang <roger.tsang@xxxxxxxxx>
Date: Sep 4, 2005 8:44 AM
Subject: HA-LVS DR ip_finish_output: bad unowned skb
To: OpenSSI Developers <ssic-linux-devel@xxxxxxxxxxxxxxxxxxxxx>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@xxxxxx>

Hi,

I'm getting a flood of console messages on the realserver that is receiving 
LVS direct-routed packets. The console messages correspond to the SYN_RECV 
entries in the IPVS table. It looks like the director is SYN flooding the 
realserver. I can reproduce this by clicking as fast as I can for a few 
seconds on the webpage being served on port 80 - to make sure it hits all 
the realservers. The flood of console messages keeps going (forever) even 
after I stop generating client traffic.

Apparently this only happens if the virtual server has more than one 
realserver. There are no related console messages on the master director.

This is a streamed inline LVS-DR setup using the OpenSSI-1.9 kernel (i.e. 
kernel-2.6.10), and I've already tried upgrading to kernel-2.6.13's IPVS 
code to see if things improve.

Anyone familiar with the IPVS code have any ideas? Aneesh?

I really wonder if ip_vs_dr_xmit() belongs in the stack because this 
realserver isn't the active director.

Thanks.

/proc/sys/net/ipv4/vs settings...
am_droprate 10
amemthresh 1024
cache_bypass 0
drop_entry 0
drop_packet 0
expire_nodest_conn 1
expire_quiescent_template 1
nat_icmp_send 0
secure_tcp 1
sync_threshold 3 50

# ipvsadm -ln -c | grep RECV
TCP 00:45 SYN_RECV 10.0.0.1:43685 10.0.0.211:80 10.117.0.2:80
TCP 00:49 SYN_RECV 10.0.0.1:54458 10.0.0.211:80 10.117.0.2:80
TCP 00:03 SYN_RECV 10.0.0.1:43690 10.0.0.211:80 10.117.0.2:80
TCP 00:45 SYN_RECV 10.0.0.1:43687 10.0.0.211:80 10.117.0.2:80

On the realserver's console...
This realserver is not the master director.

<snip>
ip_finish_output: bad unowned skb = d944e220: PRE_ROUTING LOCAL_IN LOCAL_OUT POST_ROUTING
skb: pf=2 (unowned) dev=eth1 len=60
PROTO=6 10.0.0.1:43685 10.0.0.211:80 L=60 S=0x00 I=40906 F=0x4000 T=254
ip_finish_output: bad unowned skb = df824160: PRE_ROUTING LOCAL_IN LOCAL_OUT POST_ROUTING
skb: pf=2 (unowned) dev=eth1 len=60
PROTO=6 10.0.0.1:43687 10.0.0.211:80 L=60 S=0x00 I=31007 F=0x4000 T=254
ip_finish_output: bad unowned skb = deb338c0: PRE_ROUTING LOCAL_IN LOCAL_OUT POST_ROUTING
skb: pf=2 (unowned) dev=eth1 len=60
PROTO=6 10.0.0.1:43685 10.0.0.211:80 L=60 S=0x00 I=40904 F=0x4000 T=254
ip_finish_output: bad unowned skb = d9538ac0: PRE_ROUTING LOCAL_IN LOCAL_OUT POST_ROUTING
skb: pf=2 (unowned) dev=eth1 len=60
PROTO=6 10.0.0.1:43687 10.0.0.211:80 L=60 S=0x00 I=31005 F=0x4000 T=254
ip_finish_output: bad unowned skb = d944e160: PRE_ROUTING LOCAL_IN LOCAL_OUT POST_ROUTING
skb: pf=2 (unowned) dev=eth1 len=60
PROTO=6 10.0.0.1:43687 10.0.0.211:80 L=60 S=0x00 I=31003 F=0x4000 T=254
Entering kdb (current=0xdfc58a80, pid 131413) due to Keyboard Entry
kdb> bt
Stack traceback for pid 131413
0xdfc58a80 131413 2 1 0 R 0xdfc58c40 *nsc_async
EBP EIP Function (args)
0xdeccba14 0xc02e5a35 serial_out+0x25 (0xc064ffc0, 0x1, 0xd, 0x47, 0xd)
0xdeccba3c 0xc02e82eb serial8250_console_write+0x13b (0xc055ed80, 
0xc06284ed, 0x47, 0x130e6, 0x1312d)
0xdeccba5c 0xc011c0a0 __call_console_drivers+0x50 (0x130e6, 0x1312d, 
0x1312d, 0x4)
0xdeccba74 0xc011c13c _call_console_drivers+0x8c (0x130e6, 0x1312d, 0x4, 
0x34000400, 0x130e6)
0xdeccba9c 0xc011c1a9 call_console_drivers+0x69 (0x130e3, 0x1312d, 
0xc0614fe7)
0xdeccbab0 0xc011c5e4 release_console_sem+0x24 (0x34, 0x400, 0xc041fb80, 
0xdeccbaf0, 0x0)
0xdeccbad4 0xc011c526 vprintk+0x196 (0xc041fb80, 0xdeccbaf0)
0xdeccbae4 0xc011c388 printk+0x18 (0xc041fb80, 0x6, 0xa, 0x0, 0x0)
0xdeccbb40 0xc03614cf nf_dump_skb+0x13f (0x2, 0xd944e160, 0xd944e160)
0xdeccbb54 0xc0361698 nf_debug_ip_finish_output2+0x48 (0xd944e160, 
0xc170c6f4, 0xdd736270, 0xdfc546a0, 0xc056afa4)
0xdeccbb78 0xc0375354 ip_finish_output2+0x44 (0xd944e160)
0xdeccbb84 0xc03a3d83 ip_vs_post_routing+0x23 (0x4, 0xdeccbbf8, 0x0, 
0xdfd68c00, 0xc0375310)
0xdeccbbac 0xc0361935 nf_iterate+0x55 (0xc0657540, 0xdeccbbf8, 0x4, 0x0, 
0xdfd68c00)
0xdeccbbe8 0xc0361c71 nf_hook_slow+0x81 (0x2, 0x4, 0xd944e160, 0x0, 
0xdfd68c00)
0xdeccbc0c 0xc037530c ip_finish_output+0x4c (0xd944e160, 0xdfd68c00)
0xdeccbc1c 0xc03756bb ip_output+0x4b (0xd944e160, 0x0)
0xdeccbc2c 0xc03a8193 dst_output+0x13 (0xd944e160, 0xdeccbc78, 0x3, 0x0, 
0xdfd68c00)
0xdeccbc68 0xc0361d30 nf_hook_slow+0x140 (0x2, 0x3, 0xd944e160, 0x0)
0xdeccbd4c 0xc03a8e40 ip_vs_dr_xmit+0x150 (0x80000000, 0xc1669ee0, 
0xd792b480, 0x0, 0xd274c010)
0xc03a8180 dst_output (0xd944e160, 0xd63480e0, 0xc056ba60, 0xc056ba60, 0x0)
0xc03a4913 ip_vs_in+0x193 (0x1, 0xdeccbdf8, 0xdfd68c00, 0x0, 0xc0372640)
0xdeccbdac 0xc0361935 nf_iterate+0x55 (0xc0657528, 0xdeccbdf8, 0x1, 
0xdfd68c00, 0x0)
0xdeccbde8 0xc0361c71 nf_hook_slow+0x81 (0x2, 0x1, 0xd944e160, 0xdfd68c00, 
0x0)
0xdeccbe0c 0xc037263b ip_local_deliver+0x5b (0xd944e160, 0xd300000a, 
0x100000a, 0x0, 0xdfd68c00)
0xdeccbe44 0xc0372b33 ip_rcv_finish+0x163 (0xd944e160, 0xdeccbe90, 0x0, 
0xdfd68c00, 0x0)
0xdeccbe80 0xc0361d30 nf_hook_slow+0x140 (0x2, 0x0, 0xd944e160, 0xdfd68c00, 
0x0)
0xdeccbeb0 0xc0372945 ip_rcv+0x1b5 (0xd944e160, 0xdfd68c00, 0xc056883c, 0x8, 
0xdfd68c00)
0xdeccbee0 0xc03567c3 netif_receive_skb+0x143 (0xd944e160, 0x10bee3, 
0xc0656804, 0xc0656708, 0x10bee3)
0xdeccbefc 0xc0356875 process_backlog+0x75 (0xc0656708, 0xdeccbf0c, 0x2ea, 
0x1, 0xc0635538)
0xdeccbf1c 0xc035696a net_rx_action+0x6a (0xc0635538, 0x246, 0x2381, 0x7a9)
0xdeccbf34 0xc011fd05 __do_softirq+0x45 (0xfe)
0xdeccbf40 0xc011fd7a do_softirq+0x2a (0xc06141e0, 0xf433b, 0x708f, 
0x4f7a80)
0xdeccbf58 0xc011fdee local_bh_enable+0x6e (0x1, 0xfe, 0x1b16, 0x0, 
0xd22fde80)
0xdeccbf7c 0xc0289ee5 load_balance+0xd5 (0x0, 0xdecca000, 0xdfc54a30, 
0xdfc54a28, 0xffffffff)
0xdeccbfec 0xc01fbf53 nsc_async_daemon+0x173
0xc0100995 kernel_thread_helper+0x5


Roger
