Re: [lvs-users] Connection sync breaks fwmark-based localnode setup

To: svensven <svensven@xxxxxxxxx>
Subject: Re: [lvs-users] Connection sync breaks fwmark-based localnode setup
Cc: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
From: Simon Horman <horms@xxxxxxxxxxxx>
Date: Mon, 29 Mar 2010 15:18:41 +1100

On Sun, Mar 28, 2010 at 12:31:20PM +0200, svensven wrote:
> In short: ip_vs_conn_in_get() does not match on fwmark, so incoming
> packets to the backup LVS that were forwarded from the master LVS will
> match a synchronized connection and thus be sent through ipvs on the
> backup LVS, which is also the destination realserver. ipvs will loop
> the packet, causing the node to hang. Without conn sync, the nodes
> work fine (though of course breaking existing connections when failing
> over). Tested on Linux 2.6.33.
> 
> Here's my setup:
> 
>       client ----+
>      10.0.0.3    | vip: 10.0.0.10
>                 / \
>                /   \
>    +------------+ +------------+
>    | LVS A (mst)| | LVS B (bkp)|
>    |Realserver A| |Realserver B|
>    |  10.0.0.5  | |  10.0.0.6  |
>    +------------+ +------------+
> 
> Both nodes are set up with the vip on lo:10, an iptables rule to set
> the fwmark if the request does not come from the other LVS and
> arp_ignore=1, arp_announce=2 on all interfaces. See net/iptables/
> sysctl config for LVS master [3] and backup [4]. The realservers run
> lighttpd on port 9999 and bind to 0.0.0.0.
> 
> Both nodes have an identical keepalived.conf, except for the priority.
> See full keepalived.conf for LVS A [5]. The important parts of it are
> shown below:
> 
>    virtual_server fwmark 10 {
>        lb_algo rr
>        lb_kind DR
>        real_server 10.0.0.5 9999 {...}
>        real_server 10.0.0.6 9999 {...}
>    }
> 
> The config includes notify_master/notify_backup scripts that
> start/stop the ipvs connection synchronization daemon. For testing
> purposes, the sync threshold is tweaked to sync after the TCP 3-way
> handshake is done (2 incoming packets seen: SYN and ACK):
> 
>    net.ipv4.vs.sync_threshold="2 10"
> 
> The debug kernel output in [1] shows how the connection fails when the
> client queries the vip, LVS A is master, and the connection is
> forwarded to realserver B.
> 
> The debug kernel output in [2] shows how the connection works when the
> client queries the vip, LVS B is the master, and the connection is
> forwarded to realserver B (itself), i.e. with no connection
> synchronization.
> 
> 
> Questions:
> 1. Should the ip_vs_conn_in_get() function also take fwmark into
>     consideration when matching incoming packets to its list of
>     established ipvs connections?

I suspect not, as the connection table doesn't include fwmark information.
And I think that there ought to be a simpler resolution to your
problem than refactoring connection table entries.
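
To make the point about the connection table concrete, here is a toy
model in Python. It is not the kernel code. The names ConnKey, Packet
and ConnTable are made up for illustration; only ip_vs_conn_in_get()
and the addresses come from this thread. It assumes, as described
above, that the lookup is keyed on protocol, client address/port and
virtual address/port, and that fwmark is not part of the key.

```python
from typing import Dict, NamedTuple, Optional


class ConnKey(NamedTuple):
    """Hypothetical stand-in for the fields a connection entry is keyed on."""
    proto: str
    caddr: str
    cport: int
    vaddr: str
    vport: int


class Packet(NamedTuple):
    """Hypothetical incoming packet; fwmark is carried but never looked up."""
    proto: str
    saddr: str
    sport: int
    daddr: str
    dport: int
    fwmark: int = 0


class ConnTable:
    """Toy connection table, loosely modelled on the behaviour in the report."""

    def __init__(self) -> None:
        self._conns: Dict[ConnKey, str] = {}

    def add(self, key: ConnKey, dest: str) -> None:
        self._conns[key] = dest

    def conn_in_get(self, pkt: Packet) -> Optional[str]:
        # The key is built from addresses and ports only. The packet's
        # fwmark is ignored, which is the assumption under discussion.
        key = ConnKey(pkt.proto, pkt.saddr, pkt.sport, pkt.daddr, pkt.dport)
        return self._conns.get(key)


# Entry as it might look on the backup after synchronization.
backup = ConnTable()
backup.add(ConnKey("TCP", "10.0.0.3", 54590, "10.0.0.10", 9999),
           "10.0.0.6:9999")

# Packet forwarded by the master arrives unmarked (fwmark 0).
forwarded = Packet("TCP", "10.0.0.3", 54590, "10.0.0.10", 9999, fwmark=0)
marked = forwarded._replace(fwmark=10)

hit_unmarked = backup.conn_in_get(forwarded)
hit_marked = backup.conn_in_get(marked)
print(hit_unmarked, hit_marked)
```

In this model the marked and unmarked packets hit the same entry, so
making the lookup fwmark-aware would mean carrying the mark in every
entry (and presumably in the sync messages), which is the refactoring
I would rather avoid.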

> 2. Is this the right way of setting up a two-node LVS setup with
>     localnodes and connection synchronization on a modern kernel?
>     (Assuming the conn sync would not break)

I think that you could get around this problem by only activating
the LVS rules on the master-node. Or is that already the case?

> thanks!
> S.
> 
> ***
> 
> [1]: Example of fail
> LVS A is master, balances to realserver B.
> The output below is from LVS B / realserver B kern.log after:
> * adding LOG entries to iptables -t filter, chain INPUT and OUTPUT
> * setting net.ipv4.vs.debug_level to 13 (max)
> * stripping away some crud, cleaning timestamps, etc
> * adding <notes> on progress
> 
> Interesting lines: 11, 21, 28
> 
>   1 <Connection from client to VIP>
>   2 [52.351] filter-INPUT : IN=eth0 OUT= MAC=lvsB_mac:lvsA_mac:08:00 
> SRC=10.0.0.3 DST=10.0.0.10 SPT=54590 DPT=9999 SYN
>   3 [52.351] IPVS: lookup/in TCP 10.0.0.3:54590->10.0.0.10:9999 not hit
>   4 [52.351] IPVS: lookup/out TCP 10.0.0.3:54590->10.0.0.10:9999 not hit
>   5 [52.351] IPVS: lookup service: fwm 0 TCP 10.0.0.10:9999 not hit
>   6 [52.351] filter-OUTPUT: IN= OUT=eth0 SRC=10.0.0.10 DST=10.0.0.3 
> SPT=9999 DPT=54590 ACK SYN
>   7 [52.457] filter-INPUT : IN=eth0 OUT= MAC=lvsB_mac:lvsA_mac:08:00 
> SRC=10.0.0.3 DST=10.0.0.10 SPT=54590 DPT=9999 ACK
>   8 [52.457] IPVS: lookup/in TCP 10.0.0.3:54590->10.0.0.10:9999 not hit
>   9 [52.457] IPVS: lookup/out TCP 10.0.0.3:54590->10.0.0.10:9999 not hit
> 10 <TCP handshake complete>
> 11 <IPVS state is synchronized from MASTER to BACKUP>
> 12 [52.869] IPVS: packet type=2 proto=17 daddr=224.0.0.81 ignored
> 13 [52.869] IPVS: Enter: ip_vs_receive, net/netfilter/ipvs/ip_vs_sync.c 
> line 722
> 14 [52.869] IPVS: Leave: ip_vs_receive, net/netfilter/ipvs/ip_vs_sync.c 
> line 733
> 15 [52.869] IPVS: lookup/in TCP 10.0.0.3:54590->10.0.0.10:9999 not hit
> 16 [52.869] IPVS: lookup service: fwm 0 TCP 10.0.0.10:9999 not hit
> 17 [53.353] IPVS: packet type=5 proto=2 daddr=224.0.0.81 ignored
> 18 <One line of data sent from client to VIP>
> 19 [60.906] filter-INPUT : IN=eth0 OUT= MAC=lvsB_mac:lvsA_mac:08:00 
> SRC=10.0.0.3 DST=10.0.0.10 SPT=54590 DPT=9999 ACK PSH
> 20 <Packet matches synchronized state>
> 21 [60.906] IPVS: lookup/in TCP 10.0.0.3:54590->10.0.0.10:9999 hit
> 22 [60.906] IPVS: Enter: ip_vs_dr_xmit, net/netfilter/ipvs/ip_vs_xmit.c 
> line 756
> 23 <IPVS forwards the packet to the local interface>
> 24 [60.906] filter-OUTPUT: IN= OUT=lo SRC=10.0.0.3 DST=10.0.0.10 
> SPT=54590 DPT=9999 ACK PSH
> 25 [60.906] IPVS: Leave: ip_vs_dr_xmit, net/netfilter/ipvs/ip_vs_xmit.c 
> line 789
> 26 [61.011] filter-INPUT : IN=lo OUT= 
> MAC=00:00:00:00:00:00:00:00:00:00:00:00:08:00 SRC=10.0.0.3 DST=10.0.0.10 
> SPT=54590 DPT=9999 ACK PSH
> 27 <Packet matches synchronized state again ...>
> 28 [61.019] IPVS: lookup/in TCP 10.0.0.3:54590->10.0.0.10:9999 hit
> 29 [61.019] IPVS: Enter: ip_vs_dr_xmit, net/netfilter/ipvs/ip_vs_xmit.c 
> line 756

I think that this is critical to the problem: ip_vs_dr_xmit() is being
called, which causes the loop. I suspect that ip_vs_null_xmit() should
be called instead, in which case the loop wouldn't occur.
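
A rough sketch of why the choice of transmitter matters, again as a
toy Python model rather than kernel code. The function names dr_xmit
and null_xmit are borrowed from ip_vs_dr_xmit() and ip_vs_null_xmit(),
but the bodies are stand-ins, and the pass limit is an assumption
added only so that the example terminates. Nothing in the trace above
suggests the real path has such a limit.

```python
from typing import Callable, List

MAX_PASSES = 5  # assumption: artificial cap so the toy loop ends


def dr_xmit(trace: List[str]) -> bool:
    """Stand-in for direct-routing transmit to a destination that is local.

    Returns True to signal that the packet re-enters the input path,
    as seen in the log (OUT=lo followed by IN=lo).
    """
    trace.append("dr_xmit: packet sent out via lo")
    return True


def null_xmit(trace: List[str]) -> bool:
    """Stand-in for the no-op transmitter used for a local destination.

    Returns False: the packet is left alone for the local stack.
    """
    trace.append("null_xmit: packet left for local stack")
    return False


def process(xmit: Callable[[List[str]], bool],
            max_passes: int = MAX_PASSES) -> List[str]:
    """Run one packet through the toy input path and return a trace."""
    trace: List[str] = []
    for _ in range(max_passes):
        # The synchronized connection matches on every pass.
        trace.append("lookup/in hit")
        if not xmit(trace):
            trace.append("delivered locally")
            return trace
    trace.append("gave up after %d passes (loop)" % max_passes)
    return trace


for line in process(dr_xmit):
    print("dr:  ", line)
for line in process(null_xmit):
    print("null:", line)
```

With the direct-routing stand-in, each pass produces another lookup
hit, matching the pattern at lines 21, 28 and 34 of your log. With the
no-op stand-in, the packet is handed to the local stack after the
first hit.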

Could you post the output of "ipvsadm -Ln" ?

I'm also wondering if this relates to a recent report of Local forwarding
not working since 2.6.28.

http://marc.info/?l=linux-virtual-server&m=126943987132679&w=2

> 30 <IPVS repeats the forwarding in a loop, machine stops responding>
> 31 [61.030] filter-OUTPUT: IN= OUT=lo SRC=10.0.0.3 DST=10.0.0.10 
> SPT=54590 DPT=9999 ACK PSH
> 32 [61.041] IPVS: Leave: ip_vs_dr_xmit, net/netfilter/ipvs/ip_vs_xmit.c 
> line 789
> 33 [61.074] filter-INPUT : IN=lo OUT= 
> MAC=00:00:00:00:00:00:00:00:00:00:00:00:08:00 SRC=10.0.0.3 DST=10.0.0.10 
> SPT=54590 DPT=9999 ACK PSH
> 34 [61.083] IPVS: lookup/in TCP 10.0.0.3:54590->10.0.0.10:9999 hit
> 35 [61.084] IPVS: Enter: ip_vs_dr_xmit, net/netfilter/ipvs/ip_vs_xmit.c 
> line 756
> 36 <etc, etc>
> 
> Note that the incoming packet is not fwmarked, and that the ipvs
> lookup/in check does not try to match on fwmark.

[snip]
