

To: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: [lvs-users] KeepAlived + LVS NAT + UDP DNS + Multiple Ext VIPS = All of a sudden, responses go out on wrong VIP
Cc: keepalived-devel@xxxxxxxxxxxxxxxxxxxxx
From: Tom <tom@xxxxxxxx>
Date: Wed, 11 Jan 2012 08:06:43 +0000
I may have identified a difference between my load balancers, and I now 
think the problem is only happening on one of them.  I failed over 
yesterday: the same large provider that complained last week was still 
seeing replies come back from the wrong external VIP, but it has been 
working fine from the SLAVE since then.  Perhaps you could comment on 
whether these settings are pertinent in this case.

Basically, the dodgy load balancer has the following set:

net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0

The working LB has both of these set to 1.

I don't think it's usually a bad idea to disable ICMP redirects, but 
it's literally the only difference I've managed to find between the two 
load balancers.
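
For anyone comparing a pair of load balancers in the same way, here is a minimal sketch of diffing two `sysctl -a` style dumps. It only assumes the usual "key = value" output format; the function names are made up for illustration, and the `ip_forward` line is an illustrative filler value, not something taken from either box.

```python
def parse_sysctl(text):
    """Parse `sysctl -a` style output ("key = value" per line) into a dict."""
    settings = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip blank lines and anything that is not a setting
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def diff_sysctl(a, b):
    """Return {key: (value_a, value_b)} for every key whose value differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# send_redirects values are the ones quoted above; ip_forward is an
# assumed, illustrative line that is identical on both hosts.
dodgy = parse_sysctl("""
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.ip_forward = 1
""")
working = parse_sysctl("""
net.ipv4.conf.all.send_redirects = 1
net.ipv4.conf.default.send_redirects = 1
net.ipv4.ip_forward = 1
""")

for key, (bad, good) in diff_sysctl(dodgy, working).items():
    print(f"{key}: dodgy={bad} working={good}")
```

In practice the two strings would come from saving the output of `sysctl -a` on each host; keys present on only one side show up with `None` for the other.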

Does anyone have any ideas?

My investigations continue.
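
To spot the wrong-VIP symptom described in the quoted message below without reading a capture by eye, something along these lines might help. It is only a sketch: it assumes the capture has already been reduced to (source IP, source port, destination IP, destination port) tuples for UDP packets, the function name is hypothetical, and the client address 192.0.2.10 is a documentation address standing in for a real resolver.

```python
# The three external VIPs listed in the quoted message.
VIPS = {"88.88.192.250", "88.88.193.250", "88.88.192.254"}


def find_wrong_vip_replies(packets):
    """Flag DNS replies sent from a different VIP than the client queried.

    packets: iterable of (src_ip, src_port, dst_ip, dst_port) tuples for
    UDP packets, in capture order.
    Returns a list of ((client_ip, client_port), queried_vip, replying_vip).
    """
    queried = {}  # (client_ip, client_port) -> VIP the client last queried
    mismatches = []
    for src, sport, dst, dport in packets:
        if dst in VIPS and dport == 53:
            # Inbound query: remember which VIP this client socket used.
            queried[(src, sport)] = dst
        elif src in VIPS and sport == 53:
            # Outbound reply: compare against the VIP that was queried.
            expected = queried.get((dst, dport))
            if expected is not None and expected != src:
                mismatches.append(((dst, dport), expected, src))
    return mismatches


# Hypothetical sample: the second exchange is answered from the wrong VIP.
sample = [
    ("192.0.2.10", 40000, "88.88.192.250", 53),
    ("88.88.192.250", 53, "192.0.2.10", 40000),
    ("192.0.2.10", 40001, "88.88.193.250", 53),
    ("88.88.192.254", 53, "192.0.2.10", 40001),
]

for client, asked, answered in find_wrong_vip_replies(sample):
    print(f"{client[0]}:{client[1]} asked {asked} but was answered by {answered}")
```

Replies with no matching query in the capture are ignored rather than flagged, so a capture that starts mid-conversation will not produce false alarms.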

Regards.  Tom.

On 09/01/12 14:53, Tom wrote:
> Hi Guys,
>
> I'm sending this to both LVS and Keepalived mailing lists, as both 
> technologies are involved here, and I'm not sure where the failure 
> might be.  It's conceivable that it should go to the netfilter list 
> too, but perhaps you can advise me on that.
>
> The basic gist of the problem is that my DNS cluster has multiple 
> external VIPs, made highly available with keepalived.  DNS requests 
> are load balanced across 6 back-end real servers.  In most cases, DNS 
> requests come in on one of three external VIPs (88.88.192.250, 
> 88.88.193.250 and 88.88.192.254), and the responses go back out from 
> those IPs, although I'm not sure whether LVS NAT or iptables NAT is 
> taking care of that side of things - it's just magic.
>
> A few times now, big external providers have come to us saying that 
> no requests to our nameservers were working.  Looking into it, it 
> appears we are replying from the wrong VIPs (i.e. not the one that 
> the requests came in on).  If you have a look at the attached 
> tcpdump.txt, you'll see one provider (Virgin Media) constantly 
> getting responses from the wrong IPs.  This doesn't happen forever, 
> however; it just seems to go really badly wrong for an extended 
> period!  Unfortunately, the tcpdump doesn't include the traffic to 
> the real servers.  I'm just waiting for this to happen again so I 
> can get a dump from the LB to the real servers in parallel to 
> compare (although I'm praying that it doesn't happen again at the 
> same time..)
>
> Could anyone advise me why the NAT/connection tracking has been 
> failing in these cases?
>
> Please see my keepalived.conf attached (although some information has 
> been replaced)
>
> *Here are my IP addresses:*
>
> [root@dns-lb-02 ~]# ip a s
> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>     inet 127.0.0.1/8 scope host lo
>     inet6 ::1/128 scope host
>        valid_lft forever preferred_lft forever
> 2: eth0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 qlen 1000
>     link/ether 00:22:19:57:97:ca brd ff:ff:ff:ff:ff:ff
> 3: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 qlen 1000
>     link/ether 00:22:19:57:97:ca brd ff:ff:ff:ff:ff:ff
> 4: sit0: <NOARP> mtu 1480 qdisc noop
>     link/sit 0.0.0.0 brd 0.0.0.0
> 5: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue
>     link/ether 00:22:19:57:97:ca brd ff:ff:ff:ff:ff:ff
>     inet6 fe80::222:19ff:fe57:97ca/64 scope link
>        valid_lft forever preferred_lft forever
> 8: bond0.192@bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue
>     link/ether 00:22:19:57:97:ca brd ff:ff:ff:ff:ff:ff
>     inet 88.88.192.36/23 brd 88.88.193.255 scope global bond0.192
>     inet 88.88.192.250/23 scope global secondary bond0.192
>     inet 88.88.193.250/23 scope global secondary bond0.192
>     inet 88.88.192.254/23 scope global secondary bond0.192
>     inet 88.88.192.37/23 scope global secondary bond0.192
>     inet6 fe80::222:19ff:fe57:97ca/64 scope link
>        valid_lft forever preferred_lft forever
> 9: bond0.81@bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue
>     link/ether 00:22:19:57:97:ca brd ff:ff:ff:ff:ff:ff
>     inet 10.44.81.110/24 brd 10.44.81.255 scope global bond0.81
>     inet 10.44.81.108/24 scope global secondary bond0.81
>     inet6 fe80::222:19ff:fe57:97ca/64 scope link
>        valid_lft forever preferred_lft forever
>
> *routing table:*
>
> [root@dns-lb-02 ~]# ip ro s
> 10.44.81.0/24 dev bond0.81  proto kernel  scope link  src 10.44.81.110
> 88.88.192.0/23 dev bond0.192  proto kernel  scope link  src 88.88.192.36
> 169.254.0.0/16 dev bond0.81  scope link
> 10.216.0.0/16 via 10.44.81.1 dev bond0.81
> 10.44.0.0/16 via 10.44.81.1 dev bond0.81
> default via 88.88.192.1 dev bond0.192
>
>
> *This is my iptables nat table:*
>
> [root@dns-lb-02 ~]# iptables -t nat -nL -v
> Chain PREROUTING (policy ACCEPT 1032M packets, 82G bytes)
>  pkts bytes target     prot opt in     out     source               destination
>
> Chain POSTROUTING (policy ACCEPT 23M packets, 1372M bytes)
>  pkts bytes target     prot opt in     out     source               destination
>  7378  592K MASQUERADE  all  --  *      bond0.192  0.0.0.0/0            0.0.0.0/0
>
> Chain OUTPUT (policy ACCEPT 23M packets, 1372M bytes)
>  pkts bytes target     prot opt in     out     source               destination
>
>
> *here is my IPVS table:*
>
> [root@dns-lb-02 ~]# ipvsadm -Ln
> IP Virtual Server version 1.2.1 (size=4096)
> Prot LocalAddress:Port Scheduler Flags
>   -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
> UDP  88.88.192.37:53 lc persistent 600
>   -> 10.44.81.151:53              Masq    10     0          0
>   -> 10.44.81.153:53              Masq    10     0          0
>   -> 10.44.81.152:53              Masq    10     0          0
>   -> 10.44.81.154:53              Masq    10     0          0
>   -> 10.44.81.155:53              Masq    10     0          0
>   -> 10.44.81.150:53              Masq    10     0          0
> TCP  88.88.192.37:53 lc persistent 600
>   -> 10.44.81.155:53              Masq    10     0          0
>   -> 10.44.81.151:53              Masq    10     0          0
>   -> 10.44.81.153:53              Masq    10     0          0
>   -> 10.44.81.152:53              Masq    10     0          0
>   -> 10.44.81.154:53              Masq    10     0          0
>   -> 10.44.81.150:53              Masq    10     0          0
> TCP  88.88.193.250:53 lc persistent 600
>   -> 10.44.81.155:53              Masq    10     0          10
>   -> 10.44.81.151:53              Masq    10     0          6
>   -> 10.44.81.153:53              Masq    10     0          5
>   -> 10.44.81.152:53              Masq    10     0          6
>   -> 10.44.81.154:53              Masq    10     0          4
>   -> 10.44.81.150:53              Masq    10     0          5
> TCP  88.88.192.250:53 lc persistent 600
>   -> 10.44.81.155:53              Masq    10     0          5
>   -> 10.44.81.151:53              Masq    10     0          7
>   -> 10.44.81.152:53              Masq    10     0          6
>   -> 10.44.81.153:53              Masq    10     0          6
>   -> 10.44.81.154:53              Masq    10     0          6
>   -> 10.44.81.150:53              Masq    10     0          5
> TCP  88.88.192.254:53 lc persistent 600
>   -> 10.44.81.155:53              Masq    10     0          7
>   -> 10.44.81.151:53              Masq    10     1          5
>   -> 10.44.81.153:53              Masq    10     0          6
>   -> 10.44.81.152:53              Masq    10     0          5
>   -> 10.44.81.154:53              Masq    10     0          5
>   -> 10.44.81.150:53              Masq    10     0          5
> UDP  88.88.192.254:53 lc persistent 600
>   -> 10.44.81.151:53              Masq    10     0          23976
>   -> 10.44.81.155:53              Masq    10     0          23961
>   -> 10.44.81.150:53              Masq    10     0          23966
>   -> 10.44.81.153:53              Masq    10     0          23969
>   -> 10.44.81.154:53              Masq    10     0          23969
>   -> 10.44.81.152:53              Masq    10     0          23985
> UDP  88.88.193.250:53 lc persistent 600
>   -> 10.44.81.151:53              Masq    10     0          49915
>   -> 10.44.81.155:53              Masq    10     0          49916
>   -> 10.44.81.153:53              Masq    10     0          50559
>   -> 10.44.81.154:53              Masq    10     0          49982
>   -> 10.44.81.152:53              Masq    10     0          50210
>   -> 10.44.81.150:53              Masq    10     0          49945
> UDP  88.88.192.250:53 lc persistent 600
>   -> 10.44.81.151:53              Masq    10     0          48668
>   -> 10.44.81.152:53              Masq    10     0          48668
>   -> 10.44.81.154:53              Masq    10     0          48686
>   -> 10.44.81.155:53              Masq    10     0          48650
>   -> 10.44.81.153:53              Masq    10     0          49025
>   -> 10.44.81.150:53              Masq    10     0          48655
>
> *This is the ip/route information from a real server:*
>
> [root@dns-be-01 ~]# ip a s
> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>     inet 127.0.0.1/8 scope host lo
>     inet6 ::1/128 scope host
>        valid_lft forever preferred_lft forever
> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
>     link/ether 00:22:19:57:97:cf brd ff:ff:ff:ff:ff:ff
>     inet 10.44.81.150/24 brd 10.44.81.255 scope global eth0
>     inet6 fe80::222:19ff:fe57:97cf/64 scope link
>        valid_lft forever preferred_lft forever
> 3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
>     link/ether 00:22:19:57:97:d1 brd ff:ff:ff:ff:ff:ff
> 4: sit0: <NOARP> mtu 1480 qdisc noop
>     link/sit 0.0.0.0 brd 0.0.0.0
> [root@dns-be-01 ~]# ip ro s
> 10.44.81.0/24 dev eth0  proto kernel  scope link  src 10.44.81.150
> 10.216.0.0/16 via 10.44.81.1 dev eth0
> 10.44.0.0/16 via 10.44.81.1 dev eth0
> default via 10.44.81.108 dev eth0
>
>
> _______________________________________________
> Please read the documentation before posting - it's available at:
> http://www.linuxvirtualserver.org/
>
> LinuxVirtualServer.org mailing list - lvs-users@xxxxxxxxxxxxxxxxxxxxxx
> Send requests to lvs-users-request@xxxxxxxxxxxxxxxxxxxxxx
> or go to http://lists.graemef.net/mailman/listinfo/lvs-users
