Hi Guys,
I'm sending this to both the LVS and Keepalived mailing lists, as both
technologies are involved here, and I'm not sure where the failure might
be. It's conceivable that it should go to the netfilter list too, but
perhaps you can advise me on that.
The gist of the problem is this: my DNS cluster has multiple external
VIPs, made highly available with keepalived. DNS requests are load
balanced across 6 back-end real servers. In most cases, DNS requests
come in on one of three external VIPs (88.88.192.250, 88.88.193.250 and
88.88.192.254), and the responses go back out from those IPs, although
I'm not sure whether the LVS NAT or the iptables NAT is taking care of
that side of things - it's just magic.
A few times now, we have had big external providers come to us saying
that no requests to our nameservers were working. Looking into it, it
looks like we're replying from the wrong VIPs (i.e., not the one that
the requests came in on). If you have a look at the attached
tcpdump.txt, you'll see one provider (Virgin Media) constantly getting
responses from the wrong IPs. This doesn't go on forever, however; it
just seems to go really badly wrong for an extended period! The tcpdump
doesn't include the traffic to the real servers, unfortunately. I'm
just waiting for this to happen again so I can get a dump from the LB to
the real servers in parallel to compare (although I'm praying that it
doesn't happen again at the same time...)
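To make the comparison less manual next time, something like the following could flag the bad replies. It is only a rough sketch: the `Packet` record, the client address 192.0.2.10 and the sample packets are made up for illustration and are not taken from tcpdump.txt, and a real capture would first need to be parsed into this shape.

```python
from collections import namedtuple

# Hypothetical, simplified packet record (not a tcpdump format).
Packet = namedtuple("Packet", "src sport dst dport")

# The three external VIPs named in this post.
VIPS = {"88.88.192.250", "88.88.193.250", "88.88.192.254"}


def find_wrong_vip_replies(packets):
    """Return (reply, expected_vip) pairs where a DNS reply left from a
    different address than the VIP the client's query was sent to."""
    pending = {}      # (client_ip, client_port) -> VIP the query went to
    mismatches = []
    for p in packets:
        if p.dport == 53 and p.dst in VIPS:
            # Inbound query: remember which VIP the client used.
            pending[(p.src, p.sport)] = p.dst
        elif p.sport == 53:
            # Outbound reply: compare its source with the remembered VIP.
            expected = pending.get((p.dst, p.dport))
            if expected is not None and p.src != expected:
                mismatches.append((p, expected))
    return mismatches


# Illustrative packets only; the client address is a documentation address.
sample = [
    Packet("192.0.2.10", 40000, "88.88.192.250", 53),  # query to .192.250
    Packet("88.88.192.250", 53, "192.0.2.10", 40000),  # reply from same VIP
    Packet("192.0.2.10", 40001, "88.88.193.250", 53),  # query to .193.250
    Packet("88.88.192.254", 53, "192.0.2.10", 40001),  # reply from another VIP
]

for reply, expected in find_wrong_vip_replies(sample):
    print(f"{reply.dst}:{reply.dport} expected {expected}, got {reply.src}")
# prints: 192.0.2.10:40001 expected 88.88.193.250, got 88.88.192.254
```

Matching on client IP and port alone is an assumption that is good enough for a short capture; a stricter check would also compare the DNS query ID.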
Could anyone advise me why the NAT/connection tracking has been failing
in these cases?
Please see my attached keepalived.conf (although some information has
been replaced).
*Here are my IP addresses:*
[root@dns-lb-02 ~]# ip a s
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 qlen 1000
link/ether 00:22:19:57:97:ca brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 qlen 1000
link/ether 00:22:19:57:97:ca brd ff:ff:ff:ff:ff:ff
4: sit0: <NOARP> mtu 1480 qdisc noop
link/sit 0.0.0.0 brd 0.0.0.0
5: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue
link/ether 00:22:19:57:97:ca brd ff:ff:ff:ff:ff:ff
inet6 fe80::222:19ff:fe57:97ca/64 scope link
valid_lft forever preferred_lft forever
8: bond0.192@bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue
link/ether 00:22:19:57:97:ca brd ff:ff:ff:ff:ff:ff
inet 88.88.192.36/23 brd 88.88.193.255 scope global bond0.192
inet 88.88.192.250/23 scope global secondary bond0.192
inet 88.88.193.250/23 scope global secondary bond0.192
inet 88.88.192.254/23 scope global secondary bond0.192
inet 88.88.192.37/23 scope global secondary bond0.192
inet6 fe80::222:19ff:fe57:97ca/64 scope link
valid_lft forever preferred_lft forever
9: bond0.81@bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue
link/ether 00:22:19:57:97:ca brd ff:ff:ff:ff:ff:ff
inet 10.44.81.110/24 brd 10.44.81.255 scope global bond0.81
inet 10.44.81.108/24 scope global secondary bond0.81
inet6 fe80::222:19ff:fe57:97ca/64 scope link
valid_lft forever preferred_lft forever
*routing table:*
[root@dns-lb-02 ~]# ip ro s
10.44.81.0/24 dev bond0.81 proto kernel scope link src 10.44.81.110
88.88.192.0/23 dev bond0.192 proto kernel scope link src 88.88.192.36
169.254.0.0/16 dev bond0.81 scope link
10.216.0.0/16 via 10.44.81.1 dev bond0.81
10.44.0.0/16 via 10.44.81.1 dev bond0.81
default via 88.88.192.1 dev bond0.192
*This is my iptables nat table:*
[root@dns-lb-02 ~]# iptables -t nat -nL -v
Chain PREROUTING (policy ACCEPT 1032M packets, 82G bytes)
 pkts bytes target     prot opt in     out        source       destination

Chain POSTROUTING (policy ACCEPT 23M packets, 1372M bytes)
 pkts bytes target     prot opt in     out        source       destination
 7378  592K MASQUERADE all  --  *      bond0.192  0.0.0.0/0    0.0.0.0/0

Chain OUTPUT (policy ACCEPT 23M packets, 1372M bytes)
 pkts bytes target     prot opt in     out        source       destination
*here is my IPVS table:*
[root@dns-lb-02 ~]# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
UDP 88.88.192.37:53 lc persistent 600
-> 10.44.81.151:53 Masq 10 0 0
-> 10.44.81.153:53 Masq 10 0 0
-> 10.44.81.152:53 Masq 10 0 0
-> 10.44.81.154:53 Masq 10 0 0
-> 10.44.81.155:53 Masq 10 0 0
-> 10.44.81.150:53 Masq 10 0 0
TCP 88.88.192.37:53 lc persistent 600
-> 10.44.81.155:53 Masq 10 0 0
-> 10.44.81.151:53 Masq 10 0 0
-> 10.44.81.153:53 Masq 10 0 0
-> 10.44.81.152:53 Masq 10 0 0
-> 10.44.81.154:53 Masq 10 0 0
-> 10.44.81.150:53 Masq 10 0 0
TCP 88.88.193.250:53 lc persistent 600
-> 10.44.81.155:53 Masq 10 0 10
-> 10.44.81.151:53 Masq 10 0 6
-> 10.44.81.153:53 Masq 10 0 5
-> 10.44.81.152:53 Masq 10 0 6
-> 10.44.81.154:53 Masq 10 0 4
-> 10.44.81.150:53 Masq 10 0 5
TCP 88.88.192.250:53 lc persistent 600
-> 10.44.81.155:53 Masq 10 0 5
-> 10.44.81.151:53 Masq 10 0 7
-> 10.44.81.152:53 Masq 10 0 6
-> 10.44.81.153:53 Masq 10 0 6
-> 10.44.81.154:53 Masq 10 0 6
-> 10.44.81.150:53 Masq 10 0 5
TCP 88.88.192.254:53 lc persistent 600
-> 10.44.81.155:53 Masq 10 0 7
-> 10.44.81.151:53 Masq 10 1 5
-> 10.44.81.153:53 Masq 10 0 6
-> 10.44.81.152:53 Masq 10 0 5
-> 10.44.81.154:53 Masq 10 0 5
-> 10.44.81.150:53 Masq 10 0 5
UDP 88.88.192.254:53 lc persistent 600
-> 10.44.81.151:53 Masq 10 0 23976
-> 10.44.81.155:53 Masq 10 0 23961
-> 10.44.81.150:53 Masq 10 0 23966
-> 10.44.81.153:53 Masq 10 0 23969
-> 10.44.81.154:53 Masq 10 0 23969
-> 10.44.81.152:53 Masq 10 0 23985
UDP 88.88.193.250:53 lc persistent 600
-> 10.44.81.151:53 Masq 10 0 49915
-> 10.44.81.155:53 Masq 10 0 49916
-> 10.44.81.153:53 Masq 10 0 50559
-> 10.44.81.154:53 Masq 10 0 49982
-> 10.44.81.152:53 Masq 10 0 50210
-> 10.44.81.150:53 Masq 10 0 49945
UDP 88.88.192.250:53 lc persistent 600
-> 10.44.81.151:53 Masq 10 0 48668
-> 10.44.81.152:53 Masq 10 0 48668
-> 10.44.81.154:53 Masq 10 0 48686
-> 10.44.81.155:53 Masq 10 0 48650
-> 10.44.81.153:53 Masq 10 0 49025
-> 10.44.81.150:53 Masq 10 0 48655
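A rough sketch for totalling the InActConn column per virtual service from a listing like the one above, which may make it easier to compare the VIPs over time. The function name and the regular expressions are my own assumptions about the layout shown here; they are not anything provided by ipvsadm itself.

```python
import re

# Excerpt of the listing above (two virtual services, two real servers each).
SAMPLE = """\
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
UDP 88.88.192.254:53 lc persistent 600
-> 10.44.81.151:53 Masq 10 0 23976
-> 10.44.81.155:53 Masq 10 0 23961
UDP 88.88.193.250:53 lc persistent 600
-> 10.44.81.151:53 Masq 10 0 49915
-> 10.44.81.155:53 Masq 10 0 49916
"""

# "UDP 88.88.192.254:53 lc persistent 600" -> protocol, address:port
SERVICE = re.compile(r"^(TCP|UDP)\s+(\S+:\d+)\s")
# "-> 10.44.81.151:53 Masq 10 0 23976" -> real server, active, inactive
REAL = re.compile(r"^->\s+(\d\S*:\d+)\s+\S+\s+\d+\s+(\d+)\s+(\d+)\s*$")


def inactive_per_service(text):
    """Map (protocol, 'vip:port') to the summed InActConn of its real servers."""
    totals = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        m = SERVICE.match(line)
        if m:
            current = (m.group(1), m.group(2))
            totals[current] = 0
            continue
        m = REAL.match(line)
        if m and current is not None:
            totals[current] += int(m.group(3))
    return totals


for (proto, vip), total in inactive_per_service(SAMPLE).items():
    print(proto, vip, total)
# prints:
# UDP 88.88.192.254:53 47937
# UDP 88.88.193.250:53 99831
```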
*This is the ip/route information from a real server:*
[root@dns-be-01 ~]# ip a s
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:22:19:57:97:cf brd ff:ff:ff:ff:ff:ff
inet 10.44.81.150/24 brd 10.44.81.255 scope global eth0
inet6 fe80::222:19ff:fe57:97cf/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
link/ether 00:22:19:57:97:d1 brd ff:ff:ff:ff:ff:ff
4: sit0: <NOARP> mtu 1480 qdisc noop
link/sit 0.0.0.0 brd 0.0.0.0
[root@dns-be-01 ~]# ip ro s
10.44.81.0/24 dev eth0 proto kernel scope link src 10.44.81.150
10.216.0.0/16 via 10.44.81.1 dev eth0
10.44.0.0/16 via 10.44.81.1 dev eth0
default via 10.44.81.108 dev eth0
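With LVS NAT, replies from a real server have to route back through the load balancer so that the source address can be rewritten to the VIP. As a sanity check on the real server's routing table above, here is a minimal sketch of a longest-prefix-match lookup over those four routes. The `next_hop` helper and the client address 192.0.2.10 are assumptions for illustration; the kernel's real lookup also takes policy rules and metrics into account.

```python
import ipaddress

# Routes copied from the real server's `ip ro s` output: (prefix, next hop).
# None means the destination is directly connected.
ROUTES = [
    ("10.44.81.0/24", None),
    ("10.216.0.0/16", "10.44.81.1"),
    ("10.44.0.0/16", "10.44.81.1"),
    ("0.0.0.0/0", "10.44.81.108"),   # default route
]


def next_hop(dest, routes=ROUTES):
    """Return the gateway chosen by longest-prefix match for dest."""
    addr = ipaddress.ip_address(dest)
    best = None
    for prefix, gateway in routes:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, gateway)
    if best is None:
        raise LookupError(f"no route to {dest}")
    return best[1]


print(next_hop("192.0.2.10"))   # external client -> 10.44.81.108
print(next_hop("10.216.5.5"))   # internal client -> 10.44.81.1
```

By this table, replies to external clients leave via 10.44.81.108, which is a secondary address on the load balancer's bond0.81. Replies to anything in 10.216.0.0/16 or 10.44.0.0/16 go to 10.44.81.1 instead and so would not pass back through the load balancer.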
keepalived.conf
Description: Text document
tcpdump.txt
Description: Text document