
Re: [lvs-users] keepalived: LVS-DR split brain w/firewalls up

To: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: [lvs-users] keepalived: LVS-DR split brain w/firewalls up
From: Gerry Reno <greno@xxxxxxxxxxx>
Date: Sun, 29 Jul 2007 13:51:04 -0400
Joseph Mack NA3T wrote:
> On Sun, 29 Jul 2007, Gerry Reno wrote:
>
>   
>> Joseph Mack NA3T wrote:
>>     
>>> how have you stopped the two directors from talking to each
>>> other?
>>>
>>> Joe
>>>
>>>       
>> I was hoping someone could tell me.
>>     
>
> look up the docs for the failover package you're using.
>
> Joe
>   
Of course. I've been going through the keepalived docs and the mailing
list. The only thing I found was a reference to 224.0.0.0/8, so I added
the following to the firewalls:

iptables -A RH-Firewall-1-INPUT -s 224.0.0.0/8 -d 0/0 -j ACCEPT
iptables -A RH-Firewall-1-INPUT -d 224.0.0.0/8 -s 0/0 -j ACCEPT
iptables -A OUTPUT -s 224.0.0.0/8 -d 0/0 -j ACCEPT
iptables -A OUTPUT -d 224.0.0.0/8 -s 0/0 -j ACCEPT

but these did not help either.
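
One thing that may matter with rules like these: `iptables -A` appends to the end of a chain, so if the chain already ends in a catch-all REJECT the appended rules are never reached, whereas `-I <chain> 1` inserts at the top. The sketch below only prints candidate commands rather than running them. The chain name and eth0 come from this post, and protocol 112 from the "VRRP sockpool" log line; 224.0.0.18 is the standard VRRP multicast group and is assumed, not confirmed, to be what this keepalived setup uses. The function name is made up for illustration.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) rules that admit VRRP adverts
# ahead of any REJECT. print_vrrp_rules is a hypothetical helper name.
# Assumed: VRRP uses IP protocol 112 to multicast group 224.0.0.18.
print_vrrp_rules() {
    chain="$1"   # input chain that holds the REJECT, e.g. RH-Firewall-1-INPUT
    iface="$2"   # interface carrying the adverts, e.g. eth0
    # Position 1 puts the rule at the top of the chain.
    echo "iptables -I $chain 1 -i $iface -p 112 -d 224.0.0.18 -j ACCEPT"
    echo "iptables -I OUTPUT 1 -o $iface -p 112 -d 224.0.0.18 -j ACCEPT"
}

print_vrrp_rules RH-Firewall-1-INPUT eth0
```

The printed lines could be reviewed and then run as root on both directors. Matching on the protocol and group is narrower than opening all of 224.0.0.0/8.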

Here is the log from the MASTER director with FW = UP:

Jul 29 12:28:23 grp-01-00-50 Keepalived: Starting Keepalived v1.1.13 
(03/26,2007)
Jul 29 12:28:23 grp-01-00-50 Keepalived: Starting Healthcheck child 
process, pid=30086
Jul 29 12:28:23 grp-01-00-50 Keepalived: Starting VRRP child process, 
pid=30087
Jul 29 12:28:23 grp-01-00-50 Keepalived_vrrp: Using MII-BMSR NIC polling 
thread...
Jul 29 12:28:23 grp-01-00-50 Keepalived_vrrp: Netlink reflector reports 
IP 192.168.1.150 added
Jul 29 12:28:23 grp-01-00-50 Keepalived_vrrp: Registering Kernel netlink 
reflector
Jul 29 12:28:23 grp-01-00-50 Keepalived_vrrp: Registering Kernel netlink 
command channel
Jul 29 12:28:23 grp-01-00-50 Keepalived_vrrp: Registering gratutious ARP 
shared channel
Jul 29 12:28:23 grp-01-00-50 Keepalived_vrrp: Configuration is using : 
35690 Bytes
Jul 29 12:28:23 grp-01-00-50 Keepalived_healthcheckers: Using MII-BMSR 
NIC polling thread...
Jul 29 12:28:23 grp-01-00-50 Keepalived_healthcheckers: Netlink 
reflector reports IP 192.168.1.150 added
Jul 29 12:28:23 grp-01-00-50 Keepalived_healthcheckers: Registering 
Kernel netlink reflector
Jul 29 12:28:23 grp-01-00-50 Keepalived_healthcheckers: Registering 
Kernel netlink command channel
Jul 29 12:28:23 grp-01-00-50 Keepalived_healthcheckers: Configuration is 
using : 20835 Bytes
Jul 29 12:28:23 grp-01-00-50 Keepalived_healthcheckers: Activating 
healtchecker for service [192.168.1.200:22]
Jul 29 12:28:23 grp-01-00-50 Keepalived_healthcheckers: Activating 
healtchecker for service [192.168.1.201:22]
Jul 29 12:28:23 grp-01-00-50 Keepalived_healthcheckers: Activating 
healtchecker for service [192.168.1.200:80]
Jul 29 12:28:23 grp-01-00-50 Keepalived_healthcheckers: Activating 
healtchecker for service [192.168.1.201:80]
Jul 29 12:28:23 grp-01-00-50 Keepalived_healthcheckers: Activating 
healtchecker for service [192.168.1.200:443]
Jul 29 12:28:23 grp-01-00-50 Keepalived_healthcheckers: Activating 
healtchecker for service [192.168.1.201:443]
Jul 29 12:28:23 grp-01-00-50 Keepalived_vrrp: VRRP sockpool: 
[ifindex(2), proto(112), fd(8,9)]
Jul 29 12:28:23 grp-01-00-50 kernel: IPVS: sync thread started: state = 
MASTER, mcast_ifn = eth0, syncid = 25
Jul 29 12:28:25 grp-01-00-50 Keepalived_vrrp: VRRP_Instance(VI_1) 
Transition to MASTER STATE
Jul 29 12:28:27 grp-01-00-50 Keepalived_vrrp: VRRP_Instance(VI_1) 
Entering MASTER STATE
Jul 29 12:28:27 grp-01-00-50 Keepalived_vrrp: VRRP_Instance(VI_1) 
setting protocol VIPs.
Jul 29 12:28:27 grp-01-00-50 Keepalived_vrrp: VRRP_Instance(VI_1) 
Sending gratuitous ARPs on eth0 for 192.168.1.240
Jul 29 12:28:27 grp-01-00-50 Keepalived_vrrp: Netlink: skipping nl_cmd 
msg...
Jul 29 12:28:27 grp-01-00-50 Keepalived_healthcheckers: Netlink 
reflector reports IP 192.168.1.240 added
Jul 29 12:28:27 grp-01-00-50 root: /etc/keepalived/manage_ip_lvs_dr 
(caller: keepalived): rsh 192.168.1.200 /sbin/ip addr add 
192.168.1.240/32 dev lo brd + scope host
Jul 29 12:28:28 grp-01-00-50 root: /etc/keepalived/manage_ip_lvs_dr 
(caller: keepalived): RTNETLINK answers: File exists
Jul 29 12:28:28 grp-01-00-50 root: /etc/keepalived/manage_ip_lvs_dr 
(caller: keepalived): rsh 192.168.1.200 echo "1" > 
/proc/sys/net/ipv4/conf/eth0/arp_ignore
Jul 29 12:28:28 grp-01-00-50 root: /etc/keepalived/manage_ip_lvs_dr 
(caller: keepalived): rsh 192.168.1.200 echo "2" > 
/proc/sys/net/ipv4/conf/eth0/arp_announce
Jul 29 12:28:28 grp-01-00-50 root: /etc/keepalived/manage_ip_lvs_dr 
(caller: keepalived): rsh 192.168.1.200 "/sbin/route del default; 
/sbin/route add default gw 192.168.1.1"
Jul 29 12:28:32 grp-01-00-50 Keepalived_vrrp: VRRP_Instance(VI_1) 
Sending gratuitous ARPs on eth0 for 192.168.1.240
Jul 29 12:28:34 grp-01-00-50 root: /etc/keepalived/manage_ip_lvs_dr 
(caller: keepalived): bash: /sbin/route del default; /sbin/route add 
default gw 192.168.1.1: No such file or directory
Jul 29 12:28:34 grp-01-00-50 root: /etc/keepalived/manage_ip_lvs_dr 
(caller: keepalived): rsh 192.168.1.201 /sbin/ip addr add 
192.168.1.240/32 dev lo brd + scope host
Jul 29 12:28:34 grp-01-00-50 root: /etc/keepalived/manage_ip_lvs_dr 
(caller: keepalived): RTNETLINK answers: File exists
Jul 29 12:28:34 grp-01-00-50 root: /etc/keepalived/manage_ip_lvs_dr 
(caller: keepalived): rsh 192.168.1.201 echo "1" > 
/proc/sys/net/ipv4/conf/eth0/arp_ignore
Jul 29 12:28:34 grp-01-00-50 root: /etc/keepalived/manage_ip_lvs_dr 
(caller: keepalived): rsh 192.168.1.201 echo "2" > 
/proc/sys/net/ipv4/conf/eth0/arp_announce
Jul 29 12:28:35 grp-01-00-50 root: /etc/keepalived/manage_ip_lvs_dr 
(caller: keepalived): rsh 192.168.1.201 "/sbin/route del default; 
/sbin/route add default gw 192.168.1.1"
Jul 29 12:28:40 grp-01-00-50 root: /etc/keepalived/manage_ip_lvs_dr 
(caller: keepalived): bash: /sbin/route del default; /sbin/route add 
default gw 192.168.1.1: No such file or directory
Jul 29 12:28:40 grp-01-00-50 root: /etc/keepalived/manage_ip_lvs_dr 
(caller: keepalived): ip addr del 192.168.1.240/32 dev lo
Jul 29 12:28:40 grp-01-00-50 root: /etc/keepalived/manage_ip_lvs_dr 
(caller: keepalived): RTNETLINK answers: Cannot assign requested address
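
Separate from the split-brain question, two things in the manage_ip_lvs_dr lines above look like shell quoting problems. The script itself is not shown, so this is only a guess at the cause. If `rsh host echo "1" > /proc/...` is run as logged, the redirect is performed by the local shell on the director rather than on the real server. The "No such file or directory" error is what a shell prints when an entire command string arrives wrapped in an extra layer of quotes and is treated as a single program name. The sketch below reproduces both effects locally. `fake_rsh` and the two directory variables are illustration-only stand-ins, not part of rsh or keepalived.

```shell
#!/bin/sh
# Local stand-in for rsh: joins its arguments with spaces and hands them to
# a shell running in a different directory (the pretend "remote" host).
# fake_rsh, REMOTE_DIR and LOCAL_DIR are illustration-only names.
REMOTE_DIR=$(mktemp -d)
LOCAL_DIR=$(mktemp -d)
fake_rsh() { ( cd "$REMOTE_DIR" && sh -c "$*" ); }

cd "$LOCAL_DIR" || exit 1

# 1. Unquoted redirect: the calling shell opens the file, so it lands locally.
fake_rsh echo 1 > unquoted.txt

# 2. Quoted redirect: the far-side shell opens the file.
fake_rsh 'echo 1 > quoted.txt'

# 3. An extra layer of quotes makes the far side look for one program whose
#    name is the whole string, which fails with exit status 127.
status=0
fake_rsh '"/bin/echo a; /bin/echo b"' 2>/dev/null || status=$?

ls "$LOCAL_DIR"          # unquoted.txt
ls "$REMOTE_DIR"         # quoted.txt
echo "status=$status"    # status=127
```

If that is what is happening, quoting the whole remote command once, redirect included, would be the usual fix.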




Here is the log from the BACKUP director with FW = UP:

Jul 29 12:28:32 grp-01-00-51 Keepalived: Starting Keepalived v1.1.13 
(03/26,2007)
Jul 29 12:28:32 grp-01-00-51 Keepalived_healthcheckers: Using MII-BMSR 
NIC polling thread...
Jul 29 12:28:32 grp-01-00-51 Keepalived_healthcheckers: Netlink 
reflector reports IP 192.168.1.151 added
Jul 29 12:28:32 grp-01-00-51 Keepalived_healthcheckers: Registering 
Kernel netlink reflector
Jul 29 12:28:32 grp-01-00-51 Keepalived_healthcheckers: Registering 
Kernel netlink command channel
Jul 29 12:28:32 grp-01-00-51 Keepalived_healthcheckers: Configuration is 
using : 20835 Bytes
Jul 29 12:28:32 grp-01-00-51 Keepalived_healthcheckers: Activating 
healtchecker for service [192.168.1.200:22]
Jul 29 12:28:32 grp-01-00-51 Keepalived_healthcheckers: Activating 
healtchecker for service [192.168.1.201:22]
Jul 29 12:28:32 grp-01-00-51 Keepalived_healthcheckers: Activating 
healtchecker for service [192.168.1.200:80]
Jul 29 12:28:32 grp-01-00-51 Keepalived_healthcheckers: Activating 
healtchecker for service [192.168.1.201:80]
Jul 29 12:28:32 grp-01-00-51 Keepalived_healthcheckers: Activating 
healtchecker for service [192.168.1.200:443]
Jul 29 12:28:32 grp-01-00-51 Keepalived_healthcheckers: Activating 
healtchecker for service [192.168.1.201:443]
Jul 29 12:28:32 grp-01-00-51 Keepalived: Starting Healthcheck child 
process, pid=29721
Jul 29 12:28:32 grp-01-00-51 Keepalived_vrrp: Using MII-BMSR NIC polling 
thread...
Jul 29 12:28:32 grp-01-00-51 Keepalived_vrrp: Netlink reflector reports 
IP 192.168.1.151 added
Jul 29 12:28:32 grp-01-00-51 Keepalived_vrrp: Registering Kernel netlink 
reflector
Jul 29 12:28:32 grp-01-00-51 Keepalived_vrrp: Registering Kernel netlink 
command channel
Jul 29 12:28:32 grp-01-00-51 Keepalived_vrrp: Registering gratutious ARP 
shared channel
Jul 29 12:28:32 grp-01-00-51 Keepalived_vrrp: Configuration is using : 
35690 Bytes
Jul 29 12:28:32 grp-01-00-51 Keepalived: Starting VRRP child process, 
pid=29722
Jul 29 12:28:32 grp-01-00-51 Keepalived_vrrp: VRRP_Instance(VI_1) 
Entering BACKUP STATE
Jul 29 12:28:32 grp-01-00-51 Keepalived_vrrp: VRRP sockpool: 
[ifindex(2), proto(112), fd(8,9)]
Jul 29 12:28:33 grp-01-00-51 kernel: IPVS: sync thread started: state = 
BACKUP, mcast_ifn = eth0, syncid = 25
Jul 29 12:28:33 grp-01-00-51 root: /etc/keepalived/manage_ip_lvs_dr 
(caller: keepalived): ip addr add 192.168.1.240/32 dev lo brd + scope host
(Everything above here is the same whether the FW is up or down. With the
FW down this is the last entry and everything is OK; the entries below
appear only when the FW is up.)
Jul 29 12:28:39 grp-01-00-51 Keepalived_vrrp: VRRP_Instance(VI_1) 
Transition to MASTER STATE <----- BAD, after 6 secs we transition to MASTER
Jul 29 12:28:41 grp-01-00-51 Keepalived_vrrp: VRRP_Instance(VI_1) 
Entering MASTER STATE
Jul 29 12:28:41 grp-01-00-51 Keepalived_vrrp: VRRP_Instance(VI_1) 
setting protocol VIPs.
Jul 29 12:28:41 grp-01-00-51 Keepalived_vrrp: VRRP_Instance(VI_1) 
Sending gratuitous ARPs on eth0 for 192.168.1.240
Jul 29 12:28:41 grp-01-00-51 Keepalived_vrrp: Netlink: skipping nl_cmd 
msg...
Jul 29 12:28:41 grp-01-00-51 Keepalived_healthcheckers: Netlink 
reflector reports IP 192.168.1.240 added
Jul 29 12:28:41 grp-01-00-51 avahi-daemon[2086]: Registering new address 
record for 192.168.1.240 on eth0.IPv4.
Jul 29 12:28:41 grp-01-00-51 kernel: IPVS: stopping sync thread 29727 ...
Jul 29 12:28:41 grp-01-00-51 root: /etc/keepalived/manage_ip_lvs_dr 
(caller: keepalived): rsh 192.168.1.200 /sbin/ip addr add 
192.168.1.240/32 dev lo brd + scope host
Jul 29 12:28:42 grp-01-00-51 root: /etc/keepalived/manage_ip_lvs_dr 
(caller: keepalived): RTNETLINK answers: File exists
Jul 29 12:28:42 grp-01-00-51 root: /etc/keepalived/manage_ip_lvs_dr 
(caller: keepalived): rsh 192.168.1.200 echo "1" > 
/proc/sys/net/ipv4/conf/eth0/arp_ignore
Jul 29 12:28:42 grp-01-00-51 kernel: IPVS: sync thread stopped!
Jul 29 12:28:42 grp-01-00-51 kernel: IPVS: sync thread started: state = 
MASTER, mcast_ifn = eth0, syncid = 25
Jul 29 12:28:42 grp-01-00-51 root: /etc/keepalived/manage_ip_lvs_dr 
(caller: keepalived): rsh 192.168.1.200 echo "2" > 
/proc/sys/net/ipv4/conf/eth0/arp_announce
Jul 29 12:28:43 grp-01-00-51 root: /etc/keepalived/manage_ip_lvs_dr 
(caller: keepalived): rsh 192.168.1.200 "/sbin/route del default; 
/sbin/route add default gw 192.168.1.1"
Jul 29 12:28:46 grp-01-00-51 Keepalived_vrrp: VRRP_Instance(VI_1) 
Sending gratuitous ARPs on eth0 for 192.168.1.240
Jul 29 12:28:48 grp-01-00-51 root: /etc/keepalived/manage_ip_lvs_dr 
(caller: keepalived): bash: /sbin/route del default; /sbin/route add 
default gw 192.168.1.1: No such file or directory
Jul 29 12:28:48 grp-01-00-51 root: /etc/keepalived/manage_ip_lvs_dr 
(caller: keepalived): rsh 192.168.1.201 /sbin/ip addr add 
192.168.1.240/32 dev lo brd + scope host
Jul 29 12:28:48 grp-01-00-51 root: /etc/keepalived/manage_ip_lvs_dr 
(caller: keepalived): RTNETLINK answers: File exists
Jul 29 12:28:48 grp-01-00-51 root: /etc/keepalived/manage_ip_lvs_dr 
(caller: keepalived): rsh 192.168.1.201 echo "1" > 
/proc/sys/net/ipv4/conf/eth0/arp_ignore
Jul 29 12:28:49 grp-01-00-51 root: /etc/keepalived/manage_ip_lvs_dr 
(caller: keepalived): rsh 192.168.1.201 echo "2" > 
/proc/sys/net/ipv4/conf/eth0/arp_announce
Jul 29 12:28:49 grp-01-00-51 root: /etc/keepalived/manage_ip_lvs_dr 
(caller: keepalived): rsh 192.168.1.201 "/sbin/route del default; 
/sbin/route add default gw 192.168.1.1"
Jul 29 12:28:54 grp-01-00-51 root: /etc/keepalived/manage_ip_lvs_dr 
(caller: keepalived): bash: /sbin/route del default; /sbin/route add 
default gw 192.168.1.1: No such file or directory
Jul 29 12:28:54 grp-01-00-51 root: /etc/keepalived/manage_ip_lvs_dr 
(caller: keepalived): ip addr del 192.168.1.240/32 dev lo
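
The delay of about 6 seconds before the BACKUP promotes itself is in the range a VRRP master-down timer would give if no advertisements were reaching it. In VRRPv2 (RFC 3768) that timer is 3 * Advertisement_Interval + Skew_Time, where Skew_Time = (256 - Priority) / 256. The sketch below just does the arithmetic. The advert interval of 2 seconds and priority of 100 are assumed values for illustration, since the keepalived.conf is not shown in this post, and the function name is made up.

```shell
#!/bin/sh
# Sketch: VRRPv2 master-down interval in seconds.
# master_down_interval is a hypothetical helper name.
master_down_interval() {
    # $1 = advertisement interval in seconds, $2 = VRRP priority (1-254)
    awk -v a="$1" -v p="$2" 'BEGIN { printf "%.2f\n", 3 * a + (256 - p) / 256 }'
}

# Assumed values, not taken from the actual configuration.
master_down_interval 2 100    # prints 6.61
```

A backup that hears nothing from the master for that long declares it dead, which would fit a firewall dropping the adverts.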


ip addr show (MASTER DIRECTOR)
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast 
qlen 1000
link/ether 00:0c:29:a7:c7:33 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.150/24 brd 192.168.1.255 scope global eth0
inet 192.168.1.240/24 scope global secondary eth0 <----------- listening 
= good
inet6 fe80::20c:29ff:fea7:c733/64 scope link
valid_lft forever preferred_lft forever


ip addr show (BACKUP DIRECTOR)
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast 
qlen 1000
link/ether 00:0c:29:54:ef:09 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.151/24 brd 192.168.1.255 scope global eth0
inet 192.168.1.240/24 scope global secondary eth0 <----------- listening 
= bad, this is due to also being in MASTER state
inet6 fe80::20c:29ff:fe54:ef09/64 scope link
valid_lft forever preferred_lft forever


ip addr show (REAL SERVER 200)
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet 192.168.1.240/32 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast 
qlen 1000
link/ether 00:0c:29:49:e2:af brd ff:ff:ff:ff:ff:ff
inet 192.168.1.200/24 brd 192.168.1.255 scope global eth0
inet6 fe80::20c:29ff:fe49:e2af/64 scope link
valid_lft forever preferred_lft forever


ip addr show (REAL SERVER 201)
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet 192.168.1.240/32 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast 
qlen 1000
link/ether 00:0c:29:6e:c4:05 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.201/24 brd 192.168.1.255 scope global eth0
inet6 fe80::20c:29ff:fe6e:c405/64 scope link
valid_lft forever preferred_lft forever


iptables: MASTER and BACKUP DIRECTORS:
Table: filter
Chain INPUT (policy ACCEPT)
num  target               prot  opt  source       destination
1    RH-Firewall-1-INPUT  0     --   0.0.0.0/0    0.0.0.0/0

Chain FORWARD (policy ACCEPT)
num  target               prot  opt  source       destination
1    REJECT               0     --   0.0.0.0/0    0.0.0.0/0    reject-with icmp-host-prohibited

Chain OUTPUT (policy ACCEPT)
num  target               prot  opt  source       destination
1    ACCEPT               0     --   224.0.0.0/8  0.0.0.0/0
2    ACCEPT               0     --   0.0.0.0/0    224.0.0.0/8

Chain RH-Firewall-1-INPUT (1 references)
num  target               prot  opt  source       destination
1    ACCEPT               0     --   0.0.0.0/0    0.0.0.0/0
2    ACCEPT               icmp  --   0.0.0.0/0    0.0.0.0/0    icmp type 255
3    ACCEPT               esp   --   0.0.0.0/0    0.0.0.0/0
4    ACCEPT               ah    --   0.0.0.0/0    0.0.0.0/0
5    ACCEPT               udp   --   0.0.0.0/0    224.0.0.251  udp dpt:5353
6    ACCEPT               udp   --   0.0.0.0/0    0.0.0.0/0    udp dpt:631
7    ACCEPT               tcp   --   0.0.0.0/0    0.0.0.0/0    tcp dpt:631
8    ACCEPT               0     --   0.0.0.0/0    0.0.0.0/0    state RELATED,ESTABLISHED
9    ACCEPT               tcp   --   0.0.0.0/0    0.0.0.0/0    state NEW tcp dpt:22
10   ACCEPT               tcp   --   0.0.0.0/0    0.0.0.0/0    state NEW tcp dpt:443
11   ACCEPT               tcp   --   0.0.0.0/0    0.0.0.0/0    state NEW tcp dpt:80
12   ACCEPT               tcp   --   0.0.0.0/0    0.0.0.0/0    state NEW tcp dpts:1010:1023
13   ACCEPT               tcp   --   0.0.0.0/0    0.0.0.0/0    state NEW tcp dpt:904
14   REJECT               0     --   0.0.0.0/0    0.0.0.0/0    reject-with icmp-host-prohibited
15   ACCEPT               0     --   224.0.0.0/8  0.0.0.0/0
16   ACCEPT               0     --   0.0.0.0/0    224.0.0.0/8
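
Reading the RH-Firewall-1-INPUT listing above, the two 224.0.0.0/8 ACCEPT rules are numbered 15 and 16, after the REJECT at 14. Since iptables evaluates rules in order and REJECT ends the traversal, those two rules would apparently never be reached. Rule 1 looks like it accepts everything, but a listing without `-v` does not show interface matches, so it is probably restricted to an interface such as lo. The sketch below picks out rules that sit behind a catch-all REJECT in this kind of listing. The function name is hypothetical, and it assumes the six-column layout shown here.

```shell
#!/bin/sh
# Sketch: list rule numbers that sit after a catch-all REJECT in
# `iptables -L <chain> -n --line-numbers` style output read from stdin.
# rules_after_reject is a hypothetical helper name.
rules_after_reject() {
    awk '
        # Numbered rule lines only; print those seen after the catch-all.
        $1 ~ /^[0-9]+$/ && seen { print $1 }
        # Catch-all: REJECT for any protocol, any source, any destination.
        $1 ~ /^[0-9]+$/ && $2 == "REJECT" && ($3 == "0" || $3 == "all") &&
            $5 == "0.0.0.0/0" && $6 == "0.0.0.0/0" { seen = 1 }
    '
}

# Sample taken from the listing in this post; prints 15 and 16.
rules_after_reject <<'EOF'
13 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:904
14 REJECT 0 -- 0.0.0.0/0 0.0.0.0/0 reject-with icmp-host-prohibited
15 ACCEPT 0 -- 224.0.0.0/8 0.0.0.0/0
16 ACCEPT 0 -- 0.0.0.0/0 224.0.0.0/8
EOF
```

On a live director the same function could be fed from `iptables -L RH-Firewall-1-INPUT -n --line-numbers`, run as root.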


Again, when director firewalls are down everything works great; when 
they are up we get split brain.

Gerry



