Keepalived vrrp problem

To: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Subject: Keepalived vrrp problem
From: Sal Tepedino <stepedino@xxxxxxxxxxxxxx>
Date: Thu, 15 Feb 2007 17:43:37 -0500
Alright... this is driving me nuts and countless searches have turned up
tons of help, but no solution. 

Here's the situation: I have 2 directors set up as localnode (No actual
realservers). That all seems to work perfectly. 

The problem is that when keepalived starts up, both directors start in MASTER
(as expected), then the backup (lower priority) falls back to 'BACKUP'.
All is well. Then a few seconds later, the backup suddenly decides to
switch to master, sends out a GARP and starts sending VRRP advertisements.
The real master sees this, forces a new election and sends out a GARP.
The backup switches back to backup state and all is well for a few more
seconds. Then it all starts over again...

On the master:
Feb 15 17:21:26 pd-lvs01 Keepalived_vrrp: VRRP_Instance(VI_1) Received lower prio advert, forcing new election
Feb 15 17:21:26 pd-lvs01 Keepalived_vrrp: VRRP_Instance(VI_1) IPSEC-AH : Syncing seq_num - Increment seq
Feb 15 17:21:26 pd-lvs01 Keepalived_vrrp: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth1 for 10.1.1.110
Feb 15 17:21:27 pd-lvs01 Keepalived_vrrp: VRRP_Instance(VI_1) Received lower prio advert, forcing new election
Feb 15 17:21:27 pd-lvs01 Keepalived_vrrp: VRRP_Instance(VI_1) IPSEC-AH : Syncing seq_num - Increment seq
Feb 15 17:21:27 pd-lvs01 Keepalived_vrrp: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth1 for 10.1.1.110

. . . Repeat ad nauseam.

On the backup:
Feb 15 17:21:49 pd-lvs02 Keepalived_vrrp: VRRP_Instance(VI_1) Transition to MASTER STATE
Feb 15 17:21:49 pd-lvs02 Keepalived_vrrp: VRRP_Group(VG1) Syncing instances to MASTER state
Feb 15 17:21:50 pd-lvs02 Keepalived_vrrp: VRRP_Instance(VI_1) Entering MASTER STATE
Feb 15 17:21:50 pd-lvs02 Keepalived_vrrp: VRRP_Instance(VI_1) setting protocol VIPs.
Feb 15 17:21:50 pd-lvs02 Keepalived_vrrp: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth1 for 10.1.1.110
Feb 15 17:21:50 pd-lvs02 Keepalived_healthcheckers: Netlink reflector reports IP 10.1.1.110 added
Feb 15 17:21:50 pd-lvs02 Keepalived_healthcheckers: Activating healtchecker for service [10.1.1.111:22]
Feb 15 17:21:50 pd-lvs02 Keepalived_healthcheckers: Activating healtchecker for service [10.1.1.112:22]
Feb 15 17:21:50 pd-lvs02 Keepalived_vrrp: Remote SMTP server [10.0.0.18:25] connected.
Feb 15 17:21:50 pd-lvs02 Keepalived_vrrp: Netlink reflector reports IP 10.1.1.110 added
Feb 15 17:21:50 pd-lvs02 Keepalived_vrrp: SMTP alert successfully sent.
Feb 15 17:21:51 pd-lvs02 Keepalived_vrrp: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth1 for 10.1.1.110
Feb 15 17:21:51 pd-lvs02 Keepalived_vrrp: VRRP_Instance(VI_1) Received higher prio advert
Feb 15 17:21:51 pd-lvs02 Keepalived_vrrp: VRRP_Instance(VI_1) IPSEC-AH : Syncing seq_num - Decrement seq
Feb 15 17:21:51 pd-lvs02 Keepalived_vrrp: VRRP_Instance(VI_1) Entering BACKUP STATE
Feb 15 17:21:51 pd-lvs02 Keepalived_vrrp: VRRP_Instance(VI_1) removing protocol VIPs.
Feb 15 17:21:51 pd-lvs02 Keepalived_healthcheckers: Netlink reflector reports IP 10.1.1.110 removed
Feb 15 17:21:51 pd-lvs02 Keepalived_healthcheckers: Suspending healtchecker for service [10.1.1.111:22]
Feb 15 17:21:51 pd-lvs02 Keepalived_healthcheckers: Suspending healtchecker for service [10.1.1.112:22]
Feb 15 17:21:51 pd-lvs02 Keepalived_vrrp: VRRP_Group(VG1) Syncing instances to BACKUP state
Feb 15 17:21:51 pd-lvs02 Keepalived_vrrp: Remote SMTP server [10.0.0.18:25] connected.
Feb 15 17:21:51 pd-lvs02 Keepalived_vrrp: Netlink reflector reports IP 10.1.1.110 removed
Feb 15 17:21:51 pd-lvs02 Keepalived_vrrp: SMTP alert successfully sent.

. . . Shower rinse repeat.
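
To put a number on the flapping, the log excerpts above can be tallied with a few lines of Python. This is only a rough sketch: the regular expression assumes the syslog layout shown in this post, the sample text is a subset of the backup's log, and `count_transitions` is a made-up helper name, not anything keepalived ships.

```python
import re
from collections import Counter

# A few lines copied from the backup's syslog in this post (unwrapped).
LOG = """\
Feb 15 17:21:49 pd-lvs02 Keepalived_vrrp: VRRP_Instance(VI_1) Transition to MASTER STATE
Feb 15 17:21:50 pd-lvs02 Keepalived_vrrp: VRRP_Instance(VI_1) Entering MASTER STATE
Feb 15 17:21:51 pd-lvs02 Keepalived_vrrp: VRRP_Instance(VI_1) Received higher prio advert
Feb 15 17:21:51 pd-lvs02 Keepalived_vrrp: VRRP_Instance(VI_1) Entering BACKUP STATE
"""

# Matches only the "Entering <STATE> STATE" lines, i.e. completed transitions.
PATTERN = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+ \d\d:\d\d:\d\d) (?P<host>\S+) Keepalived_vrrp: "
    r"VRRP_Instance\((?P<inst>[^)]+)\) Entering (?P<state>MASTER|BACKUP|FAULT) STATE$"
)

def count_transitions(text):
    """Return a Counter keyed by (host, instance, state)."""
    counts = Counter()
    for line in text.splitlines():
        match = PATTERN.match(line.strip())
        if match:
            counts[(match["host"], match["inst"], match["state"])] += 1
    return counts

counts = count_transitions(LOG)
for key, n in sorted(counts.items()):
    print(*key, n)
```

Feeding it the full syslog instead of the sample should show one MASTER and one BACKUP entry per flap cycle on the backup director.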


Master's config:

global_defs {
   notification_email {
     admin@xxxxxxxxxxxxx
   }
   notification_email_from keepalived@pd-perim01
   smtp_server 10.0.0.18
   smtp_connect_timeout 30
   router_id pd-perim01
}

vrrp_sync_group VG1 {
         group {
                VI_1
         }
}

vrrp_instance VI_1 {
    state MASTER
    interface eth1
    lvs_sync_daemon_inteface eth1
    virtual_router_id 51
    priority 150
    advert_int 1
    garp_master_delay 1
    smtp_alert
    authentication {
        auth_type AH
        auth_pass likOkeam
    }
    virtual_ipaddress {
        10.1.1.110/24
    }
    notify_master "/usr/local/bin/keepalived-transition.sh MASTER"
    notify_backup "/usr/local/bin/keepalived-transition.sh BACKUP"
    notify_fault "/usr/local/bin/keepalived-transition.sh FAULT"
    notify_stop "/usr/local/bin/keepalived-transition.sh FAULT"
}

And the backup:
global_defs {
   notification_email {
     admin@xxxxxxxxxxxxx
   }
   notification_email_from keepalived@pd-perim02
   smtp_server 10.0.0.18
   smtp_connect_timeout 30
   router_id pd-perim02
}

vrrp_sync_group VG1 {
         group {
                VI_1
         }
}

vrrp_instance VI_1 {
    state MASTER
    interface eth1
    lvs_sync_daemon_inteface eth1
    virtual_router_id 51
    priority 100
    advert_int 1
    garp_master_delay 1
    smtp_alert
    authentication {
        auth_type AH
        auth_pass likOkeam
    }
    virtual_ipaddress {
        10.1.1.110/24
    }
    notify_master "/usr/local/bin/keepalived-transition.sh MASTER"
    notify_backup "/usr/local/bin/keepalived-transition.sh BACKUP"
    notify_fault "/usr/local/bin/keepalived-transition.sh FAULT"
    notify_stop "/usr/local/bin/keepalived-transition.sh FAULT"
}


The notify script is just:
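#!/bin/bash
# Called by keepalived with the new state as $1.

case "$1" in
        # drop the VIP from lo
        MASTER  ) ip addr del 10.1.1.110/24 dev lo ;;
        # move the VIP from eth1 to lo
        BACKUP  ) ip addr del 10.1.1.110/24 dev eth1
                ip addr add 10.1.1.110/24 dev lo ;;
        # drop the VIP from both interfaces
        FAULT   ) ip addr del 10.1.1.110/24 dev lo
                ip addr del 10.1.1.110/24 dev eth1 ;;
esac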

Some mcast traffic back and forth:
Master:
17:42:23.528439 IP 10.1.1.111 > vrrp.mcast.net: AH(spi=0x0a01016f,seq=0x11): VRRPv2, Advertisement, vrid 51, prio 150, authtype ah, intvl 1s, length 20
17:42:24.528923 IP 10.1.1.111 > vrrp.mcast.net: AH(spi=0x0a01016f,seq=0x12): VRRPv2, Advertisement, vrid 51, prio 150, authtype ah, intvl 1s, length 20
17:42:24.530162 IP 10.1.1.112 > vrrp.mcast.net: AH(spi=0x0a010170,seq=0x10): VRRPv2, Advertisement, vrid 51, prio 100, authtype ah, intvl 1s, length 20
17:42:24.530273 IP 10.1.1.111 > vrrp.mcast.net: AH(spi=0x0a01016f,seq=0x12): VRRPv2, Advertisement, vrid 51, prio 150, authtype ah, intvl 1s, length 20
17:42:25.530892 IP 10.1.1.111 > vrrp.mcast.net: AH(spi=0x0a01016f,seq=0x13): VRRPv2, Advertisement, vrid 51, prio 150, authtype ah, intvl 1s, length 20

Backup:
17:42:23.528921 IP 10.1.1.111 > vrrp.mcast.net: AH(spi=0x0a01016f,seq=0x11): VRRPv2, Advertisement, vrid 51, prio 150, authtype ah, intvl 1s, length 20
17:42:24.529458 IP 10.1.1.111 > vrrp.mcast.net: AH(spi=0x0a01016f,seq=0x12): VRRPv2, Advertisement, vrid 51, prio 150, authtype ah, intvl 1s, length 20
17:42:24.530550 IP 10.1.1.112 > vrrp.mcast.net: AH(spi=0x0a010170,seq=0x10): VRRPv2, Advertisement, vrid 51, prio 100, authtype ah, intvl 1s, length 20
17:42:24.530710 IP 10.1.1.111 > vrrp.mcast.net: AH(spi=0x0a01016f,seq=0x12): VRRPv2, Advertisement, vrid 51, prio 150, authtype ah, intvl 1s, length 20
17:42:25.531370 IP 10.1.1.111 > vrrp.mcast.net: AH(spi=0x0a01016f,seq=0x13): VRRPv2, Advertisement, vrid 51, prio 150, authtype ah, intvl 1s, length 20
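
As a sanity check on the capture, here is a small Python sketch comparing the gaps between the master's adverts (as seen on the backup) with the master-down interval from the VRRPv2 spec (RFC 3768): 3 * advert_int + (256 - priority) / 256. Whether this keepalived build computes its timer exactly that way is an assumption on my part, the function names are made up for illustration, and the timestamps cover only the short window pasted above, so this says nothing about the moments when the backup actually flips.

```python
from datetime import datetime

def master_down_interval(priority, advert_int=1.0):
    """Master_Down_Interval per RFC 3768, in seconds."""
    skew_time = (256 - priority) / 256.0
    return 3 * advert_int + skew_time

# Arrival times of the master's (10.1.1.111) adverts in the backup's capture.
MASTER_ADVERTS = [
    "17:42:23.528921",
    "17:42:24.529458",
    "17:42:24.530710",
    "17:42:25.531370",
]

def gaps(stamps):
    """Seconds between consecutive timestamps (same day assumed)."""
    times = [datetime.strptime(s, "%H:%M:%S.%f") for s in stamps]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]

limit = master_down_interval(priority=100, advert_int=1)
worst = max(gaps(MASTER_ADVERTS))
print(f"master-down interval: {limit:.3f}s, largest gap seen: {worst:.3f}s")
```

In this window the largest gap stays well under the interval, so at least here the backup is not missing adverts on the wire.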


Any ideas? I've tried changing just about everything and I'm stumped.

-- 
Sal Tepedino <stepedino@xxxxxxxxxxxxxx>

