Problems with keepalived

To:	lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Subject:	Problems with keepalived
From:	Dr A V Le Blanc <LeBlanc@xxxxxxxxx>
Date:	Wed, 28 Jun 2006 10:22:28 +0100

For many years we have used keepalived to manage our LVS farm,
but I have never managed to get it to fail over automatically,
so I've simply used manual failover.  I have tried a large number
of minor variations on configuration files, but let me give a
simple example.  There are two directors.  One has this in its
/etc/keepalived/keepalived.conf file:

vrrp_sync_group VG1 {
  group {
    VI_1
  }
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 150
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        130.88.203.138
        130.88.203.219
    }
}

and the other has this:

vrrp_sync_group VG1 {
  group {
    VI_1
  }
}

vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 50
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        130.88.203.138
        130.88.203.219
    }
}

Now the master starts up perfectly happily, but the file
/var/log/messages has thousands of errors like this:

Jun 28 09:18:32 rust Keepalived_vrrp: receive an invalid ip number count associated with VRID!
Jun 28 09:18:32 rust Keepalived_vrrp: bogus VRRP packet received on eth0 !!!
Jun 28 09:18:32 rust Keepalived_vrrp: VRRP_Instance(VI_1) Dropping received VRRP packet...

The backup director starts up, but doesn't listen on the virtual addresses
at all.  Its /var/log/messages has thousands of errors like this:

Jun 28 06:25:05 stye Keepalived_vrrp: receive an invalid ip number count associated with VRID!
Jun 28 06:25:05 stye Keepalived_vrrp: bogus VRRP packet received on eth0 !!!
Jun 28 06:25:05 stye Keepalived_vrrp: VRRP_Instance(VI_1) ignoring received advertisment...

What confuses me is that the messages in /var/log/messages appear whenever
keepalived is running on that machine, whether or not keepalived is
running on the other director; that is, there doesn't seem to be any
communication between the two directors at all.  I have read the
User Guide and searched for any messages on this subject, and I have
to say that there's a lot I don't really understand.
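
For readers puzzling over the "invalid ip number count" message: a VRRP
version 2 advertisement carries a virtual router ID (VRID) and a count of
IP addresses in its header, and the message appears to mean that a received
packet's count did not match the number of addresses configured locally.
The Python sketch below decodes those header fields from a raw VRRP payload
so that the sender's VRID and address count can be compared with the
configuration above.  It assumes the VRRP version 2 header layout; the
name parse_vrrp_v2 and the sample packet are hypothetical, and capturing
the payload from the network is left out.

```python
import socket
import struct
from typing import List, NamedTuple


class VrrpHeader(NamedTuple):
    version: int         # high nibble of byte 0 (2 for VRRPv2)
    type: int            # low nibble of byte 0 (1 = advertisement)
    vrid: int            # virtual router ID (virtual_router_id in the config)
    priority: int        # sender's priority
    count_ip: int        # number of IP addresses that follow the header
    auth_type: int       # 1 = simple text password in RFC 2338
    adver_int: int       # advertisement interval in seconds
    checksum: int
    addresses: List[str]


def parse_vrrp_v2(payload: bytes) -> VrrpHeader:
    """Decode the fixed VRRPv2 header and the address list from a payload
    (the bytes that follow the IP header; hypothetical helper)."""
    if len(payload) < 8:
        raise ValueError("payload shorter than the 8-byte VRRP header")
    ver_type, vrid, prio, count, auth, adv, csum = struct.unpack(
        "!BBBBBBH", payload[:8]
    )
    end = 8 + 4 * count
    if len(payload) < end:
        raise ValueError("payload too short for the advertised address count")
    addrs = [socket.inet_ntoa(payload[i:i + 4]) for i in range(8, end, 4)]
    return VrrpHeader(ver_type >> 4, ver_type & 0x0F, vrid, prio,
                      count, auth, adv, csum, addrs)


# Hand-built sample packet mirroring the MASTER config above
# (checksum left as zero; it is not verified in this sketch).
sample = (
    bytes([0x21, 51, 150, 2, 1, 1, 0, 0])
    + socket.inet_aton("130.88.203.138")
    + socket.inet_aton("130.88.203.219")
)
hdr = parse_vrrp_v2(sample)
print(hdr.vrid, hdr.count_ip, hdr.addresses)
```

If a captured advertisement shows VRID 51 with a count other than 2, or a
source address that belongs to neither director, that would suggest some
other VRRP speaker on the segment is using the same VRID.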

(1)  I've set up various email addresses in global_defs, but no email
   has ever been sent to any of them, as far as I can tell.

(2)  It's not clear what lvs_id is supposed to be.  I've been using
   the host name.

(3)  I'm trying to set up a simple failover, and all the example
   configs seem to be for more complex cases.  I don't really
   understand what exactly a vrrp_sync_group is and what a
   vrrp_instance is.  For example, should the two director machines
   have the same name for the two vrrp_instances that are supposed
   to manage the same address, or should they have different names?

(4)  Some of the documents seem to say that two directors should
   have different values for virtual_router_id, but some of the
   examples seem to use the same value on different directors.

(5)  There are two types of authentication, AH and PASS, but it's
   not clear what the advantages are.  I presume that AH is more
   secure, but that PASS is less likely to go wrong.

(6)  Some examples have an advert_int in some configurations; I'm
   not sure what this does, whether it's needed or not.

I would appreciate any advice or help.  As I said, I have read
the User Guide without enlightenment.  One suggestion, to start
with -d or -D as an argument, produced no information.  I am
using version 1.1.11-3 on Debian sarge.

     -- Owen
     LeBlanc <at> man <dot> ac <dot> uk
