Re: trouble of heartbeat 1.0.4

To: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: trouble of heartbeat 1.0.4
From: Horms <horms@xxxxxxxxxxxx>
Date: Mon, 21 Jun 2004 11:48:58 +0900
On Fri, Jun 18, 2004 at 12:02:37PM -0700, Peter Mueller wrote:
> > On my system, some problems occur when "nice_failback on" is set,
> > as described below.
> 
> You've already noted the syntax change, but just wanted to stress it :).
> http://wiki.trick.ca/linux-ha/ha.cf/AutoFailbackDirective
> 
> > 1) eth0:0 comes up on both the Active and Backup servers.
> 
> Sounds like they lost communication and went split-brain.  If you want
> to find out exactly what went wrong you should post your ha-logs and
> configs to the linux-ha list.  I'm positive there are split-brain
> scenarios that were fixed from 1.0.4 -> 1.2, so upgrading is the option
> I would choose.
> 
> > 2) Gratuitous ARPs are not broadcast on failover.
> >    (I checked many times using tcpdump)
> 
> There are a few ARP bugs fixed in 1.2.  I know it broadcasts on my 1.2,
> but I'm pretty sure it did on my 1.0.4 too.
> 
> > To check whether this is a misconfiguration on my side, I tested the
> > same config files on my friend's system (heartbeat version 1.2.0-4,
> > auto_failback off). There, it works quite well.
> 
> 1.2 seems more stable to me in my usage.  I think Alan Robertson and
> Horms (builds Debian packages) stressed that this version was
> recommended.
> 
> > Are these problems typical in heartbeat 1.0.4?
> > Is there any solution other than updating heartbeat?
> > Should I update it at any cost?
> 
> Yes, there are a few problems with 1.0.4.  I don't know about any cost.
> But in your situation -- split brain and arp -- it would make sense to
> upgrade.  There was definitely at least one split-brain bugfix, and
> there were many others such as arp fixes.  I think if you look at the
> linux-ha changelog you will find the details.
> http://www.linux-ha.org/download/ChangeLog

Yes, there are some problems with 1.0.4, though I am not entirely
sure what is causing the problem here. In any case, if 1.2.X works
for you, go for it.
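
On the syntax change mentioned above: a minimal sketch, in Python, of
rewriting a 1.0.x-style nice_failback line in ha.cf into the 1.2-style
auto_failback directive. It assumes the two directives have opposite
senses (nice_failback on corresponds roughly to auto_failback off), which
should be checked against the AutoFailbackDirective page linked above.
The helper name convert_failback is made up for illustration and is not
part of heartbeat.

```python
import re

# Assumption: the two directives have opposite senses, so the value flips.
_INVERT = {"on": "off", "off": "on"}

# Matches an active (uncommented) nice_failback line, keeping its indentation.
_NICE_FAILBACK = re.compile(r"^(\s*)nice_failback\s+(on|off)\s*$")


def convert_failback(ha_cf_text: str) -> str:
    """Rewrite nice_failback lines as auto_failback lines (hypothetical helper)."""
    converted = []
    for line in ha_cf_text.splitlines():
        match = _NICE_FAILBACK.match(line)
        if match:
            indent, value = match.groups()
            converted.append(f"{indent}auto_failback {_INVERT[value]}")
        else:
            # Comments and every other directive pass through unchanged.
            converted.append(line)
    return "\n".join(converted)


if __name__ == "__main__":
    print(convert_failback("keepalive 2\nnice_failback on"))
```

This only touches the one directive; anything else in ha.cf is left
alone, so the result still needs to be reviewed by hand before use.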

-- 
Horms