Re: Fail Over

To:	lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Subject:	Re: Fail Over
From:	Horms <horms@xxxxxxxxxxxx>
Date:	Thu, 5 Jun 2003 21:36:47 +0900

On Wed, Jun 04, 2003 at 11:42:41AM -0500, AJ Lemke wrote:
> Hello List,
> 
> I am running a 2 Node Cluster with fail over using Heartbeat.  We
> have recently noticed that when the Primary Node (Director1) is
> taken down or fails, the Secondary Node (Director2) takes up to 6
> minutes to assume the Virtual IPs.  Sometimes Director2 doesn't take
> over at all.  Heartbeat checks the servers every 2 seconds and the
> Deadtime is 10 seconds.  If I restart the heartbeat service on both
> Nodes they seem to work within 15 seconds the first couple of tries,
> but then they seem to get confused, as Director2 will not give up its
> resources when Director1 comes back online.  This is tested by
> shutting off the port on the switch or by starting and stopping the
> Heartbeat service.  Any ideas as to what could be causing this problem?

That is very strange. Which version of heartbeat are you using?
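
As a rough sanity check on the figures quoted above (a 2-second check
interval and a 10-second deadtime), here is a minimal sketch of when the
standby should notice the failure. It assumes the peer is declared dead
once no heartbeat has arrived for the full deadtime; the function name
detection_window is hypothetical, and the time needed to actually take
over the resources is not modelled.

```python
def detection_window(keepalive_s, deadtime_s):
    """Return (earliest, latest) seconds after a failure at which the
    surviving node would declare its peer dead.

    Assumption: the peer is declared dead once deadtime_s has passed
    since the last heartbeat that was received.
    """
    if keepalive_s <= 0 or deadtime_s <= keepalive_s:
        raise ValueError("expected 0 < keepalive < deadtime")
    # The last heartbeat arrived between 0 and keepalive_s before the
    # failure, so the deadtime timer may already have been running for
    # up to keepalive_s when the node went down.
    return (deadtime_s - keepalive_s, deadtime_s)


earliest, latest = detection_window(keepalive_s=2, deadtime_s=10)
print(f"peer declared dead {earliest}-{latest} s after failure")
```

Under these assumptions detection should take about 8 to 10 seconds, so
the roughly 15 seconds seen right after a restart looks plausible, while
a delay of up to 6 minutes does not follow from the timers alone.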

As always, heartbeat-related questions are best asked
on the linux-ha or linux-ha-dev mailing lists.
Information on these can be found at www.linux-ha.org.

-- 
Horms
