Re: effect of changing settings in ha.cf

To: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: effect of changing settings in ha.cf
From: Roberto Nibali <ratz@xxxxxxxxxxxx>
Date: Thu, 19 Jan 2006 22:50:43 +0100

A very simple question, but one I am unable to find the answer to.

The linux-ha list might be a better place.

[intro] In a production environment I have 2 LVSs (heartbeat + ldirectord).
They're set up so that one is always the master and the other the backup node.

A typical linux-ha setup.

In ha.cf I've set auto_failback to 'on'. This turned out to be a poor
choice: several times the backup node thought the master was dead and
tried to do a failover.

Care to show some output of the halog?

During the failover it recognized that the master was actually not dead.

This sounds weird to me. Had the master already failed over (released all its resources) before this event?

This is when all the trouble
starts, because it looks like they're fighting over the 'master' status,

Who is fighting? The master and the backup? If so, this would be a temporary split-brain situation, but such a thing is only possible if you have either special fs-related resources or a runaway RT process. Could you post your HA configuration (all three config files, please)?

resulting in all clients being disconnected from the terminal server due to
timeouts.

Do you have lvs sync enabled?

[question] In ha.cf I've changed auto_failback to 'off' to stop this from
happening.

Personally, I believe auto failback is a really bad thing in any HA environment anyway, though I reckon Lars, Alan and other HA gurus might disagree with me.

I haven't restarted anything though, and there's not much
opportunity to do so, so I was wondering whether these changes take
effect immediately, or only after heartbeat restarts?

You lost me here. You mean the auto_failback setting? Restart heartbeat on the node that has no resources acquired (backup) and then do the same on the master node.
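
A rough sketch of that sequence, in case it helps. The helper name get_auto_failback, the /etc/ha.d/ha.cf path and the init script location are assumptions of mine, not taken from your setup, so adjust them to your distribution. The helper only prints what the file currently says; it does not tell you what the running heartbeat has loaded.

```shell
#!/bin/sh
# Hypothetical helper: print the auto_failback value found in a ha.cf file.
# Commented-out lines are skipped because their first field is not the
# bare directive name. If the directive appears more than once, the last
# occurrence is printed.
get_auto_failback() {
    awk '$1 == "auto_failback" { value = $2 } END { if (value != "") print value }' "$1"
}

# Example call (path is an assumption):
#   get_auto_failback /etc/ha.d/ha.cf

# Restart order described above. Run each step on the named node; the
# init script path is an assumption and varies by distribution.
#   1. backup node (no resources acquired): /etc/init.d/heartbeat restart
#   2. master node, once the backup is up:  /etc/init.d/heartbeat restart
```

Checking both nodes with the helper before restarting should at least confirm that the two ha.cf files agree on the setting.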

I'm afraid we need more information, but from what I gathered from your email, I also think the question is better suited to the linux-ha mailing list.

Regards,
Roberto Nibali, ratz
--
echo '[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq' | dc
