On Sat, Jan 20, 2007 at 05:57:38AM -0700, Kenny Dail wrote:
> Hello happy list,
>
> Have a heartbeat + ldirectord setup spanning several IPs. From
> haresources:
> cerberus ip1/24/eth0 ip2/24/eth0 ip3/24/eth0 ip4/24/eth0 ip5/24/eth0
> ip6/24/eth1 ip7/24/eth1 ldirectord
>
> one line is all I have, slightly edited and wrapped here.
>
> ha.cf is pretty simple:
> logfacility local0
> bcast eth1
> node hydra cerberus
>
>
> ldirectord has a quite huge ldirectord.cf in /etc/ha.d/ and it is all
> working just fine on the main node cerberus. The secondary node hydra
> has undergone some software updates. Heartbeat failover works in that it
> detects when cerberus dies, and takes over the network interfaces.
> However it starts the interfaces and ldirectord and things work for a
> few seconds, then it tries to start it all again, ldirectord complains
> it is already running, and heartbeat bails.
>
> So what do I have set up wrong?
Nothing, it's a bug :(
The problem is that older versions of ldirectord did not report their
status correctly. This causes heartbeat to think ldirectord isn't
running when it is and, worse, causes ldirectord to throw an error when
it is started a second time :(
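
To illustrate the failure mode described above, here is a minimal sketch
in Python. It is not heartbeat's code. It assumes that heartbeat treats a
haresources resource as active when the script's status output contains
"running" or "OK"; the exact matching rule, the function names and the
sample status strings are all assumptions made for illustration.

```python
import re

# Assumption: a resource counts as active when its status output
# contains "running" or "OK". The real rule may differ in detail.
_ACTIVE = re.compile(r"running|OK")


def looks_active(status_output: str) -> bool:
    """Return True if the status text would be read as 'resource is up'."""
    return _ACTIVE.search(status_output) is not None


def next_action(status_output: str) -> str:
    """Hypothetical decision: start the resource only if it looks down."""
    return "skip" if looks_active(status_output) else "start"


# Hypothetical status lines, not copied from any ldirectord release.
good_status = "ldirectord for /etc/ha.d/ldirectord.cf is running with pid: 1234"
bad_status = "ldirectord is stopped for /etc/ha.d/ldirectord.cf"

print(next_action(good_status))
print(next_action(bad_status))
```

If the daemon is actually up but its status line looks like bad_status,
next_action returns "start", which corresponds to the second start that
ldirectord then rejects.
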
The good news is that it is fixed by the following patch.
As this is a diff against 1.77.2.8, and you have 1.77.2.5,
it should apply without too much hassle.
http://cvs.linux-ha.org/viewcvs/viewcvs.cgi/linux-ha/ldirectord/Attic/ldirectord?rev=1.77.2.9&only_with_tag=STABLE_1_2&view=markup
If not, you might want to consider upgrading ldirectord.
http://www.vergenet.net/linux/ldirectord/download.shtml
--
Horms
H: http://www.vergenet.net/~horms/
W: http://www.valinux.co.jp/en/