
heartbeat trying to start ldirectord twice

To: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Subject: heartbeat trying to start ldirectord twice
From: Kenny Dail <kend@xxxxxxxxx>
Date: Sat, 20 Jan 2007 05:57:38 -0700
Hello happy list,

I have a heartbeat + ldirectord setup spanning several IPs. From
haresources:
cerberus        ip1/24/eth0 ip2/24/eth0 ip3/24/eth0 ip4/24/eth0 ip5/24/eth0
ip6/24/eth1 ip7/24/eth1 ldirectord

That one line is all I have in the file; it is slightly edited and wrapped here.
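
For readers less familiar with the haresources layout, here is a minimal
sketch of how the entry above breaks down: the first field is the preferred
node, and the remaining fields are resources, which heartbeat starts left to
right. The function name and the dictionary keys are made up for illustration,
and the sketch ignores `::` arguments and backslash continuations that real
haresources files may contain.

```python
# Sketch only: split a haresources entry into its preferred node and resources.
# The addresses are the placeholders used in the post, not real IPs.
HARESOURCES_LINE = (
    "cerberus ip1/24/eth0 ip2/24/eth0 ip3/24/eth0 ip4/24/eth0 ip5/24/eth0 "
    "ip6/24/eth1 ip7/24/eth1 ldirectord"
)


def parse_haresources_line(line):
    """Return (node, resources); resources keep the order used in the file."""
    node, *entries = line.split()
    resources = []
    for entry in entries:
        parts = entry.split("/")
        if len(parts) == 3:
            # address/prefix/interface form, which the IPaddr script handles
            addr, prefix, iface = parts
            resources.append(
                {"script": "IPaddr", "address": addr,
                 "prefix": int(prefix), "interface": iface}
            )
        else:
            # anything else is treated here as a plain resource script name
            resources.append({"script": entry})
    return node, resources


node, resources = parse_haresources_line(HARESOURCES_LINE)
start_order = [r["script"] for r in resources]
print(node, start_order)
```

With this entry, the seven IPaddr resources come first and ldirectord comes
last, which matches the order seen in the log further down.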

ha.cf is pretty simple:
logfacility     local0
bcast eth1
node    hydra cerberus


ldirectord has quite a large ldirectord.cf in /etc/ha.d/, and it all
works fine on the main node, cerberus. The secondary node, hydra,
has undergone some software updates. Heartbeat failover works in that
hydra detects when cerberus dies and takes over the network interfaces.
However, it starts the interfaces and ldirectord, things work for a
few seconds, and then it tries to start everything again; ldirectord
complains that it is already running, and heartbeat bails.

So what do I have set up wrong?

This is logged in messages:
Jan 20 04:49:02 hydra heartbeat: [14686]: info: Status update for node 
cerberus: status active
Jan 20 04:49:02 hydra heartbeat: [14696]: debug: notify_world: setting SIGCHLD 
Handler to SIG_DFL
Jan 20 04:49:03 hydra harc[14696]: info: Running /etc/ha.d/rc.d/status status
Jan 20 04:49:03 hydra heartbeat: [14686]: info: Link hydra:eth1 up.
Jan 20 04:49:58 hydra heartbeat: [14686]: info: Received shutdown notice from 
'cerberus'.
Jan 20 04:49:58 hydra heartbeat: [14686]: info: Resources being acquired from 
cerberus.
Jan 20 04:49:58 hydra heartbeat: [14706]: debug: notify_world: setting SIGCHLD 
Handler to SIG_DFL
Jan 20 04:49:59 hydra harc[14706]: info: Running /etc/ha.d/rc.d/status status
Jan 20 04:49:59 hydra heartbeat: [14707]: info: No local resources 
[/usr/lib/heartbeat/ResourceManager listkeys hydra] to acquire.
Jan 20 04:49:59 hydra heartbeat: [14686]: debug: StartNextRemoteRscReq(): child 
count 1
Jan 20 04:49:59 hydra mach_down[14719]: info: Taking over resource group 
ip1/24/eth0
Jan 20 04:49:59 hydra ResourceManager[14746]: info: Acquiring resource 
group:<snip>
[many lines cut concerning the start of IPaddr for each ip]
Jan 20 04:50:11 hydra ResourceManager[14746]: info: Running 
/etc/ha.d/resource.d/ldirectord start
Jan 20 04:50:11 hydra ResourceManager[14746]: debug: Starting 
/etc/ha.d/resource.d/ldirectord  start
Jan 20 04:50:13 hydra ldirectord[16829]: Starting Linux Director v1.77.2.5 as 
daemon
Jan 20 04:50:13 hydra ResourceManager[14746]: debug: 
/etc/ha.d/resource.d/ldirectord  start done. RC=0
Jan 20 04:50:13 hydra mach_down[14719]: info: mach_down takeover complete for 
node cerberus.
[21 lines: ldirectord[16831]: Added virtual server: xxx]
[3 lines: ldirectord[16831]: Added fallback server: xxx]
[40 lines: ldirectord[16831]: Quiescent real server: xxx]
[20 lines: ldirectord[16831]: Restored real server: xxx]
Jan 20 04:50:29 hydra heartbeat: [14686]: WARN: node cerberus: is dead
Jan 20 04:50:29 hydra heartbeat: [14686]: info: Dead node cerberus gave up 
resources.
Jan 20 04:50:29 hydra heartbeat: [14686]: info: Resources being acquired from 
cerberus.
Jan 20 04:50:29 hydra heartbeat: [14686]: info: Link cerberus:eth1 dead.
Jan 20 04:50:29 hydra heartbeat: [17044]: debug: notify_world: setting SIGCHLD 
Handler to SIG_DFL
Jan 20 04:50:29 hydra harc[17044]: info: Running /etc/ha.d/rc.d/status status
Jan 20 04:50:29 hydra heartbeat: [17045]: info: No local resources 
[/usr/lib/heartbeat/ResourceManager listkeys hydra] to acquire.
Jan 20 04:50:29 hydra heartbeat: [14686]: debug: StartNextRemoteRscReq(): child 
count 1
Jan 20 04:50:29 hydra mach_down[17064]: info: Taking over resource group 
ip1/24/eth0
Jan 20 04:50:29 hydra ResourceManager[17084]: info: Acquiring resource group: 
<snip>
Jan 20 04:50:29 hydra heartbeat: [14686]: info: Comm_now_up(): updating status 
to active
Jan 20 04:50:29 hydra heartbeat: [14686]: info: Local status now set to: 
'active'
Jan 20 04:50:30 hydra heartbeat: [17108]: info: No local resources 
[/usr/lib/heartbeat/ResourceManager listkeys hydra] to acquire.
Jan 20 04:50:30 hydra heartbeat: [14686]: debug: StartNextRemoteRscReq(): child 
count 1
Jan 20 04:50:30 hydra IPaddr[17112]: INFO: IPaddr Running OK
Jan 20 04:50:31 hydra IPaddr[17224]: INFO: IPaddr Running OK
Jan 20 04:50:31 hydra IPaddr[17330]: INFO: IPaddr Running OK
Jan 20 04:50:32 hydra IPaddr[17436]: INFO: IPaddr Running OK
Jan 20 04:50:32 hydra IPaddr[17542]: INFO: IPaddr Running OK
Jan 20 04:50:33 hydra IPaddr[17654]: INFO: IPaddr Running OK
Jan 20 04:50:34 hydra IPaddr[17760]: INFO: IPaddr Running OK
Jan 20 04:50:35 hydra ResourceManager[17084]: info: Running 
/etc/ha.d/resource.d/ldirectord  start
Jan 20 04:50:35 hydra ResourceManager[17084]: debug: Starting 
/etc/ha.d/resource.d/ldirectord  start
Jan 20 04:50:37 hydra ResourceManager[17084]: debug: 
/etc/ha.d/resource.d/ldirectord  start done. RC=1
Jan 20 04:50:37 hydra ResourceManager[17084]: ERROR: Return code 1 from 
/etc/ha.d/resource.d/ldirectord
Jan 20 04:50:37 hydra ResourceManager[17084]: CRIT: Giving up resources due to 
failure of ldirectord
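
To make the sequence easier to see, here is a rough sketch that rejoins the
wrapped log lines and pulls out just the takeover triggers and the ldirectord
start results. It only illustrates how to read the log and does not diagnose
anything. The function names are made up, and the sample is a handful of lines
copied from the log above.

```python
import re

# A few lines copied from the log above, still wrapped as in the mail.
LOG = """\
Jan 20 04:49:58 hydra heartbeat: [14686]: info: Received shutdown notice from
'cerberus'.
Jan 20 04:49:58 hydra heartbeat: [14686]: info: Resources being acquired from
cerberus.
Jan 20 04:50:13 hydra ResourceManager[14746]: debug:
/etc/ha.d/resource.d/ldirectord  start done. RC=0
Jan 20 04:50:29 hydra heartbeat: [14686]: WARN: node cerberus: is dead
Jan 20 04:50:29 hydra heartbeat: [14686]: info: Resources being acquired from
cerberus.
Jan 20 04:50:37 hydra ResourceManager[17084]: debug:
/etc/ha.d/resource.d/ldirectord  start done. RC=1
"""

# A syslog record starts with "Mon DD HH:MM:SS "; anything else is a wrapped tail.
STAMP = re.compile(r"^[A-Z][a-z]{2} +\d+ \d\d:\d\d:\d\d ")
START_DONE = re.compile(r"resource\.d/(\S+)\s+start done\. RC=(\d+)")


def unwrap(text):
    """Rejoin continuation lines onto the record they belong to."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if STAMP.match(line) or not records:
            records.append(line)
        else:
            records[-1] += " " + line
    return records


def summarize(records):
    """Return (time, event) pairs for acquisitions and resource start results."""
    events = []
    for rec in records:
        when = rec.split()[2]
        if "Resources being acquired from" in rec:
            events.append((when, "acquire"))
        match = START_DONE.search(rec)
        if match:
            events.append((when, "%s start RC=%s" % match.groups()))
    return events


events = summarize(unwrap(LOG))
for when, what in events:
    print(when, what)
```

On the sample lines this lists two separate acquisitions, at 04:49:58 and
04:50:29, each followed by an ldirectord start: the first returns RC=0 and the
second RC=1.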

-- 
Kenny Dail <kend@xxxxxxxxx>

