heartbeat-2.0.7 crm restarting ldirectord constantly :(

To: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Subject: heartbeat-2.0.7 crm restarting ldirectord constantly :(
From: "Klavs Klavsen" <kl@xxxxxxx>
Date: Mon, 23 Oct 2006 16:22:25 +0200 (CEST)
Hi guys,

I'm trying to get my heartbeat v1 config working with heartbeat 2.0.7 with
"crm on".

When I do this, it starts ldirectord, but then promptly stops it and
starts it again (according to ldirectord.log, as you can see below).

The output from the heartbeat daemons says it's "not running", even though
I can see it actually is. With ps I can see heartbeat spawning an
ldirectord stop right after it calls status, which is what's stopping it.

I'm thinking perhaps there's a bug in the ldirectord status output?
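
One way to test that suspicion is to run the status action by hand and check which stream the "is running" message lands on and what the exit code is. The helper below is only a rough sketch: the name check_status is hypothetical, and the idea that lrmd looks at the first line of stdout is a guess based on the lrmd warning in the ha-log below, not something verified against the heartbeat source.

```python
import subprocess
import sys


def check_status(cmd):
    """Run a status command and report its exit code and the first line
    seen on stdout and on stderr (None if that stream was empty)."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    stdout_lines = result.stdout.splitlines()
    stderr_lines = result.stderr.splitlines()
    return {
        "returncode": result.returncode,
        "first_stdout_line": stdout_lines[0] if stdout_lines else None,
        "first_stderr_line": stderr_lines[0] if stderr_lines else None,
    }


# Example call (paths taken from the ldirectord.log below):
# report = check_status(["/etc/ha.d/resource.d/ldirectord",
#                        "conf/http.www.mitsite.dk.cf", "status"])
# print(report, file=sys.stderr)
```

If first_stdout_line comes back as None while first_stderr_line holds the "is running with pid" text, that would fit the theory that the status message is going to the wrong stream.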

HEARTBEAT ha-log
crmd[21037]: 2006/07/13_16:12:28 info: process_lrm_event:lrm.c LRM
operation (10) start_0 on ldirectord_2 complete
cib[21033]: 2006/07/13_16:12:28 info: cib_diff_notify:notify.c Update
(client: 21037, call:30): 0.4.828 -> 0.4.829 (ok)
tengine[21045]: 2006/07/13_16:12:28 info: te_update_diff:callbacks.c
Processing diff (cib_update): 0.4.828 -> 0.4.829
tengine[21045]: 2006/07/13_16:12:28 info: match_graph_event:events.c
Action ldirectord_2_start_0 (8) confirmed
tengine[21045]: 2006/07/13_16:12:28 info: te_pseudo_action:actions.c
Pseudo action 10 confirmed
tengine[21045]: 2006/07/13_16:12:28 info: send_rsc_command:actions.c
Initiating action 2: ldirectord_2_monitor_120000 on linuxvs02
crmd[21037]: 2006/07/13_16:12:28 info: do_lrm_rsc_op:lrm.c Performing op
monitor on ldirectord_2 (interval=120000ms,
key=2:ca77ec03-83b8-4266-95f3-22c343187abb)
cib[21160]: 2006/07/13_16:12:28 info: write_cib_contents:io.c Wrote
version 0.4.829 of the CIB to disk (digest:
34a1b51c3718c83a94d685f7a4ab69a2)
lrmd[21034]: 2006/07/13_16:12:28 ERROR: RA heartbeat:ldirectord_2:monitor
(process 21161) failed to redirect stderr for its background child
(daemon) processes. This will likely cause those processes to die
mysteriously at some later time (terminated by signal SIGPIPE).
lrmd[21034]: 2006/07/13_16:12:28 info: RA output:
(ldirectord_2:monitor:stderr) ldirectord for
/etc/ha.d/conf/http.www.mitsite.dk.cf is running with pid: 21151

lrmd[21034]: 2006/07/13_12:12:29 WARN: G_SIG_dispatch: Dispatch function
for SIGCHLD was delayed 1000 ms (> 100 ms) before being called (GSource:
0x517b68)
lrmd[21034]: 2006/07/13_12:12:29 info: G_SIG_dispatch: started at
1720654161 should have started at 1720654061
crmd[21037]: 2006/07/13_12:12:29 WARN: process_lrm_event:lrm.c LRM
operation (11) monitor_120000 on ldirectord_2 Error: (7) not running
lrmd[21034]: 2006/07/13_12:12:29 WARN: There is something wrong: the first
line isn't read in. Maybe the heartbeat does not ouput string correctly
 for status operation. Or the code (myself) is wrong.
cib[21033]: 2006/07/13_12:12:29 info: cib_diff_notify:notify.c Update
(client: 21037, call:31): 0.4.829 -> 0.4.830 (ok)
tengine[21045]: 2006/07/13_12:12:29 info: te_update_diff:callbacks.c
Processing diff (cib_update): 0.4.829 -> 0.4.830
tengine[21045]: 2006/07/13_12:12:29 ERROR: match_graph_event:events.c
Action ldirectord_2_monitor_120000 on linuxvs02 failed (target: 0 vs. rc:
7)
: Error
tengine[21045]: 2006/07/13_12:12:29 WARN: update_failcount:events.c
Updating failcount for ldirectord_2 on linuxvs02 after failed monitor:
rc=7
tengine[21045]: 2006/07/13_12:12:29 info: update_abort_priority:utils.c
Abort priority upgraded to 1
tengine[21045]: 2006/07/13_12:12:29 info: update_abort_priority:utils.c
Abort action 0 superceeded by 2
tengine[21045]: 2006/07/13_12:12:29 info: match_graph_event:events.c
Action ldirectord_2_monitor_120000 (2) confirmed


LDIRECTORD.LOG
[Thu Jul 13 12:12:28 2006|http.www.mitsite.dk.cf|21149] Invoking
ldirectord invoked as: /etc/ha.d/resource.d//ldirectord
conf/http.www.mitsite.dk.cf start
[Thu Jul 13 12:12:28 2006|http.www.mitsite.dk.cf|21149] Starting Linux
Director v1.143 as daemon
[Thu Jul 13 12:12:28 2006|http.www.mitsite.dk.cf|21151] Added virtual
server: 84.32.32.27:80
[Thu Jul 13 12:12:28 2006|http.www.mitsite.dk.cf|21151] Added real server:
84.32.32.20:80 ( x 84.32.32.27:80) (Weight set to 4)
[Thu Jul 13 12:12:28 2006|http.www.mitsite.dk.cf|21151] Added real server:
84.32.32.21:80 ( x 84.32.32.27:80) (Weight set to 4)
[Thu Jul 13 12:12:28 2006|http.www.mitsite.dk.cf|21321] Invoking
ldirectord invoked as: /etc/ha.d/resource.d//ldirectord
conf/http.www.mitsite.dk.cf status
[Thu Jul 13 12:12:28 2006|http.www.mitsite.dk.cf|21321] ldirectord for
/etc/ha.d/conf/http.www.mitsite.dk.cf is running with pid: 21151
[Thu Jul 13 12:12:28 2006|http.www.mitsite.dk.cf|21321] Exiting from
ldirectord status
[Thu Jul 13 12:12:30 2006|http.www.mitsite.dk.cf|2132] Invoking ldirectord
invoked as: /etc/ha.d/resource.d//ldirectord conf/http.www.mitsite.dk.cf
stop
[Thu Jul 13 12:12:30 2006|http.www.mitsite.dk.cf|21151] Removed real
server (stop): 84.32.32.20:80 ( x 84.32.32.27:80)
[Thu Jul 13 12:12:30 2006|http.www.mitsite.dk.cf|21151] Removed real
server (stop): 84.32.32.21:80 ( x 84.32.32.27:80)
[Thu Jul 13 12:12:30 2006|http.www.mitsite.dk.cf|21151] Removed virtual
server (stop): 84.32.32.27:80
[Thu Jul 13 12:12:30 2006|http.www.mitsite.dk.cf|21151] Linux Director
Daemon terminated on signal: TERM
[Thu Jul 13 12:12:30 2006|http.www.mitsite.dk.cf|21172] Invoking
ldirectord invoked as: /etc/ha.d/resource.d//ldirectord
conf/http.www.mitsite.dk.cf start
[Thu Jul 13 12:12:30 2006|http.www.mitsite.dk.cf|21172] Starting Linux
Director v1.143 as daemon
[Thu Jul 13 12:12:30 2006|http.www.mitsite.dk.cf|21174] Added virtual
server: 84.32.32.27:80
[Thu Jul 13 12:12:30 2006|http.www.mitsite.dk.cf|21174] Added real server:
84.32.32.20:80 ( x 84.32.32.27:80) (Weight set to 4)
[Thu Jul 13 12:12:30 2006|http.www.mitsite.dk.cf|21174] Added real server:
84.32.32.21:80 ( x 84.32.32.27:80) (Weight set to 4)
[Thu Jul 13 12:12:30 2006|http.www.mitsite.dk.cf|21183] Invoking
ldirectord invoked as: /etc/ha.d/resource.d//ldirectord
conf/http.www.mitsite.dk.cf status
[Thu Jul 13 12:12:30 2006|http.www.mitsite.dk.cf|21183] ldirectord for
/etc/ha.d/conf/http.www.mitsite.dk.cf is running with pid: 21174
[Thu Jul 13 12:12:30 2006|http.www.mitsite.dk.cf|21183] Exiting from
ldirectord status
[Thu Jul 13 12:12:31 2006|http.www.mitsite.dk.cf|21199] Invoking
ldirectord invoked as: /etc/ha.d/resource.d//ldirectord
conf/http.www.mitsite.dk.cf stop
[Thu Jul 13 12:12:31 2006|http.www.mitsite.dk.cf|21174] Removed real
server (stop): 84.32.32.20:80 ( x 84.32.32.27:80)
[Thu Jul 13 12:12:31 2006|http.www.mitsite.dk.cf|21174] Removed real
server (stop): 84.32.32.21:80 ( x 84.32.32.27:80)
[Thu Jul 13 12:12:31 2006|http.www.mitsite.dk.cf|21174] Removed virtual
server (stop): 84.32.32.27:80
[Thu Jul 13 12:12:31 2006|http.www.mitsite.dk.cf|21174] Linux Director
Daemon terminated on signal: TERM
[Thu Jul 13 12:12:31 2006|http.www.mitsite.dk.cf|21204] Invoking
ldirectord invoked as: /etc/ha.d/resource.d//ldirectord
conf/http.www.mitsite.dk.cf start


-- 
Regards,
Klavs Klavsen, GSEC - kl@xxxxxxx - http://www.vsen.dk
PGP: 7E063C62/2873 188C 968E 600D D8F8  B8DA 3D3A 0B79 7E06 3C62

"Those who do not understand Unix are condemned to reinvent it, poorly."
  --Henry Spencer

