Re: ldirectord stopped suddenly

To: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: ldirectord stopped suddenly
From: "Leon Keijser" <errtu@xxxxxxx>
Date: Tue, 6 Dec 2005 15:58:43 +0100 (MET)
Horms,

(I am addressing this to you since you maintain ldirectord, but it is meant
for anyone who can help.)

I applied your patch (from a few days ago) to ldirectord 1.128 and found
some problems with it. At first I thought I had made a mistake while
patching, so I tried ldirectord 1.128 unpatched, but that causes a
problem as well:

# using ldirectord 1.77.2.36 :
everything works perfectly

# using ldirectord 1.77.2.37 :
everything works perfectly

# using ldirectord 1.128 (unpatched) :
output from `ps -A` shows:
--
ipvs_syncmaster
ipvs_syncbackup
heartbeat
heartbeat
heartbeat
heartbeat
ip-request-resp
ResourceManager
ldirectord
--

The situation:

ldirectord is started by heartbeat. It tries to bring up the VIPs and
succeeds for the first resource, then simply stops trying. The second VIP
never comes up. See this excerpt from the log file:

Dec  6 15:41:06 rpzlvstest01 IPVS: sync thread started: state = MASTER, mcast_ifn = eth1, syncid = 25
Dec  6 15:41:06 rpzlvstest01 IPVS: sync thread started: state = BACKUP, mcast_ifn = eth1, syncid = 26
Dec  6 15:41:36 rpzlvstest01 heartbeat[20773]: info: **************************
Dec  6 15:41:36 rpzlvstest01 heartbeat[20773]: info: Configuration validated. Starting heartbeat 1.2.3
Dec  6 15:41:36 rpzlvstest01 heartbeat[20774]: info: heartbeat: version 1.2.3
Dec  6 15:41:36 rpzlvstest01 heartbeat[20774]: info: Heartbeat generation: 153
Dec  6 15:41:36 rpzlvstest01 heartbeat[20774]: info: UDP Broadcast heartbeat started on port 695 (695) interface eth1
Dec  6 15:41:36 rpzlvstest01 heartbeat[20774]: info: pid 20774 locked in memory.
Dec  6 15:41:36 rpzlvstest01 heartbeat[20774]: info: Local status now set to: 'up'
Dec  6 15:41:37 rpzlvstest01 heartbeat[20784]: info: pid 20784 locked in memory.
Dec  6 15:41:37 rpzlvstest01 heartbeat[20783]: info: pid 20783 locked in memory.
Dec  6 15:41:37 rpzlvstest01 heartbeat[20785]: info: pid 20785 locked in memory.
Dec  6 15:41:37 rpzlvstest01 heartbeat[20774]: info: Link rpzlvstest01:eth1 up.
Dec  6 15:42:30 rpzlvstest01 heartbeat[20774]: info: Clock jumped backwards. Compensating.
Dec  6 15:42:58 rpzlvstest01 heartbeat[20774]: info: Link rpzlvstest02:eth1 up.
Dec  6 15:42:58 rpzlvstest01 heartbeat[20774]: info: Status update for node rpzlvstest02: status up
Dec  6 15:42:58 rpzlvstest01 heartbeat[20774]: info: Local status now set to: 'active'
Dec  6 15:42:58 rpzlvstest01 heartbeat[20858]: debug: notify_world: setting SIGCHLD Handler to SIG_DFL
Dec  6 15:42:58 rpzlvstest01 heartbeat[20774]: info: Status update for node rpzlvstest02: status active
Dec  6 15:42:58 rpzlvstest01 heartbeat[20774]: debug: StartNextRemoteRscReq(): child count 1
Dec  6 15:42:58 rpzlvstest01 heartbeat: info: Running /etc/ha.d/rc.d/status status
Dec  6 15:42:58 rpzlvstest01 heartbeat[20862]: debug: notify_world: setting SIGCHLD Handler to SIG_DFL
Dec  6 15:42:58 rpzlvstest01 heartbeat: info: Running /etc/ha.d/rc.d/status status
Dec  6 15:42:58 rpzlvstest01 heartbeat[20774]: info: Clock jumped backwards. Compensating.
Dec  6 15:42:59 rpzlvstest01 heartbeat[20774]: info: Clock jumped backwards. Compensating.
Dec  6 15:43:08 rpzlvstest01 heartbeat[20774]: info: local resource transition completed.
Dec  6 15:43:08 rpzlvstest01 heartbeat[20774]: info: Initial resource acquisition complete (T_RESOURCES(us))
Dec  6 15:43:08 rpzlvstest01 heartbeat[20774]: info: remote resource transition completed.
Dec  6 15:43:08 rpzlvstest01 heartbeat[20774]: debug: StartNextRemoteRscReq(): child count 1
Dec  6 15:43:08 rpzlvstest01 heartbeat[20774]: debug: StartNextRemoteRscReq(): child count 1
Dec  6 15:43:08 rpzlvstest01 heartbeat[20774]: debug: StartNextRemoteRscReq(): child count 1
Dec  6 15:43:08 rpzlvstest01 heartbeat[20866]: info: Local Resource acquisition completed.
Dec  6 15:43:08 rpzlvstest01 heartbeat[20949]: debug: notify_world: setting SIGCHLD Handler to SIG_DFL
Dec  6 15:43:08 rpzlvstest01 heartbeat: info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp
Dec  6 15:43:08 rpzlvstest01 heartbeat: received ip-request-resp 192.168.51.210 OK yes
Dec  6 15:43:09 rpzlvstest01 heartbeat: info: Acquiring resource group: rpzlvstest01 192.168.51.210 ldirectord
Dec  6 15:43:09 rpzlvstest01 heartbeat: info: Running /etc/ha.d/resource.d/IPaddr 192.168.51.210 start
Dec  6 15:43:09 rpzlvstest01 heartbeat: debug: Starting /etc/ha.d/resource.d/IPaddr 192.168.51.210 start
Dec  6 15:43:09 rpzlvstest01 heartbeat: info: /sbin/ifconfig eth0:0 192.168.51.210 netmask 255.255.248.0        broadcast 192.168.55.255
Dec  6 15:43:09 rpzlvstest01 heartbeat: info: Sending Gratuitous Arp for 192.168.51.210 on eth0:0 [eth0]
Dec  6 15:43:09 rpzlvstest01 heartbeat: /usr/lib/heartbeat/send_arp -i 1010 -r 5 -p /var/lib/heartbeat/rsctmp/send_arp/send_arp-192.168.51.210 eth0 192.168.51.210 auto 192.168.51.210 ffffffffffff
Dec  6 15:43:09 rpzlvstest01 heartbeat: debug: /etc/ha.d/resource.d/IPaddr 192.168.51.210 start done. RC=0
Dec  6 15:43:09 rpzlvstest01 heartbeat[20774]: info: Clock jumped backwards. Compensating.
Dec  6 15:43:10 rpzlvstest01 heartbeat: info: Running /etc/ha.d/resource.d/ldirectord  start
Dec  6 15:43:10 rpzlvstest01 heartbeat: debug: Starting /etc/ha.d/resource.d/ldirectord  start
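
Note that the log ends with a "Starting" line for ldirectord and no matching
"done. RC=" line, whereas IPaddr logged both. A rough Python sketch for
spotting such entries follows. It assumes the heartbeat 1.2.3 debug message
format seen in the log above and re-joins lines wrapped by the mail client.
The function names are hypothetical, chosen only for this illustration, and
the sample lines are copied from the log.

```python
import re

# A syslog entry begins with a timestamp such as "Dec  6 15:43:09 ".
TIMESTAMP = re.compile(r"^[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2} ")
# Message shapes taken from the debug lines in the log above.
STARTING = re.compile(r"debug: Starting (/etc/ha\.d/resource\.d/.+)$")
DONE = re.compile(r"debug: (/etc/ha\.d/resource\.d/.+?) done\. RC=(\d+)")


def join_wrapped(log_text):
    """Re-join log entries that were wrapped onto continuation lines."""
    entries = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if TIMESTAMP.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line
    return entries


def unfinished_resources(log_text):
    """Return resource commands that logged 'Starting' but never 'done. RC='."""
    pending = []
    for entry in join_wrapped(log_text):
        started = STARTING.search(entry)
        finished = DONE.search(entry)
        if started:
            # Collapse repeated spaces so both message forms compare equal.
            pending.append(" ".join(started.group(1).split()))
        elif finished:
            command = " ".join(finished.group(1).split())
            if command in pending:
                pending.remove(command)
    return pending


SAMPLE = """
Dec  6 15:43:09 rpzlvstest01 heartbeat: debug: Starting
/etc/ha.d/resource.d/IPaddr 192.168.51.210 start
Dec  6 15:43:09 rpzlvstest01 heartbeat: debug: /etc/ha.d/resource.d/IPaddr
192.168.51.210 start done. RC=0
Dec  6 15:43:10 rpzlvstest01 heartbeat: debug: Starting
/etc/ha.d/resource.d/ldirectord  start
"""

print(unfinished_resources(SAMPLE))
```

On the sample lines this reports only the ldirectord start command as
unfinished, which matches what the log shows: the resource script is
launched and never returns.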


The same problems occurred when I patched ldirectord 1.128 with the patch
you emailed.


Léon
