-------------------
Problem description
-------------------
I want to set up a two-node HA cluster based on the Streamline High
Availability and Load Balancing concept.
Unfortunately, after spending many hours on it I have not succeeded.
As far as I can tell, the problem is that the second real server never
gets added to LVS:
[root@grind11 ~]# ipvsadm
IP Virtual Server version 1.2.0 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 172.31.1.10:http rr persistent 600
-> grind11.graddelt.com:http Local 1 0 0
I would have expected a second entry in the ipvsadm output, e.g.:
-> grind12.graddelt.com:http Route 1 0 0
But it never shows up ... :-(
--------
Topology
--------
The topology is based on the Streamline High Availability and Load
Balancing concept.
                  ROUTER (.1)
            VIP = .10     |
  ------------------------------------------------ 172.31.1.0/24
      eth1 | .11                       eth1 | .12
       -------                          -------
      |grind11| (eth3)<--bcast-->(eth3)|grind12|
       -------                          -------
      eth0 | .11                       eth0 | .12
  ------------------------------------------------ 10.1.156.0/24
The eth3 interfaces (cross-linked) are used for the bcast.
The 10.1.156.0/24 network is for management purposes only.
Both nodes (Grind11 and Grind12) run httpd, listening only on the
172.31.1.0/24 network.
-------------
Configuration
-------------
[root@grind11 ha.d]# sysctl -a | grep arp | egrep "(ignore|annou)" | sort
net.ipv4.conf.all.arp_announce = 2
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.default.arp_announce = 0
net.ipv4.conf.default.arp_ignore = 0
net.ipv4.conf.eth0.arp_announce = 2
net.ipv4.conf.eth0.arp_ignore = 1
net.ipv4.conf.eth1.arp_announce = 2
net.ipv4.conf.eth1.arp_ignore = 1
net.ipv4.conf.eth3.arp_announce = 2
net.ipv4.conf.eth3.arp_ignore = 1
net.ipv4.conf.lo.arp_announce = 0
net.ipv4.conf.lo.arp_ignore = 0
net.ipv4.conf.all.forwarding = 1
net.ipv4.conf.default.forwarding = 1
net.ipv4.conf.eth0.forwarding = 1
net.ipv4.conf.eth1.forwarding = 1
net.ipv4.conf.eth3.forwarding = 1
net.ipv4.conf.lo.forwarding = 1
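For reference, these settings can be made persistent across reboots via
/etc/sysctl.conf (a sketch; the key names are exactly those shown in the
sysctl output above, limited to the interfaces that exist at boot):

```
# /etc/sysctl.conf (fragment) -- ARP hiding for LVS-DR real servers
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
net.ipv4.conf.eth0.arp_ignore = 1
net.ipv4.conf.eth0.arp_announce = 2
net.ipv4.conf.eth1.arp_ignore = 1
net.ipv4.conf.eth1.arp_announce = 2
net.ipv4.ip_forward = 1
```

Apply without a reboot using `sysctl -p`.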
[root@grind11 ha.d]# cat /etc/hosts
127.0.0.1 localhost.localdomain localhost
172.31.1.11 grind11.graddelt.com
172.31.1.12 grind12.graddelt.com
[root@grind11 ha.d]# cat ha.cf
logfacility local0
debug 0
keepalive 1
deadtime 10
warntime 5
initdead 120
udpport 694
#ucast eth3 10.0.0.2
bcast eth3
auto_failback on
node grind11.graddelt.com
node grind12.graddelt.com
#ping 172.31.1.1
respawn hacluster /usr/lib/heartbeat/ipfail
crm off
[root@grind11 ha.d]# cat haresources
grind11.graddelt.com \
    ldirectord::ldirectord.cf \
    LVSSyncDaemonSwap::master \
    IPaddr2::172.31.1.10/24/eth1/172.31.1.255
[root@grind11 ha.d]# cat ldirectord.cf
checktimeout=10
checkinterval=2
autoreload=no
logfile="/var/log/ldirectord.log"
#logfile="local0"
quiescent=no
virtual=172.31.1.10:80
    fallback=127.0.0.1:80
    real=172.31.1.11:80 gate
    real=172.31.1.12:80 gate
    service=http
    scheduler=rr
    persistent=600
    protocol=tcp
    checktype=negotiate
    request="ldtest.html"
    receive="GRIND11"
NOTE: Files are the same on both nodes!
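To make the checktype=negotiate settings concrete: for every real
server, ldirectord fetches the request= page and looks for the receive=
string in the response body; a real server is only kept in the LVS table
while this check passes. The check can be emulated by hand (a
hypothetical sketch, not ldirectord's actual code; body_matches and
check_real are made-up helper names):

```shell
#!/bin/sh
# Hand-rolled approximation of the ldirectord negotiate check above.

body_matches() {
    # succeed iff string $2 occurs somewhere in body $1
    case "$1" in *"$2"*) return 0 ;; *) return 1 ;; esac
}

check_real() {
    # fetch request="ldtest.html" from real server $1, expect receive=$2
    body=$(curl -s --max-time 10 "http://$1/ldtest.html")
    if body_matches "$body" "$2"; then
        echo "$1: check passed -> eligible for the LVS table"
    else
        echo "$1: check FAILED -> ldirectord would drop this real server"
    fi
}

# Usage (run from the director; note that with a single receive= string
# per virtual service, BOTH reals must serve ldtest.html containing it):
#   check_real 172.31.1.11:80 GRIND11
#   check_real 172.31.1.12:80 GRIND11
```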
--------
Software
--------
CentOS 4.5
httpd-2.0.52-32.3.ent.centos4
heartbeat-2.1.2-3.el4.centos
heartbeat-pils-2.1.2-3.el4.centos
heartbeat-stonith-2.1.2-3.el4.centos
heartbeat-ldirectord-2.1.2-3.el4.centos
ipvsadm-1.24-6
------------------------------------
Before starting heartbeat on Grind11
------------------------------------
NOTE: Heartbeat is not yet started on Grind12!
[root@grind11 ~]# ip ad
1: lo: <LOOPBACK,UP> mtu 16436 qdisc noqueue
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
inet 172.31.1.10/32 brd 172.31.1.255 scope global lo:0
2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:18:71:e9:d0:d6 brd ff:ff:ff:ff:ff:ff
inet 10.1.156.11/24 brd 10.1.156.255 scope global eth0
3: eth1: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:18:71:e9:d0:d5 brd ff:ff:ff:ff:ff:ff
inet 172.31.1.11/24 brd 172.31.1.255 scope global eth1
4: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
link/ether 00:0e:0c:c1:03:35 brd ff:ff:ff:ff:ff:ff
5: eth3: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 100
link/ether 00:0e:0c:d7:de:ef brd ff:ff:ff:ff:ff:ff
inet 10.0.0.1/30 brd 10.0.0.3 scope global eth3
[root@grind11 ~]# ipvsadm
IP Virtual Server version 1.2.0 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
-----------------------------------
After starting heartbeat on Grind11
-----------------------------------
[root@grind11 ~]# tail -f /var/log/ha_log
Sep 26 10:48:44 grind11 heartbeat: [11850]: info: Version 2 support: off
Sep 26 10:48:44 grind11 heartbeat: [11850]: WARN: Logging daemon is
disabled --enabling logging daemon is recommended
Sep 26 10:48:44 grind11 heartbeat: [11850]: info: **************************
Sep 26 10:48:44 grind11 heartbeat: [11850]: info: Configuration
validated. Starting heartbeat 2.1.2
Sep 26 10:48:44 grind11 heartbeat: [11851]: info: heartbeat: version 2.1.2
Sep 26 10:48:44 grind11 heartbeat: [11851]: info: Heartbeat generation:
1190494128
Sep 26 10:48:44 grind11 heartbeat: [11851]: info:
G_main_add_TriggerHandler: Added signal manual handler
Sep 26 10:48:44 grind11 heartbeat: [11851]: info:
G_main_add_TriggerHandler: Added signal manual handler
Sep 26 10:48:44 grind11 heartbeat: [11851]: info: Removing
/var/run/heartbeat/rsctmp failed, recreating.
Sep 26 10:48:44 grind11 heartbeat: [11851]: info: glib: UDP Broadcast
heartbeat started on port 694 (694) interface eth3
Sep 26 10:48:44 grind11 heartbeat: [11851]: info: glib: UDP Broadcast
heartbeat closed on port 694 interface eth3 - Status: 1
Sep 26 10:48:44 grind11 heartbeat: [11851]: info:
G_main_add_SignalHandler: Added signal handler for signal 17
Sep 26 10:48:44 grind11 heartbeat: [11851]: info: Local status now set
to: 'up'
Sep 26 10:48:45 grind11 heartbeat: [11851]: info: Link
grind11.graddelt.com:eth3 up.
Sep 26 10:50:44 grind11 heartbeat: [11851]: WARN: node
grind12.graddelt.com: is dead
Sep 26 10:50:44 grind11 heartbeat: [11851]: info: Comm_now_up():
updating status to active
Sep 26 10:50:44 grind11 heartbeat: [11851]: info: Local status now set
to: 'active'
Sep 26 10:50:44 grind11 heartbeat: [11851]: info: Starting child client
"/usr/lib/heartbeat/ipfail" (90,90)
Sep 26 10:50:44 grind11 heartbeat: [11851]: WARN: No STONITH device
configured.
Sep 26 10:50:44 grind11 heartbeat: [11851]: WARN: Shared disks are not
protected.
Sep 26 10:50:44 grind11 heartbeat: [11851]: info: Resources being
acquired from grind12.graddelt.com.
Sep 26 10:50:44 grind11 heartbeat: [11860]: info: Starting
"/usr/lib/heartbeat/ipfail" as uid 90 gid 90 (pid 11860)
Sep 26 10:50:44 grind11 heartbeat: [11861]: debug: notify_world: setting
SIGCHLD Handler to SIG_DFL
Sep 26 10:50:44 grind11 harc[11861]: info: Running /etc/ha.d/rc.d/status
status
Sep 26 10:50:44 grind11 ipfail: [11860]: debug: [We are
grind11.graddelt.com]
Sep 26 10:50:44 grind11 mach_down[11891]: info:
/usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
Sep 26 10:50:44 grind11 mach_down[11891]: info: mach_down takeover
complete for node grind12.graddelt.com.
Sep 26 10:50:44 grind11 heartbeat: [11851]: info: mach_down takeover
complete.
Sep 26 10:50:44 grind11 heartbeat: [11851]: info: Initial resource
acquisition complete (mach_down)
Sep 26 10:50:44 grind11 heartbeat: [11851]: debug:
StartNextRemoteRscReq(): child count 1
Sep 26 10:50:44 grind11 ipfail: [11860]: debug: auto_failback -> 1 (on)
Sep 26 10:50:44 grind11 ipfail: [11860]: debug: Setting message filter mode
Sep 26 10:50:44 grind11 heartbeat: [11862]: info: Local Resource
acquisition completed.
Sep 26 10:50:44 grind11 heartbeat: [11851]: debug:
StartNextRemoteRscReq(): child count 1
Sep 26 10:50:44 grind11 ipfail: [11860]: debug: Starting node walk
Sep 26 10:50:44 grind11 heartbeat: [11951]: debug: notify_world: setting
SIGCHLD Handler to SIG_DFL
Sep 26 10:50:44 grind11 harc[11951]: info: Running
/etc/ha.d/rc.d/ip-request-resp ip-request-resp
Sep 26 10:50:44 grind11 ip-request-resp[11951]: received ip-request-resp
ldirectord::ldirectord.cf OK yes
Sep 26 10:50:44 grind11 ResourceManager[11972]: info: Acquiring resource
group: grind11.graddelt.com ldirectord::ldirectord.cf
LVSSyncDaemonSwap::master IPaddr2::172.31.1.10/24/eth1/172.31.1.255
Sep 26 10:50:45 grind11 ResourceManager[11972]: info: Running
/etc/ha.d/resource.d/ldirectord ldirectord.cf start
Sep 26 10:50:45 grind11 ResourceManager[11972]: debug: Starting
/etc/ha.d/resource.d/ldirectord ldirectord.cf start
Sep 26 10:50:45 grind11 ipfail: [11860]: debug: Cluster node:
grind12.graddelt.com: status: dead
Sep 26 10:50:45 grind11 ipfail: [11860]: debug: [They are
grind12.graddelt.com]
Sep 26 10:50:45 grind11 ipfail: [11860]: debug: Cluster node:
grind11.graddelt.com: status: active
Sep 26 10:50:45 grind11 ResourceManager[11972]: debug:
/etc/ha.d/resource.d/ldirectord ldirectord.cf start done. RC=0
Sep 26 10:50:45 grind11 ResourceManager[11972]: info: Running
/etc/ha.d/resource.d/LVSSyncDaemonSwap master start
Sep 26 10:50:45 grind11 ResourceManager[11972]: debug: Starting
/etc/ha.d/resource.d/LVSSyncDaemonSwap master start
Sep 26 10:50:45 grind11 LVSSyncDaemonSwap[12089]: info: ipvs_syncmaster up
Sep 26 10:50:45 grind11 LVSSyncDaemonSwap[12089]: info: ipvs_syncmaster
obtained
Sep 26 10:50:45 grind11 ResourceManager[11972]: debug:
/etc/ha.d/resource.d/LVSSyncDaemonSwap master start done. RC=0
Sep 26 10:50:46 grind11 ipfail: [11860]: debug: Setting message signal
Sep 26 10:50:46 grind11 IPaddr2[12135]: INFO: Resource is stopped
Sep 26 10:50:46 grind11 ResourceManager[11972]: info: Running
/etc/ha.d/resource.d/IPaddr2 172.31.1.10/24/eth1/172.31.1.255 start
Sep 26 10:50:46 grind11 ResourceManager[11972]: debug: Starting
/etc/ha.d/resource.d/IPaddr2 172.31.1.10/24/eth1/172.31.1.255 start
Sep 26 10:50:46 grind11 IPaddr2[12251]: INFO: Removing conflicting
loopback lo.
Sep 26 10:50:46 grind11 IPaddr2[12251]: INFO: ip -f inet addr delete
172.31.1.10/32 dev lo
Sep 26 10:50:46 grind11 IPaddr2[12251]: INFO: ip -o -f inet addr show lo
Sep 26 10:50:46 grind11 IPaddr2[12251]: INFO: ip route delete
172.31.1.10 dev lo
Sep 26 10:50:46 grind11 IPaddr2[12251]: INFO: ip -f inet addr add
172.31.1.10/24 brd 172.31.1.255 dev eth1
Sep 26 10:50:46 grind11 IPaddr2[12251]: INFO: ip link set eth1 up
Sep 26 10:50:46 grind11 IPaddr2[12251]: INFO:
/usr/lib/heartbeat/send_arp -i 200 -r 5 -p
/var/run/heartbeat/rsctmp/send_arp/send_arp-172.31.1.10 eth1 172.31.1.10
auto not_used not_used
Sep 26 10:50:46 grind11 IPaddr2[12222]: INFO: Success
Sep 26 10:50:46 grind11 ResourceManager[11972]: debug:
/etc/ha.d/resource.d/IPaddr2 172.31.1.10/24/eth1/172.31.1.255 start
done. RC=0
Sep 26 10:50:46 grind11 ipfail: [11860]: debug: Waiting for messages...
Sep 26 10:50:54 grind11 heartbeat: [11851]: info: Local Resource
acquisition completed. (none)
Sep 26 10:50:54 grind11 heartbeat: [11851]: info: local resource
transition completed.
[root@grind11 ~]# tail -f /var/log/ldirectord.log
[Wed Sep 26 10:50:44 2007|ldirectord.cf|11933] Invoking ldirectord
invoked as: /etc/ha.d/resource.d/ldirectord ldirectord.cf status
[Wed Sep 26 10:50:44 2007|ldirectord.cf|11933] Exiting with exit_status
3: Exiting from ldirectord status
[Wed Sep 26 10:50:45 2007|ldirectord.cf|11999] Invoking ldirectord
invoked as: /etc/ha.d/resource.d/ldirectord ldirectord.cf status
[Wed Sep 26 10:50:45 2007|ldirectord.cf|11999] Exiting with exit_status
3: Exiting from ldirectord status
[Wed Sep 26 10:50:45 2007|ldirectord.cf|12021] Invoking ldirectord
invoked as: /etc/ha.d/resource.d/ldirectord ldirectord.cf start
[Wed Sep 26 10:50:45 2007|ldirectord.cf|12021] Starting Linux Director
v1.186-ha-2.1.2 as daemon
[Wed Sep 26 10:50:45 2007|ldirectord.cf|12023] Added virtual server:
172.31.1.10:80
[Wed Sep 26 10:50:45 2007|ldirectord.cf|12023] Added fallback server:
127.0.0.1:80 (172.31.1.10:80) (Weight set to 1)
[Wed Sep 26 10:50:46 2007|ldirectord.cf|12023] Added real server:
172.31.1.11:80 (172.31.1.10:80) (Weight set to 1)
[Wed Sep 26 10:50:46 2007|ldirectord.cf|12023] Deleted fallback server:
127.0.0.1:80 (172.31.1.10:80)
[root@grind11 ~]# ip addr
1: lo: <LOOPBACK,UP> mtu 16436 qdisc noqueue
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:18:71:e9:d0:d6 brd ff:ff:ff:ff:ff:ff
inet 10.1.156.11/24 brd 10.1.156.255 scope global eth0
3: eth1: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:18:71:e9:d0:d5 brd ff:ff:ff:ff:ff:ff
inet 172.31.1.11/24 brd 172.31.1.255 scope global eth1
inet 172.31.1.10/24 brd 172.31.1.255 scope global secondary eth1
4: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
link/ether 00:0e:0c:c1:03:35 brd ff:ff:ff:ff:ff:ff
5: eth3: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 100
link/ether 00:0e:0c:d7:de:ef brd ff:ff:ff:ff:ff:ff
inet 10.0.0.1/30 brd 10.0.0.3 scope global eth3
[root@grind11 ~]# ipvsadm
IP Virtual Server version 1.2.0 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 172.31.1.10:http rr persistent 600
-> grind11.graddelt.com:http Local 1 0 0
-------
COMMENT
-------
I would have expected a second entry in the ipvsadm output, e.g.:
-> grind12.graddelt.com:http Route 1 0 0
But it never shows up ... :-(
I will now start heartbeat on the other node (grind12).
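A quick way to watch for the second entry is to count the real-server
lines in the ipvsadm output (a small hypothetical helper; count_reals is
not a standard tool, just grep over `ipvsadm -L -n`):

```shell
#!/bin/sh
# count_reals: count real-server entries in `ipvsadm -L -n` output,
# read on stdin.
count_reals() {
    # with -n, real servers print as "-> <ip>:<port> ...", so matching
    # a leading digit skips the "-> RemoteAddress:Port" header line
    grep -c '^ *-> *[0-9]'
}

# Usage: ipvsadm -L -n | count_reals
# (should print 2 once both grind11 and grind12 are in the table)
```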
------------------------------------
Before starting heartbeat on Grind12
------------------------------------
[root@grind12 ~]# ip addr
1: lo: <LOOPBACK,UP> mtu 16436 qdisc noqueue
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
inet 172.31.1.10/32 brd 172.31.1.255 scope global lo:0
2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:06:5b:8c:0b:3a brd ff:ff:ff:ff:ff:ff
inet 10.1.156.12/24 brd 10.1.156.255 scope global eth0
3: eth1: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:06:5b:8c:0b:3b brd ff:ff:ff:ff:ff:ff
inet 172.31.1.12/24 brd 172.31.1.255 scope global eth1
4: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
link/ether 00:0e:0c:c5:ef:15 brd ff:ff:ff:ff:ff:ff
5: eth3: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 100
link/ether 00:0e:0c:c1:00:fe brd ff:ff:ff:ff:ff:ff
inet 10.0.0.2/30 brd 10.0.0.3 scope global eth3
[root@grind12 ~]# tail -f /var/log/ha_log
Sep 26 11:09:44 grind12 heartbeat: [16610]: info: Version 2 support: off
Sep 26 11:09:44 grind12 heartbeat: [16610]: WARN: Logging daemon is
disabled --enabling logging daemon is recommended
Sep 26 11:09:44 grind12 heartbeat: [16610]: info: **************************
Sep 26 11:09:44 grind12 heartbeat: [16610]: info: Configuration
validated. Starting heartbeat 2.1.2
Sep 26 11:09:44 grind12 heartbeat: [16611]: info: heartbeat: version 2.1.2
Sep 26 11:09:44 grind12 heartbeat: [16611]: info: Heartbeat generation:
1190494141
Sep 26 11:09:44 grind12 heartbeat: [16611]: info:
G_main_add_TriggerHandler: Added signal manual handler
Sep 26 11:09:44 grind12 heartbeat: [16611]: info:
G_main_add_TriggerHandler: Added signal manual handler
Sep 26 11:09:44 grind12 heartbeat: [16611]: info: Removing
/var/run/heartbeat/rsctmp failed, recreating.
Sep 26 11:09:44 grind12 heartbeat: [16611]: info: glib: UDP Broadcast
heartbeat started on port 694 (694) interface eth3
Sep 26 11:09:44 grind12 heartbeat: [16611]: info: glib: UDP Broadcast
heartbeat closed on port 694 interface eth3 - Status: 1
Sep 26 11:09:44 grind12 heartbeat: [16611]: info:
G_main_add_SignalHandler: Added signal handler for signal 17
Sep 26 11:09:44 grind12 heartbeat: [16611]: info: Local status now set
to: 'up'
Sep 26 11:09:45 grind12 heartbeat: [16611]: info: Link
grind11.graddelt.com:eth3 up.
Sep 26 11:09:45 grind12 heartbeat: [16611]: info: Status update for node
grind11.graddelt.com: status active
Sep 26 11:09:45 grind12 heartbeat: [16611]: info: Link
grind12.graddelt.com:eth3 up.
Sep 26 11:09:45 grind12 heartbeat: [16618]: debug: notify_world: setting
SIGCHLD Handler to SIG_DFL
Sep 26 11:09:45 grind12 harc[16618]: info: Running /etc/ha.d/rc.d/status
status
Sep 26 11:09:45 grind12 heartbeat: [16611]: info: Comm_now_up():
updating status to active
Sep 26 11:09:45 grind12 heartbeat: [16611]: info: Local status now set
to: 'active'
Sep 26 11:09:45 grind12 heartbeat: [16611]: info: Starting child client
"/usr/lib/heartbeat/ipfail" (90,90)
Sep 26 11:09:45 grind12 heartbeat: [16635]: info: Starting
"/usr/lib/heartbeat/ipfail" as uid 90 gid 90 (pid 16635)
Sep 26 11:09:46 grind12 heartbeat: [16611]: info: remote resource
transition completed.
Sep 26 11:09:46 grind12 heartbeat: [16611]: info: remote resource
transition completed.
Sep 26 11:09:46 grind12 heartbeat: [16611]: info: Local Resource
acquisition completed. (none)
Sep 26 11:09:46 grind12 ipfail: [16635]: debug: [We are
grind12.graddelt.com]
Sep 26 11:09:46 grind12 ipfail: [16635]: debug: auto_failback -> 1 (on)
Sep 26 11:09:46 grind12 heartbeat: [16611]: info: grind11.graddelt.com
wants to go standby [foreign]
Sep 26 11:09:46 grind12 ipfail: [16635]: debug: Setting message filter mode
Sep 26 11:09:47 grind12 heartbeat: [16611]: info: standby: acquire
[foreign] resources from grind11.graddelt.com
Sep 26 11:09:47 grind12 heartbeat: [16636]: info: acquire local HA
resources (standby).
Sep 26 11:09:47 grind12 ipfail: [16635]: debug: Starting node walk
Sep 26 11:09:47 grind12 ipfail: [16635]: debug: Cluster node:
grind12.graddelt.com: status: active
Sep 26 11:09:47 grind12 heartbeat: [16636]: info: local HA resource
acquisition completed (standby).
Sep 26 11:09:47 grind12 heartbeat: [16611]: info: Standby resource
acquisition done [foreign].
Sep 26 11:09:47 grind12 heartbeat: [16611]: info: Initial resource
acquisition complete (auto_failback)
Sep 26 11:09:48 grind12 heartbeat: [16611]: info: remote resource
transition completed.
Sep 26 11:09:48 grind12 ipfail: [16635]: debug: Cluster node:
grind11.graddelt.com: status: active
Sep 26 11:09:48 grind12 ipfail: [16635]: debug: [They are
grind11.graddelt.com]
Sep 26 11:09:48 grind12 ipfail: [16635]: debug: Setting message signal
Sep 26 11:09:48 grind12 ipfail: [16635]: debug: Waiting for messages...
Sep 26 11:09:49 grind12 ipfail: [16635]: debug: Other side is now stable.
Sep 26 11:09:49 grind12 ipfail: [16635]: debug: Other side is now stable.
Sep 26 11:09:51 grind12 ipfail: [16635]: debug: Got asked for num_ping.
Sep 26 11:09:51 grind12 ipfail: [16635]: info: Ping node count is balanced.
Sep 26 11:09:51 grind12 ipfail: [16635]: debug: Abort message sent.
Sep 26 11:09:52 grind12 ipfail: [16635]: info: Giving up foreign
resources (auto_failback).
Sep 26 11:09:52 grind12 ipfail: [16635]: info: Delayed giveup in 2 seconds.
Sep 26 11:09:52 grind12 ipfail: [16635]: debug: Other side is unstable.
Sep 26 11:09:53 grind12 ipfail: [16635]: debug: Other side is now stable.
Sep 26 11:09:53 grind12 ipfail: [16635]: debug: Other side is now stable.
Sep 26 11:09:54 grind12 ipfail: [16635]: info: giveup() called (timeout
worked)
Sep 26 11:09:54 grind12 ipfail: [16635]: debug: Message [ask_resources]
sent.
Sep 26 11:09:54 grind12 ipfail: [16635]: debug: giveup timeout has been
destroyed.
Sep 26 11:09:54 grind12 heartbeat: [16611]: info: grind12.graddelt.com
wants to go standby [foreign]
Sep 26 11:09:55 grind12 heartbeat: [16611]: info: standby:
grind11.graddelt.com can take our foreign resources
Sep 26 11:09:55 grind12 heartbeat: [16649]: info: give up foreign HA
resources (standby).
Sep 26 11:09:55 grind12 ResourceManager[16662]: info: Releasing resource
group: grind11.graddelt.com ldirectord::ldirectord.cf
LVSSyncDaemonSwap::master IPaddr2::172.31.1.10/24/eth1/172.31.1.255
Sep 26 11:09:55 grind12 ResourceManager[16662]: info: Running
/etc/ha.d/resource.d/IPaddr2 172.31.1.10/24/eth1/172.31.1.255 stop
Sep 26 11:09:55 grind12 ResourceManager[16662]: debug: Starting
/etc/ha.d/resource.d/IPaddr2 172.31.1.10/24/eth1/172.31.1.255 stop
Sep 26 11:09:55 grind12 IPaddr2[16700]: INFO: Success
Sep 26 11:09:55 grind12 ResourceManager[16662]: debug:
/etc/ha.d/resource.d/IPaddr2 172.31.1.10/24/eth1/172.31.1.255 stop done.
RC=0
Sep 26 11:09:55 grind12 ResourceManager[16662]: info: Running
/etc/ha.d/resource.d/LVSSyncDaemonSwap master stop
Sep 26 11:09:55 grind12 ResourceManager[16662]: debug: Starting
/etc/ha.d/resource.d/LVSSyncDaemonSwap master stop
Sep 26 11:09:55 grind12 LVSSyncDaemonSwap[16787]: info: ipvs_syncbackup up
Sep 26 11:09:55 grind12 LVSSyncDaemonSwap[16787]: info: ipvs_syncmaster
released
Sep 26 11:09:55 grind12 ResourceManager[16662]: debug:
/etc/ha.d/resource.d/LVSSyncDaemonSwap master stop done. RC=0
Sep 26 11:09:55 grind12 ResourceManager[16662]: info: Running
/etc/ha.d/resource.d/ldirectord ldirectord.cf stop
Sep 26 11:09:55 grind12 ResourceManager[16662]: debug: Starting
/etc/ha.d/resource.d/ldirectord ldirectord.cf stop
Sep 26 11:09:56 grind12 ResourceManager[16662]: debug:
/etc/ha.d/resource.d/ldirectord ldirectord.cf stop done. RC=0
Sep 26 11:09:56 grind12 heartbeat: [16649]: info: foreign HA resource
release completed (standby).
Sep 26 11:09:56 grind12 heartbeat: [16611]: info: Local standby process
completed [foreign].
Sep 26 11:09:58 grind12 heartbeat: [16611]: WARN: 1 lost packet(s) for
[grind11.graddelt.com] [1296:1298]
Sep 26 11:09:58 grind12 heartbeat: [16611]: info: remote resource
transition completed.
Sep 26 11:09:58 grind12 heartbeat: [16611]: info: No pkts missing from
grind11.graddelt.com!
Sep 26 11:09:58 grind12 heartbeat: [16611]: info: Other node completed
standby takeover of foreign resources.
Sep 26 11:09:58 grind12 ipfail: [16635]: debug: Other side is now stable.
[root@grind12 ~]# tail -f /var/log/ldirectord.log
[Wed Sep 26 11:09:44 2007|ldirectord.cf|16597] Invoking ldirectord
invoked as: /etc/ha.d/resource.d/ldirectord ldirectord.cf status
[Wed Sep 26 11:09:44 2007|ldirectord.cf|16597] Exiting with exit_status
3: Exiting from ldirectord status
[Wed Sep 26 11:09:56 2007|ldirectord.cf|16846] Invoking ldirectord
invoked as: /etc/ha.d/resource.d/ldirectord ldirectord.cf stop
[root@grind11 ~]# tail -f /var/log/ha_log
Sep 26 11:09:45 grind11 heartbeat: [11851]: info: Link
grind12.graddelt.com:eth3 up.
Sep 26 11:09:45 grind11 heartbeat: [11851]: info: Status update for node
grind12.graddelt.com: status init
Sep 26 11:09:45 grind11 heartbeat: [11851]: info: Status update for node
grind12.graddelt.com: status up
Sep 26 11:09:45 grind11 heartbeat: [11851]: debug:
StartNextRemoteRscReq(): child count 1
Sep 26 11:09:45 grind11 heartbeat: [11851]: debug: get_delnodelist:
delnodelist=
Sep 26 11:09:45 grind11 ipfail: [11860]: info: Link Status update: Link
grind12.graddelt.com/eth3 now has status up
Sep 26 11:09:45 grind11 ipfail: [11860]: info: Status update: Node
grind12.graddelt.com now has status init
Sep 26 11:09:45 grind11 heartbeat: [15508]: debug: notify_world: setting
SIGCHLD Handler to SIG_DFL
Sep 26 11:09:45 grind11 ipfail: [11860]: info: Status update: Node
grind12.graddelt.com now has status up
Sep 26 11:09:45 grind11 harc[15508]: info: Running /etc/ha.d/rc.d/status
status
Sep 26 11:09:45 grind11 heartbeat: [15525]: debug: notify_world: setting
SIGCHLD Handler to SIG_DFL
Sep 26 11:09:45 grind11 harc[15525]: info: Running /etc/ha.d/rc.d/status
status
Sep 26 11:09:46 grind11 heartbeat: [11851]: info: Status update for node
grind12.graddelt.com: status active
Sep 26 11:09:46 grind11 ipfail: [11860]: info: Status update: Node
grind12.graddelt.com now has status active
Sep 26 11:09:46 grind11 heartbeat: [15541]: debug: notify_world: setting
SIGCHLD Handler to SIG_DFL
Sep 26 11:09:46 grind11 ipfail: [11860]: debug: Got join message from
another ipfail client. (grind12.graddelt.com)
Sep 26 11:09:46 grind11 harc[15541]: info: Running /etc/ha.d/rc.d/status
status
Sep 26 11:09:46 grind11 heartbeat: [11851]: info: remote resource
transition completed.
Sep 26 11:09:46 grind11 heartbeat: [11851]: info: grind11.graddelt.com
wants to go standby [foreign]
Sep 26 11:09:46 grind11 ipfail: [11860]: info: Asking other side for
ping node count.
Sep 26 11:09:46 grind11 ipfail: [11860]: debug: Message [num_ping] sent.
Sep 26 11:09:46 grind11 ipfail: [11860]: debug: Other side is unstable.
Sep 26 11:09:47 grind11 ipfail: [11860]: debug: Other side is now stable.
Sep 26 11:09:47 grind11 heartbeat: [11851]: info: standby:
grind12.graddelt.com can take our foreign resources
Sep 26 11:09:47 grind11 heartbeat: [15557]: info: give up foreign HA
resources (standby).
Sep 26 11:09:47 grind11 heartbeat: [15557]: info: foreign HA resource
release completed (standby).
Sep 26 11:09:47 grind11 heartbeat: [11851]: info: Local standby process
completed [foreign].
Sep 26 11:09:47 grind11 heartbeat: [11851]: WARN: 1 lost packet(s) for
[grind12.graddelt.com] [13:15]
Sep 26 11:09:47 grind11 heartbeat: [11851]: info: remote resource
transition completed.
Sep 26 11:09:47 grind11 heartbeat: [11851]: info: No pkts missing from
grind12.graddelt.com!
Sep 26 11:09:47 grind11 heartbeat: [11851]: info: Other node completed
standby takeover of foreign resources.
Sep 26 11:09:47 grind11 ipfail: [11860]: debug: Other side is now stable.
Sep 26 11:09:48 grind11 ipfail: [11860]: debug: Other side is now stable.
Sep 26 11:09:52 grind11 ipfail: [11860]: info: No giveup timer to abort.
Sep 26 11:09:54 grind11 heartbeat: [11851]: info: grind12.graddelt.com
wants to go standby [foreign]
Sep 26 11:09:55 grind11 ipfail: [11860]: debug: Other side is unstable.
Sep 26 11:09:56 grind11 heartbeat: [11851]: info: standby: acquire
[foreign] resources from grind12.graddelt.com
Sep 26 11:09:56 grind11 heartbeat: [15570]: info: acquire local HA
resources (standby).
Sep 26 11:09:56 grind11 ResourceManager[15583]: info: Acquiring resource
group: grind11.graddelt.com ldirectord::ldirectord.cf
LVSSyncDaemonSwap::master IPaddr2::172.31.1.10/24/eth1/172.31.1.255
Sep 26 11:09:57 grind11 ResourceManager[15583]: info: Running
/etc/ha.d/resource.d/ldirectord ldirectord.cf start
Sep 26 11:09:57 grind11 ResourceManager[15583]: debug: Starting
/etc/ha.d/resource.d/ldirectord ldirectord.cf start
Sep 26 11:09:57 grind11 ResourceManager[15583]: debug:
/etc/ha.d/resource.d/ldirectord ldirectord.cf start done. RC=0
Sep 26 11:09:57 grind11 IPaddr2[15678]: INFO: Running OK
Sep 26 11:09:57 grind11 heartbeat: [15570]: info: local HA resource
acquisition completed (standby).
Sep 26 11:09:57 grind11 heartbeat: [11851]: info: Standby resource
acquisition done [foreign].
Sep 26 11:09:58 grind11 heartbeat: [11851]: info: remote resource
transition completed.
Sep 26 11:09:58 grind11 ipfail: [11860]: debug: Other side is now stable.
[root@grind11 ~]# ipvsadm
IP Virtual Server version 1.2.0 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 172.31.1.10:http rr persistent 600
-> grind11.graddelt.com:http Local 1 0 0
-------
COMMENT
-------
ipvsadm still doesn't show the second real server ... :-(