Hi all.
I'm trying to build a little cluster (two hosts) to serve mail (smtp,
pop, imap) and web services.
At the beginning everything worked without many problems, but now I
see strange ldirectord behavior:
This is host 1:
srv-cluster-1:~# ./test_cluster.sh
1: lo: <LOOPBACK,UP,10000> mtu 16436 qdisc noqueue
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:e0:18:fe:35:7d brd ff:ff:ff:ff:ff:ff
inet 192.168.30.120/24 brd 192.168.30.255 scope global eth0
inet6 fe80::2e0:18ff:fefe:357d/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:50:04:37:76:95 brd ff:ff:ff:ff:ff:ff
inet 192.168.30.121/24 brd 192.168.30.255 scope global eth1
inet6 fe80::250:4ff:fe37:7695/64 scope link
valid_lft forever preferred_lft forever
4: eth2: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:19:5b:5d:41:6e brd ff:ff:ff:ff:ff:ff
inet 10.0.0.1/24 brd 10.0.0.255 scope global eth2
inet6 fe80::219:5bff:fe5d:416e/64 scope link
valid_lft forever preferred_lft forever
5: sit0: <NOARP> mtu 1480 qdisc noop
link/sit 0.0.0.0 brd 0.0.0.0
ldirectord is stopped for /etc/ha.d/ldirectord.cf
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
master stopped
version: 8.0.4 (api:86/proto:86)
SVN Revision: 2947 build by root@srv-cluster-1, 2007-07-23 18:48:08
0: cs:Connected st:Primary/Primary ds:UpToDate/UpToDate C r---
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0
And this is host 2:
srv-cluster-2:~# ./test_cluster.sh
1: lo: <LOOPBACK,UP,10000> mtu 16436 qdisc noqueue
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 100
link/ether 00:1b:78:57:19:60 brd ff:ff:ff:ff:ff:ff
inet 192.168.30.122/24 brd 192.168.30.255 scope global eth0
inet 192.168.30.253/24 brd 192.168.30.255 scope global secondary eth0
inet6 fe80::21b:78ff:fe57:1960/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 100
link/ether 00:1b:78:57:19:61 brd ff:ff:ff:ff:ff:ff
inet 192.168.30.123/24 brd 192.168.30.255 scope global eth1
inet6 fe80::21b:78ff:fe57:1961/64 scope link
valid_lft forever preferred_lft forever
4: eth2: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:1b:78:37:bc:49 brd ff:ff:ff:ff:ff:ff
inet 10.0.0.2/24 brd 10.0.0.255 scope global eth2
inet6 fe80::21b:78ff:fe37:bc49/64 scope link
valid_lft forever preferred_lft forever
5: sit0: <NOARP> mtu 1480 qdisc noop
link/sit 0.0.0.0 brd 0.0.0.0
ldirectord stale pid file /var/run/ldirectord.ldirectord.cf.pid for
/etc/ha.d/ldirectord.cf
ldirectord is stopped for /etc/ha.d/ldirectord.cf
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 192.168.30.253:143 wlc persistent 600
-> 192.168.30.120:143 Masq 0 0 0
-> 192.168.30.122:143 Local 0 0 0
-> 127.0.0.1:143 Local 0 0 0
TCP 192.168.30.253:80 rr
-> 192.168.30.122:80 Local 1 0 0
-> 192.168.30.120:80 Route 1 0 0
TCP 192.168.30.253:25 wlc persistent 600
-> 192.168.30.120:25 Tunnel 1 0 0
-> 192.168.30.122:25 Local 1 0 0
master running
(ipvs_syncmaster pid: 3698)
version: 8.0.4 (api:86/proto:86)
SVN Revision: 2947 build by root@srv-cluster-2, 2007-07-23 10:45:53
0: cs:Connected st:Primary/Primary ds:UpToDate/UpToDate C r---
ns:0 nr:28672 dw:28672 dr:0 al:0 bm:12 lo:0 pe:0 ua:0 ap:0
resync: used:0/31 hits:1786 misses:6 starving:0 dirty:0 changed:6
act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0
If I try to restart heartbeat (which should also restart ldirectord),
both nodes end up in the same state:
ldirectord stale pid file /var/run/ldirectord.ldirectord.cf.pid for
/etc/ha.d/ldirectord.cf
ldirectord is stopped for /etc/ha.d/ldirectord.cf
master stopped
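A "stale pid file" message usually means the pid file still exists but the process it names is gone. The check can be reproduced by hand with a rough Python sketch like the one below. It assumes the pid file holds a single numeric PID; the helper name pid_file_state is made up for illustration and is not part of ldirectord.

```python
import os


def pid_file_state(path):
    """Classify a pid file as 'missing', 'invalid', 'stale' or 'running'."""
    try:
        with open(path) as fh:
            text = fh.read().strip()
    except FileNotFoundError:
        return "missing"

    # Assumption: the file contains exactly one positive integer PID.
    try:
        pid = int(text)
    except ValueError:
        return "invalid"
    if pid <= 0:
        return "invalid"

    try:
        os.kill(pid, 0)  # signal 0 delivers nothing, it only probes the PID
    except ProcessLookupError:
        return "stale"  # file exists, but no such process
    except PermissionError:
        return "running"  # process exists but belongs to another user
    return "running"


if __name__ == "__main__":
    # Path taken from the status output above.
    print(pid_file_state("/var/run/ldirectord.ldirectord.cf.pid"))
```

If this reports "stale" on both nodes, ldirectord started and then exited (or was killed) without cleaning up, so its log is the next place to look.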
Here are my configs:
srv-cluster-1:~# cat /etc/ha.d/ha.cf
logfacility local0
bcast eth2 # Linux
mcast eth2 225.0.0.1 694 1 0
auto_failback off
node srv-cluster-1
node srv-cluster-2
respawn hacluster /usr/lib/heartbeat/ipfail
apiauth ipfail gid=haclient uid=hacluster
srv-cluster-1:~# cat /etc/ha.d/haresources
srv-cluster-1 \
ldirectord::ldirectord.cf \
LVSSyncDaemonSwap::master \
IPaddr2::192.168.30.253/24/eth0/192.168.30.255
srv-cluster-1:~# cat /etc/ha.d/ldirectord.cf
checktimeout=10
checkinterval=2
autoreload=no
logfile="local0"
quiescent=yes
virtual=192.168.30.253:80
real=192.168.30.120:80 gate
real=192.168.30.122:80 gate
fallback=127.0.0.1:80 gate
service=http
request="ldirector.html"
receive="Test Page"
scheduler=rr
protocol=tcp
checktype=negotiate
virtual=192.168.30.253:25
real=192.168.30.120:25 ipip
real=192.168.30.122:25 ipip
fallback=127.0.0.1:25
service=smtp
scheduler=wlc
persistent=600
protocol=tcp
virtual=192.168.30.253:143
real=127.0.0.1:143 masq
real=192.168.30.120:143 masq
real=192.168.30.122:143 masq
fallback=127.0.0.1:143
service=imap
scheduler=wlc
#login="test"
#passwd="test"
persistent=600
persistent=600
protocol=tcp
Obviously both nodes have the same configuration.
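
Since the file is long, a quick sanity check for options set more than once inside a virtual= block may help when comparing the two nodes. This is only a sketch: it assumes that real= is the only option meant to repeat within a block, it ignores indentation, and find_duplicate_options is a hypothetical helper, not an ldirectord tool.

```python
from collections import Counter

# Assumption: only real= is expected to appear several times per block.
REPEATABLE = {"real"}


def find_duplicate_options(text):
    """Map each virtual service to the options it sets more than once."""
    blocks = {"(global)": Counter()}
    current = "(global)"
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key = key.strip()
        if key == "virtual":
            # A virtual= line opens a new block; later options belong to it.
            current = value.strip()
            blocks.setdefault(current, Counter())
            continue
        blocks[current][key] += 1

    duplicates = {}
    for name, counts in blocks.items():
        repeated = sorted(
            k for k, n in counts.items() if n > 1 and k not in REPEATABLE
        )
        if repeated:
            duplicates[name] = repeated
    return duplicates


if __name__ == "__main__":
    sample = """\
checktimeout=10
virtual=192.168.30.253:143
real=192.168.30.120:143 masq
real=192.168.30.122:143 masq
service=imap
persistent=600
persistent=600
protocol=tcp
"""
    print(find_duplicate_options(sample))
```

To run it against the real file, read /etc/ha.d/ldirectord.cf and pass its contents to the function on each node.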
Thanks all
Pier
------
"I disapprove of what you say, but I will defend to the death your right
to say it"
Evelyn Beatrice Hall