On Wed, 2 Mar 2005 Mack.Joseph@xxxxxxxxxxxxxxx wrote:
> sure. It's a nightmare of scripts that will probably fail under some
> condition you don't expect and will be impossible for someone else
> to debug. An idle server is a small price to pay for robustness.
> In the early days of LVS we thought it would be trivial to promote
> a realserver to be a director when the director failed, by running
> a few setup scripts. While simple in concept, it turned out that it
> was simpler just to have a standby director just parked there
> waiting to take over. Brute force and low tech usually win when
> reliability is required.
I've got a 3-node DNS system using LVS-DR, where all 3 nodes are directors and
realservers simultaneously. I'm using keepalived to manage it all and do the
failover, with a single script running when keepalived transitions from MASTER
to BACKUP or FAULT and back again.
It uses iptables to add an fwmark on the incoming requests, then uses the
fwmark check for the LVS. Basic configuration is as follows:
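(For reference, the marking rules look something like the following sketch. The
addresses and mark value 5 are taken from the config below; the UDP rule and
its mark value 6 are my assumption for the snipped UDP virtual_server, so
adjust to whatever your own config uses.)

```shell
# Mark inbound TCP DNS traffic to the resolver network with fwmark 5,
# so LVS can match it with "virtual_server fwmark 5" below.
iptables -t mangle -A PREROUTING -d 5.6.7.0/24 -p tcp --dport 53 \
        -j MARK --set-mark 5
# UDP queries get their own mark for the (snipped) UDP virtual_server;
# mark 6 here is illustrative, not from the original config.
iptables -t mangle -A PREROUTING -d 5.6.7.0/24 -p udp --dport 53 \
        -j MARK --set-mark 6
```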
global_defs {
<snipped notifications>
lvs_id DNS02
}
static_routes {
# backend management LAN
1.2.0.0/16 via 1.2.0.126 dev eth0
}
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! VRRP synchronisation
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
vrrp_sync_group SYNC1 {
group {
DNS_OUT
GW_IN
}
}
vrrp_instance DNS_OUT {
state MASTER
interface eth0
track_interface {
eth1
}
lvs_sync_daemon_interface eth0
virtual_router_id 111
priority 100
advert_int 5
smtp_alert
virtual_ipaddress {
5.6.7.1 dev eth1
5.6.7.2 dev eth1
}
virtual_ipaddress_excluded {
5.6.7.8 dev eth1
5.6.7.9 dev eth1
}
virtual_routes {
}
notify_master "/usr/local/bin/transitions MASTER"
notify_backup "/usr/local/bin/transitions BACKUP"
notify_fault "/usr/local/bin/transitions FAULT"
}
vrrp_instance GW_IN {
state MASTER
garp_master_delay 10
interface eth0
track_interface {
eth0
}
lvs_sync_daemon_interface eth0
virtual_router_id 11
priority 100
advert_int 5
smtp_alert
virtual_ipaddress {
1.2.0.125 dev eth0
}
virtual_routes {
}
}
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! DNS TCP
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
virtual_server fwmark 5 {
smtp_alert
delay_loop 30
lb_algo wlc
lb_kind DR
persistence_timeout 0
protocol TCP
real_server 1.2.0.2 53 {
weight 10
inhibit_on_failure
TCP_CHECK {
connect_timeout 10
connect_port 53
}
MISC_CHECK {
misc_path "/usr/bin/dig @1.2.0.2 -p 53 known_zone soa"
misc_timeout 10
}
}
<snip other realservers>
<snip UDP realservers>
...Where /usr/local/bin/transitions is:
#!/bin/bash
IPLIST="/etc/resolver_ips"
IPCMD="/sbin/ip addr"
if [ ! -f "$IPLIST" ]
then
echo "No resolver list found, exiting"
exit 127
fi
if [ -n "$1" ]
then
SWITCH="$1"
else
# No command, quit
echo "No command given, exiting"
exit 126
fi
if [ "$SWITCH" = "MASTER" ]
then
DO="del"
elif [ "$SWITCH" = "BACKUP" -o "$SWITCH" = "FAULT" ]
then
DO="add"
else
# Unknown command, quit
echo "Invalid command given, exiting"
exit 126
fi
if [ $DO = "add" ]
then
# we cycle through and make the IPs in /etc/resolver_ips loopback live
# We're in BACKUP or FAULT here
for addr in `cat $IPLIST`
do
$IPCMD $DO $addr dev lo
done
/sbin/route del -net 5.6.7.0 netmask 255.255.255.0 dev eth1
/usr/bin/killall -HUP named
elif [ $DO = "del" ]
then
# we do the reverse
# We're in MASTER here
for addr in `cat "$IPLIST"`
do
$IPCMD $DO "$addr" dev lo
done
/sbin/route add -net 5.6.7.0 netmask 255.255.255.0 dev eth1
/usr/bin/killall -HUP named
else
echo "Something is wrong, exiting"
exit 125
fi
### EOF /usr/local/bin/transitions
...and /etc/resolver_ips contains:
5.6.7.1/32
5.6.7.2/32
5.6.7.3/32
5.6.7.4/32
...and in /etc/sysctl.conf we have (amongst other things):
# Don't hide mac addresses from arping out of each interface
net.ipv4.conf.all.arp_filter = 0
# Enable configuration of hidden devices
net.ipv4.conf.all.hidden = 1
# Make the loopback device hidden
net.ipv4.conf.lo.hidden = 1
So in normal operation we have a single MASTER and two BACKUP directors, where
the MASTER has the "resolver" IP addresses on its "external" NIC and the BACKUP
directors have them on the loopback adapter. On failover, the transitions
script moves them from loopback to NIC or vice versa.
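(In other words, the script is just shuffling addresses. A sketch of the two
states, using the first address from /etc/resolver_ips above; these are the
commands the script and keepalived end up running, not extra steps:)

```shell
# BACKUP/FAULT: resolver IPs live on the hidden loopback, so the node can
# answer DNS for them as a realserver without ARPing for the VIPs:
ip addr add 5.6.7.1/32 dev lo

# MASTER: the IPs come off loopback (keepalived raises them on eth1) and
# the connected route is restored so the director forwards for the subnet:
ip addr del 5.6.7.1/32 dev lo
route add -net 5.6.7.0 netmask 255.255.255.0 dev eth1
```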
The DNS server processes themselves are serving in excess of 880,000 zones
using the DLZ patch to BIND, so startup times for the cluster as a whole are
really very short (it can be cold-started in a matter of minutes).
In practice the system can cope with many thousands of queries per minute
without breaking a sweat, and fails over from server to server without a
problem.
You might think that this is an unmanageable methodology and is impossible to
understand, but I think it works rather well :)
Graeme