* Joseph Mack (mack.joseph@xxxxxxx) [020523 06:59]:
>
> Does anyone have director failover in a production setup, when
> directors have failed and changed over automatically and you've
> only found out about it when the failover mechanism notified
> you (rather than the customers notifying you)?
>
> If so, I'd like to put the info into the HOWTO. Please let
> me know what failed, what system you're using for failover,
> and anything else of interest.

Yes. We run a pair of load balancers in front of 3 (well,
now 5) real http/https webservers, using keepalived.

In earlier versions of LVS, a memory leak problem caused a
failover to occur about once every three days (might have been
0.9.8 with keepalived 0.4.9 + local patches).

We're on LVS 1.0.2 and keepalived 0.5.6 now, with no problems,
except that we don't quite have an auto failback mechanism
that works correctly.

We preserve connections quite nicely during the failover
from the master to the backup. However, once in that state,
if the master comes back up, it takes over without capturing
the connection states from the backup. I believe that
Alexandre is close to solving this if he hasn't already;
frankly, we've been concentrating on other pieces of our
infrastructure, and since we've had no failures since we
upgraded versions, we haven't been keeping up.
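
The failback gap described above can be illustrated with a toy model. This is only a sketch, not LVS or keepalived code: the `Director` class, the `sync_to` parameter, and the client names are hypothetical, and it assumes connection state is copied in one direction only (active master to backup) and that the master's table is empty after it restarts.

```python
class Director:
    """Toy stand-in for a director; it only tracks a connection table."""

    def __init__(self, name):
        self.name = name
        self.connections = set()

    def accept(self, conn, sync_to=None):
        """Record a connection and optionally copy its state to a peer."""
        self.connections.add(conn)
        if sync_to is not None:
            sync_to.connections.add(conn)


master = Director("master")
backup = Director("backup")

# Normal operation: the master is active and syncs state to the backup.
master.accept("client-a", sync_to=backup)

# The master fails. Assumption: it restarts with an empty table.
master.connections = set()

# The backup takes over and already knows client-a, so failover is clean.
survived_failover = "client-a" in backup.connections

# While active, the backup accepts a new connection. Nothing is synced back.
backup.accept("client-b")

# The master returns and takes over without the backup's state.
lost_on_failback = sorted(backup.connections - master.connections)

print(survived_failover)   # True
print(lost_on_failback)    # ['client-a', 'client-b']
```

In this model, failover keeps every connection, while failback drops everything the backup was tracking, which matches the behaviour reported here.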

We're relatively small, serving up between 0.5 and 2.5
T1s' worth of traffic. The balancers are built from
Dell 2350s with 600 MHz PIII and 128 MB, with DE570TX
quad tulip cards in each.
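
For a sense of scale, the traffic figure can be converted to megabits per second. This is a small sketch that assumes the standard T1 line rate of 1.544 Mbit/s; the function name `t1_to_mbit` is made up for illustration.

```python
# A T1 line carries 1.544 Mbit/s (24 channels of 64 kbit/s plus 8 kbit/s framing).
T1_MBIT_PER_S = 1.544


def t1_to_mbit(t1_count: float) -> float:
    """Convert a number of T1 lines to megabits per second."""
    return t1_count * T1_MBIT_PER_S


low = t1_to_mbit(0.5)
high = t1_to_mbit(2.5)
print(f"{low:.2f} to {high:.2f} Mbit/s")  # 0.77 to 3.86 Mbit/s
```

So the cluster is handling roughly 0.8 to 3.9 Mbit/s.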

We run NAT, with an external interface that provides
a non-routable IP address (there's a separate firewall
up front before the web cluster), an internal interface
to our web servers, an internal interface to our
admin / backup network, and an interface on a crossover
cable to the other balancer used for connection sync
data. We could consolidate some of these, but since
NICs are cheap, it keeps everything conceptually simple
and easy to sniff to prove it's clean.

When we hit our next traffic plateau, we'll likely
move to DR.

Big thanks to all of you.

-Brad