Re: [lvs-users] recommendations on stonith?

To: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: [lvs-users] recommendations on stonith?
From: Graeme Fowler <graeme@xxxxxxxxxxx>
Date: Wed, 12 Dec 2007 17:09:44 +0000
On Wed, 2007-12-12 at 10:52 -0600, Dan Yocum wrote:
> But, let me ask this pointed question: has anyone ever experienced, or 
> heard of an incident, where both the active and passive director went 
> insane and each became active, bringing up the VIPs on their interfaces 
> (i.e., they both respond to arp requests from the router)?

Yes, I have.

It was a complex network where the two keepalived directors were each
connected to a different Cisco Cat6509 switch with a multi-port gig
interconnect between the two carrying all the VLANs - essentially one
big switch in two parts.

In turn, the Cats were connected to different upstream routers (which in
turn were cross-connected). This was designed to be a very robust
network - bits could fail but the packets would route or switch around
the failure...

...only on one occasion, the gig interconnect went bananas and
segregated the two Cats. This meant the VRRP announcements from each
director never reached the other, so both directors became MASTER. At
this point very strange things happened, since as MASTER they both
became the default gateway for traffic leaving the cluster (this was a
NAT setup). The routers could see ARP flip-flops, but the Cats couldn't.
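
To illustrate the failure mode described above, here is a minimal Python sketch of two VRRP-style directors that share one link for announcements. It is a simplified model and not keepalived's implementation: the names (Director, simulate, director-a, director-b), the priorities and the one-second tick are all assumptions made for the example, and the three-tick timeout only approximates VRRP's master-down interval (roughly three advertisement intervals, ignoring skew time).

```python
from dataclasses import dataclass


@dataclass
class Director:
    """Hypothetical, simplified VRRP participant (not keepalived's code)."""
    name: str
    priority: int
    state: str = "BACKUP"
    silent_for: int = 0  # ticks since a higher-priority advert was last heard

    def tick(self, heard_higher_priority_advert: bool,
             master_down_interval: int = 3) -> None:
        if heard_higher_priority_advert:
            # A better MASTER is visible: stay (or fall back to) BACKUP.
            self.silent_for = 0
            self.state = "BACKUP"
        else:
            self.silent_for += 1
            # Assumed timeout: promote after 3 silent ticks.
            if self.silent_for >= master_down_interval:
                self.state = "MASTER"


def simulate(link_up_per_tick):
    """Return the (a.state, b.state) pair after each tick.

    link_up_per_tick is a sequence of booleans saying whether the
    announcement link between the two directors works in that tick.
    """
    a = Director("director-a", priority=150, state="MASTER")
    b = Director("director-b", priority=100)
    history = []
    for link_up in link_up_per_tick:
        # Only a MASTER sends adverts; decide who sends before updating state.
        a_sends = a.state == "MASTER"
        b_sends = b.state == "MASTER"
        a.tick(link_up and b_sends and b.priority > a.priority)
        b.tick(link_up and a_sends and a.priority > b.priority)
        history.append((a.state, b.state))
    return history


if __name__ == "__main__":
    # Three healthy ticks, three partitioned ticks, then the link returns.
    for step, states in enumerate(simulate([True] * 3 + [False] * 3 + [True]), 1):
        print(step, states)
```

In this model the backup promotes itself on the third silent tick of the partition, at which point both directors are MASTER, and it drops back to BACKUP as soon as it hears the higher-priority director again. Neither director is misbehaving; each is reacting correctly to what it can see.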

All very messy. In order to fix it temporarily I had to do a STONITH of
sorts, by stopping keepalived on one director.

All that said, it wasn't the fault of either director - it was my design
and reliance on a network with a level of complexity that meant the
condition was possible. Since then I've tried to keep the announcement
interfaces as close to each other as possible!

Graeme


