Re: [lvs-users] recommendations on stonith?

To: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: [lvs-users] recommendations on stonith?
From: Dan Yocum <yocum@xxxxxxxx>
Date: Wed, 12 Dec 2007 10:52:40 -0600
Hi Joe,

I think you're reinforcing my back-of-the-envelope risk assessment: in
our environment the risk is low, but the impact would be high.  Also,
adding stonith would certainly add another layer of complexity, and
potentially more points of failure.

But let me ask this pointed question: has anyone ever experienced, or
heard of, an incident where both the active and passive directors went
insane and each became active, bringing up the VIPs on their interfaces
(i.e., they both respond to arp requests from the router)?  This is my
"biggest" concern and it's not that big to begin with.  This would be in
a direct routing configuration; I'm not concerned with NAT or TUN.
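
To make the symptom concrete: if both directors bring up the VIP, the
router sees arp replies for one IP coming from two different MACs.  Here
is a minimal sketch in Python of that check, assuming the arp replies
have already been captured and parsed into (ip, mac) pairs by some other
means.  The function name, the input format and the sample addresses are
all hypothetical, not part of any LVS or heartbeat tool.

```python
from collections import defaultdict

def find_duplicate_claims(arp_replies):
    """Return {ip: sorted list of MACs} for every IP claimed by more than one MAC.

    arp_replies: iterable of (ip, mac) pairs taken from observed arp replies
    (assumed input format; how the replies are captured is out of scope here).
    """
    macs_by_ip = defaultdict(set)
    for ip, mac in arp_replies:
        # Normalise case so the same MAC written two ways counts once.
        macs_by_ip[ip].add(mac.lower())
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}

# Hypothetical sample: 192.0.2.10 is the VIP, answered by two machines.
replies = [
    ("192.0.2.10", "00:11:22:33:44:01"),
    ("192.0.2.10", "00:11:22:33:44:02"),
    ("192.0.2.20", "00:11:22:33:44:03"),
    ("192.0.2.20", "00:11:22:33:44:03"),
]
conflicts = find_duplicate_claims(replies)
print(conflicts)
```

In a direct routing setup a realserver that wrongly answers arp for the
VIP would show up the same way, so a hit here says only that more than
one machine is claiming the VIP, not which kind of machine it is.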

Thanks,
Dan


Joseph Mack NA3T wrote:
> On Mon, 10 Dec 2007, Dan Yocum wrote:
> 
>> What are your recommendations on stonith and LVS director 
>> failovers?  Is it useful or not?
> 
> people ran without it for years. But then people didn't have 
> good backups back then either. How important is your setup: 
> are you hosting a 1G$ or 1k$ business setup? Is it to run 
> unattended or will people be looking at logs? Do you run 
> smartmon on your disks and pre-emptively remove disks at 
> 2yrs (even if they're working perfectly) or do you let them 
> fail? Do you failout your fans after a year or so? Are you 
> running 5 9's or 1 9?
> 
> High end commodity hardware isn't too bad nowadays and 
> pre-emptive removal of parts that spin/move helps a lot. It 
> seems like many of the failures are stupidity (pulling 
> plugs, the ISP replaces/reconfigures the router) and no 
> amount of stonith will fix that.
> 
> Do you trust stonith? Is it a factor of 10 more reliable 
> than the failures you expect?
> 
> Joe

-- 
Dan Yocum
Fermilab  630.840.6509
yocum@xxxxxxxx, http://fermigrid.fnal.gov
Fermilab.  Just zeros and ones.

