RE: Expected Failover Time and Configuration Limits.

To: "'LinuxVirtualServer.org users mailing list.'" <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: RE: Expected Failover Time and Configuration Limits.
From: Peter Mueller <pmueller@xxxxxxxxxxxx>
Date: Mon, 15 Sep 2003 16:46:27 -0700
Hello,

Plaintext to mailing lists please.

> We are running a dual lvs-NAT setup (2x dual 733 dell systems with 1gb
> ram)
> Redhat 9 install with modified kernel. (uname -r  >
> 2.4.20-18.7.hidd.ipvs109.cipe154smp)
> ipvsadm --version > ipvsadm v1.21 2002/11/12 (compiled with popt and IPVS
> v1.0.9)
> We are running heartbeat+mon for failover between directors.
> Our haresources file currently houses 144 IP addresses.

OK on all of this.

> We are also running ospf and zebra on the directors.

FYI, you might want to use Quagga (www.quagga.net) for OSPF.  Quagga is a
fork of Zebra, and OSPF is AFAIK better supported there.  It shouldn't make
a difference for failover time, but stability and bugfixes might be better
with Quagga.

> During our early testing and early production our failover
> was virtually unnoticeable.  Since we have added ospf and zebra
> as well as the majority of those entries in our haresources
> file our failover time has hit somewhere around +/-10 minutes.
> Can anybody tell me if that would be normal for the amount of
> resources we are failing over or if it hints to a possible problem?

That is extremely high.  It should be nowhere near that.  How are you
measuring the failover interval?
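
If you don't have a measurement yet, here is a rough sketch of one way to
get a number.  It assumes a TCP service is reachable on one of the VIPs;
the host, port, and function names are placeholders of mine, not anything
from your setup.  It probes once a second and reports the longest stretch
during which the VIP did not answer.

```python
import socket
import time


def probe(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def longest_outage(samples):
    """Longest gap in seconds between a first failed probe and the next success.

    samples is a time-ordered iterable of (timestamp, ok) pairs.  An outage
    that is still in progress at the end of the samples is not counted.
    """
    longest = 0.0
    down_since = None
    for ts, ok in samples:
        if not ok and down_since is None:
            down_since = ts              # outage starts here
        elif ok and down_since is not None:
            longest = max(longest, ts - down_since)
            down_since = None            # outage is over
    return longest


def watch(host, port, duration=900.0, interval=1.0):
    """Probe host:port every `interval` seconds for `duration` seconds."""
    samples = []
    end = time.time() + duration
    while time.time() < end:
        samples.append((time.time(), probe(host, port)))
        time.sleep(interval)
    return samples
```

Start watch() against a VIP from a client, trigger the failover, and pass
the result to longest_outage().  The resolution is no better than the probe
interval plus the connect timeout.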

> Is there an easy way to tell if the grat. arp is working?

Tcpdump is your friend :)
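
Something like "tcpdump -n -e -i eth0 arp" on a client or on the standby
director (eth0 is an assumption) will show the ARP traffic during the
failover.  What you are looking for is ARP packets whose sender IP and
target IP are both the VIP.  As a sketch of that check, assuming you have
the raw 28-byte Ethernet/IPv4 ARP payload in hand (the function names are
mine):

```python
import socket
import struct

# Ethernet/IPv4 ARP payload: htype, ptype, hlen, plen, oper,
# sender MAC, sender IP, target MAC, target IP.
ARP_FMT = "!HHBBH6s4s6s4s"
ARP_LEN = struct.calcsize(ARP_FMT)  # 28 bytes


def parse_arp(payload):
    """Decode an Ethernet/IPv4 ARP payload into a dict of its fields."""
    if len(payload) < ARP_LEN:
        raise ValueError("ARP payload too short")
    htype, ptype, hlen, plen, oper, sha, spa, tha, tpa = struct.unpack(
        ARP_FMT, payload[:ARP_LEN]
    )
    if (htype, ptype, hlen, plen) != (1, 0x0800, 6, 4):
        raise ValueError("not an Ethernet/IPv4 ARP packet")
    return {
        "oper": oper,  # 1 = request, 2 = reply
        "sender_mac": sha.hex(":") if hasattr(sha, "hex") else sha,
        "sender_ip": socket.inet_ntoa(spa),
        "target_ip": socket.inet_ntoa(tpa),
    }


def is_gratuitous(payload):
    """True if the sender and target protocol addresses are the same."""
    fields = parse_arp(payload)
    return fields["sender_ip"] == fields["target_ip"]
```

With 144 addresses you would expect to see such a packet for each VIP,
carrying the new director's MAC as the sender, shortly after the takeover.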

> Also we noticed something that seemed quite strange
> on the virtual interfaces (eth0:##) after about eth0:42
> it begins skipping every other interface number
> (eth0:42,eth0:44,eth0:46...) and then sometime later
> changes to the odd number interfaces and skips the evens.

What version of heartbeat are you running?  Can you post your .conf files?
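
When you post, it would also help to list exactly which alias numbers are
missing.  A small sketch for that, assuming you have already collected the
interface names (for example from ifconfig output) into a list; the
function name is made up for illustration:

```python
import re


def missing_alias_numbers(names, base="eth0"):
    """Return the alias numbers skipped between the lowest and highest
    alias of `base` (e.g. eth0:42, eth0:44 -> [43])."""
    pattern = re.compile(r"^" + re.escape(base) + r":(\d+)$")
    seen = set()
    for name in names:
        match = pattern.match(name)
        if match:
            seen.add(int(match.group(1)))
    if not seen:
        return []
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]
```

Running it on both directors before and after a failover would show
whether the gaps move with the takeover.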

> I can provide any information you might need to assist you in helping us
> out.

Please provide tcpdumps from a client and a director: about 1 minute
starting right when you initiate a failover, and about 1 minute from right
before it's done.

P