Re: question on faq 4.18

To: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: question on faq 4.18
From: Roberto Nibali <ratz@xxxxxxxxxxxx>
Date: Sun, 22 Jan 2006 13:28:51 +0100
Hello,

Before CARP existed I had hoped to use Alexandre's keepalived as a VRRPd to control failover on a pair of routers running services that I wanted to fail over together with the routers, e.g. dhcpd, squid, firewall, but keepalived was too tied to LVS to do it easily.

On top of that I believe there is still a patent issue with regard to VRRP. Cisco is holding it. You could try to use other routing protocols to achieve router failover, such as HSRP or OSPF.

I knew I could do it with Horms' code. I just wanted to try it with VRRPd. I never revisited the issue when CARP came along. Maybe it can do what I want.

I have never played with ucarp (the Linux version); however, the OpenBSD version works like a charm. The thing here is that lots of people use linux-ha to set up highly available servers, but linux-ha does not (yet) have a means to properly detect network failures. Ping just does not cut it :).

If someone does something with VRRP/CARP and LVS it would be nice if the VRRP wasn't directly tied to LVS, i.e. the VRRP could be used for other failovers as well.

Well, these are just router protocols that exchange state information and decide who is master, backup or renter. linux-ha provides enough infrastructure to build all kinds of failover setups. It would be nice if dynamic routing protocols were better integrated into linux-ha.

I've just recently set up a 2208 switch using one VSR and 2 VIRs, doing failover when either the link or the DGW is no longer reachable. The sexy thing about this setup is that you don't need to fiddle around with ARP problems and you don't need NAT, so balancing schedulers can get meaningful L7 information.

how do they detect media/gw failure?

According to RFC816 and RFC1122 there are multiple ways to perform DGD, however I've only seen about 3 of those in the wild:

                 o    Link-layer information that reliably detects and
                      reports host failures (e.g., ARPANET Destination
                      Dead messages) should be used as negative advice.

                 o    An ICMP Redirect message from a particular gateway
                      should be used as positive advice about that
                      gateway.

                 o    Packets arriving from a particular link-layer
                      address are evidence that the system at this
                      address is alive.  However, turning this
                      information into advice about gateways requires
                      mapping the link-layer address into an IP address,
                      and then checking that IP address against the
                      gateways pointed to by the route cache.  This is
                      probably prohibitively inefficient.

The Alteon switch does media detection and could also listen to special L2 PDU packets, including advertisements. Media detection under Linux is an often discussed and, to date, unresolved issue. For about 2 months starting last November, a couple of people on netdev have been working on proper link state propagation in the core kernel; the result will be seen in 2.6.17 ;). Other than that I suggest you use non-cheap but excellently supported NICs, like the e1000, and check the media state using ethtool or write a netlink listener.
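
To give an idea of what such a netlink listener could look like, here is a minimal sketch in Python. It assumes a Linux kernel that reports IFF_LOWER_UP in the rtnetlink interface flags, which as far as I understand is part of the link state work mentioned above. The function names parse_link_events and listen are made up for this example; the numeric constants are the values from the kernel headers.

```python
import socket
import struct

# Values from <linux/rtnetlink.h> and <linux/if.h>.
RTM_NEWLINK = 16        # message type: link created or changed
RTMGRP_LINK = 1         # multicast group carrying link events
IFF_LOWER_UP = 0x10000  # driver signals L1 (carrier) up

NLMSGHDR = struct.Struct("=IHHII")    # len, type, flags, seq, pid
IFINFOMSG = struct.Struct("=BxHiII")  # family, pad, type, index, flags, change


def parse_link_events(data):
    """Yield (ifindex, carrier_up) for every RTM_NEWLINK message in data."""
    offset = 0
    while offset + NLMSGHDR.size <= len(data):
        msg_len, msg_type, _flags, _seq, _pid = NLMSGHDR.unpack_from(data, offset)
        if msg_len < NLMSGHDR.size or offset + msg_len > len(data):
            break  # truncated or malformed message, stop parsing
        if msg_type == RTM_NEWLINK and msg_len >= NLMSGHDR.size + IFINFOMSG.size:
            _family, _type, ifindex, ifi_flags, _change = IFINFOMSG.unpack_from(
                data, offset + NLMSGHDR.size
            )
            yield ifindex, bool(ifi_flags & IFF_LOWER_UP)
        offset += (msg_len + 3) & ~3  # netlink messages are 4-byte aligned


def listen():
    """Print carrier changes until interrupted (Linux only)."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, socket.NETLINK_ROUTE)
    sock.bind((0, RTMGRP_LINK))  # pid 0: let the kernel assign one
    while True:
        for ifindex, up in parse_link_events(sock.recv(65535)):
            name = socket.if_indextoname(ifindex)
            print(name, "carrier up" if up else "carrier down")
```

Calling listen() and then pulling a cable should print a line per link event. The parser is kept separate from the socket so it can be fed captured or hand-built messages.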

(
note: - You aren't allowed to ping - RFC1122

You are allowed, but only if nothing else works for you (3.3.1.4):

            *    Active probes such as "pinging" (i.e., using an ICMP
                 Echo Request/Reply exchange) are expensive and scale
                 poorly.  In particular, hosts MUST NOT actively check
                 the status of a first-hop gateway by simply pinging the
                 gateway continuously.

            *    Even when it is the only effective way to verify a
                 gateway's status, pinging MUST be used only when
                 traffic is being sent to the gateway and when there is
                 no other positive indication to suggest that the
                 gateway is functioning.

            *    To avoid pinging, the layers above and/or below the
                 Internet layer SHOULD be able to give "advice" on the
                 status of route cache entries when either positive
                 (gateway OK) or negative (gateway dead) information is
                 available.
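
Read as a decision rule, the three points above could be sketched like this. It is only an illustration of my reading of the RFC text, not code from any existing tool; the class GatewayState, its method names and the two timeout values are invented for the example.

```python
import time


class GatewayState:
    """Tracks whether pinging a first-hop gateway is currently justified."""

    def __init__(self, traffic_ttl=10.0, advice_ttl=30.0, clock=time.monotonic):
        self.traffic_ttl = traffic_ttl  # how long sent traffic counts as "being sent"
        self.advice_ttl = advice_ttl    # how long positive advice stays valid
        self.clock = clock
        self.last_sent = None
        self.last_positive = None

    def note_traffic_sent(self):
        """Call when a packet is routed via this gateway."""
        self.last_sent = self.clock()

    def note_positive_advice(self):
        """Call when another layer reports the gateway as working."""
        self.last_positive = self.clock()

    def should_ping(self):
        """Ping only if traffic is flowing and nothing else confirms the gateway."""
        now = self.clock()
        sending = (
            self.last_sent is not None
            and now - self.last_sent <= self.traffic_ttl
        )
        confirmed = (
            self.last_positive is not None
            and now - self.last_positive <= self.advice_ttl
        )
        return sending and not confirmed
```

An idle gateway is never probed, and a gateway that upper or lower layers keep vouching for is not probed either, which is the point of the "advice" mechanism.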

http://www.austintek.com/LVS/LVS-HOWTO/HOWTO/LVS-HOWTO.dynamic_routing.html#dead_gateway

Reading rfc1122 was faster :).

Cheers Joe,
Roberto Nibali, ratz
--
echo '[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq' | dc
