Additional info on RedPaper REDP-3657-00

To:	vic.cross@xxxxxxxxxxxxx, keepalived-announce@xxxxxxxxxxxxxxxxxxxxx
Subject:	Additional info on RedPaper REDP-3657-00
Cc:	lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Cc:	keepalived-devel@xxxxxxxxxxxxxxxxxxxxx
Cc:	LINUX-390@xxxxxxxxxxxxx
From:	Alexandre Cassen <alexandre.cassen@xxxxxxxxxx>
Date:	Thu, 28 Aug 2003 12:59:19 +0200 (CEST)

(sorry for the repost)

Hi All, Hi Vic,

While browsing through the Redbooks search index, I found that
Vic Cross has published a new Redpaper focused on Keepalived/VRRP:

IBM Form Number : REDP-3657-00

Title : "Linux on IBM zSeries and S/390: Virtual Router
         Redundancy Protocol on VM Guest LANs"

URL : 
http://publib-b.boulder.ibm.com/Redbooks.nsf/RedpaperAbstracts/redp3657.html?Open

Since no announcement has been sent to the Keepalived mailing list, I
am just popping up to let you know about it. The paper is very good,
with in-depth configurations and valuable information on several
case studies. So I just want to say here: good work, Vic, to you
and to everyone who helped write this paper.

I would just like to offer the community a few comments
and complements on some parts of the document.

1) Page.8/Dependencies/ETHTOOL_GLINK : Link media failure detection

    As a routing protocol, VRRP needs to take special care of the media
  link it runs on in order to detect low-level outages. This state
  is not present in the current RFC (we discussed it on the
  VRRP IETF WG mailing list, but no change has been made). But
  given coding reality, this state is needed. This is why the
  Keepalived/VRRP code introduces and uses an extended VRRP FSM
  with a state called "FAULT". This state simply reflects to
  the routing daemon the fact that the low level is no longer working,
  which forces a VRRP protocol state transition to this FAULT state
  until the low level is back in service.
    To sum up, the FSM present in the Keepalived/VRRP code is:

                          +---------------+
                          |               |
         +--------------->|     Fault     |<---------------+
         |                |               |                |
         |                +---------------+                |
         |                      |   ^                      |
         |                      v   |                      |
         |                +---------------+                |
         |    +---------->|               |<----------+    |
         |    |           |  Initialize   |           |    |
         |    |    +------|               |------+    |    |
         |    |    |      +---------------+      |    |    |
         |    |    |                             |    |    |
         |    |    V                             V    |    |
      +---------------+                       +---------------+
      |               |---------------------->|               |
      |    Master     |                       |    Backup     |
      |               |<----------------------|               |
      +---------------+                       +---------------+
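
  To make the diagram concrete, here is a minimal sketch of the extended
  FSM. It is not the Keepalived source: the names State, ALLOWED,
  can_transition and on_link_event are hypothetical, and the transition
  table is only my reading of the arrows drawn above.

```python
from enum import Enum


class State(Enum):
    INIT = "Initialize"
    BACKUP = "Backup"
    MASTER = "Master"
    FAULT = "Fault"


# Transitions as read from the diagram (an assumption, not taken from the code).
ALLOWED = {
    State.INIT: {State.MASTER, State.BACKUP, State.FAULT},
    State.MASTER: {State.BACKUP, State.INIT, State.FAULT},
    State.BACKUP: {State.MASTER, State.INIT, State.FAULT},
    State.FAULT: {State.INIT},
}


def can_transition(src, dst):
    """Return True if the diagram draws an arrow from src to dst."""
    return dst in ALLOWED[src]


def on_link_event(state, link_up):
    """Return the next state after a link media status report."""
    if not link_up:
        return State.FAULT   # low level is down: enter or stay in FAULT
    if state is State.FAULT:
        return State.INIT    # low level is back: restart from Initialize
    return state             # link is fine: the VRRP protocol decides
```

  For example, on_link_event(State.MASTER, False) gives State.FAULT, and
  the instance leaves FAULT only through Initialize once the link is back.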

    To detect link media failure, the Linux kernel provides 3 different
  channels to test for link activity, depending on how the NIC driver
  has been coded. To fully support these methods, the
  strategy used in Keepalived is a timer thread that checks the
  link media status of each NIC every second. To test it, it uses:
  o MII probe : test whether the MII registers are accessible through
    the SIOCGMIIPHY interface; if yes, probe the BMSR register
    (cf: Donald Becker's driver homepage for more info).
  o ETHTOOL_GLINK : if the MII probe fails, then test whether ethtool
    is supported. If yes, test for ETHTOOL_GLINK.
  o ioctl reflection : if both previous probes fail, then
    simply issue a SIOCGIFFLAGS to reflect IFF_RUNNING,
    IFF_UP, ... (because the kernel netlink code doesn't broadcast
    ifflags updates to userspace, or some drivers do and others
    don't, ...)
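
  The fallback order above can be sketched roughly as follows. This is
  an illustration, not the Keepalived code: link_is_up, probe_ifflags and
  flags_say_link_up are hypothetical names, the MII and ethtool probes
  are left as caller-supplied callbacks, and treating "IFF_UP and
  IFF_RUNNING both set" as link up is an assumption. The ioctl part is
  Linux only.

```python
import fcntl
import socket
import struct

SIOCGIFFLAGS = 0x8913  # Linux value from <linux/sockios.h>
IFF_UP = 0x1           # from <linux/if.h>
IFF_RUNNING = 0x40     # from <linux/if.h>


def flags_say_link_up(flags):
    """Assumption: the link counts as up when IFF_UP and IFF_RUNNING are both set."""
    return (flags & IFF_UP) != 0 and (flags & IFF_RUNNING) != 0


def probe_ifflags(ifname):
    """Last-resort probe: read the interface flags with SIOCGIFFLAGS."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        # struct ifreq: 16-byte name, then a union padded here to 24 bytes.
        ifreq = struct.pack("16s24x", ifname.encode())
        res = fcntl.ioctl(s.fileno(), SIOCGIFFLAGS, ifreq)
    (flags,) = struct.unpack_from("H", res, 16)
    return flags_say_link_up(flags)


def link_is_up(ifname, probes):
    """Try each probe in order.

    A probe returns True/False, or None (or raises OSError) when the
    driver does not support that method, in which case the next one is tried.
    """
    for probe in probes:
        try:
            result = probe(ifname)
        except OSError:
            result = None
        if result is not None:
            return result
    return None  # no method could tell
```

  A caller would pass something like [mii_probe, ethtool_glink_probe,
  probe_ifflags], where the first two stand for probes that are not
  shown here, and would run this once per second for each NIC.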

  The code here assumes that Jeff Garzik's ETHTOOL is part of the
  kernel and doesn't test for it during the autoconf stage. So, reading
  the Redpaper, it seems that the Linux kernel for zSeries doesn't
  support the GLINK ethtool API, which is why compilation fails...
  I have put it on my todo list to bypass the ETHTOOL_GLINK
  probe if it is not supported by the kernel.

  I can also read: "the keepalived code compiles cleanly and works
  well but experiences a startup delay problem". Please elaborate
  on this startup delay problem.


2) Page.24/VRRP experiences on Guest LANs/Virtual MAC address

  Since the Linux kernel doesn't support multiple MAC addresses per NIC
  (I mean as an API abstraction), the Keepalived/VRRP code doesn't support
  VRRP VMAC. We had a thread on the netdev list a while ago
  (cf: http://oss.sgi.com/projects/netdev/archive/netdev.2003-01,
  search for "Re: SIOCADDMULTI for unicast broken"). Jamal proposed
  a solution based on Traffic Eng at the ingress stage... Work needs
  to be done. BTW, Keepalived/VRRP uses a gratuitous ARP strategy
  during IP takeover, which is quite enough for most environments.
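
  For readers unfamiliar with gratuitous ARP, here is a minimal sketch
  of the frame a node could broadcast after taking over an IP. It is not
  the Keepalived code: build_gratuitous_arp is a hypothetical helper,
  and using an ARP request with a broadcast target hardware address is
  an assumption (other encodings of a gratuitous ARP exist).

```python
import socket
import struct

ETH_P_ARP = 0x0806   # EtherType for ARP
ETH_P_IP = 0x0800    # protocol type for IPv4
ARPHRD_ETHER = 1     # hardware type: Ethernet
ARPOP_REQUEST = 1    # ARP opcode: request


def build_gratuitous_arp(mac, ip):
    """Build an Ethernet frame carrying a gratuitous ARP.

    mac is the 6 raw bytes of the NIC taking over, ip a dotted-quad string.
    Sender and target protocol address are both the taken-over IP, which
    is what makes the ARP gratuitous.
    """
    if len(mac) != 6:
        raise ValueError("mac must be 6 bytes")
    broadcast = b"\xff" * 6
    ip_raw = socket.inet_aton(ip)
    eth_header = broadcast + mac + struct.pack("!H", ETH_P_ARP)
    arp = struct.pack("!HHBBH", ARPHRD_ETHER, ETH_P_IP, 6, 4, ARPOP_REQUEST)
    arp += mac + ip_raw         # sender hardware / protocol address
    arp += broadcast + ip_raw   # target hardware / protocol address
    return eth_header + arp
```

  Remote equipment that receives this frame replaces its ARP cache
  entry for the IP with the new MAC, which is why no VMAC is needed
  for the takeover itself.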

3) Page.28/Failure modes

=> "only a dropped packet or two between...state transition" :
   because there is no VMAC support, the gratuitous ARP strategy
   updates remote routing equipment with the new MAC, which just
   breaks established connections.

=> "Configuring an interface down" : if ifdown is performed, then
   the down state (IFF_UP cleared) is reflected to the Keepalived/VRRP
   code, which drives a state transition to the FAULT state. Afterwards,
   when ifup is called, IFF_UP is reflected and a transition to the
   init state occurs. After each fallback transition from FAULT to
   MASTER or BACKUP, the Keepalived/VRRP code creates a new socket and
   performs a new multicast ADD_MEMBERSHIP. If the mcast address is not
   registered, it looks like a bug in the zSeries Linux kernel, an IGMP
   issue ...??...
  
=> "Unplugging a simulated NIC" : a cable pull event is sent, then
   no traffic passes. "Linux still believes the interface is up":
   if this pull event is not reflected to the Linux kernel, and
   if the Linux kernel doesn't update ifflags or ethtool_glink,
   then this breaks VRRP coherence, as explained in that
   paragraph, because the link media failure is not detected by the
   kernel. And if the kernel doesn't detect it, it simply cannot
   reflect this event to any userspace routing daemon; we will have
   the same side effect.

=> "AH troubleshooting" : There was a meeting in SF during
   Q1-2003 and the VRRP IETF WG ruled on this part. They simply
   decided to remove IPSEC-AH auth from the RFC. In the near future
   this will no longer be part of the RFC. OTOH, since a lot of
   work has been done in Keepalived for this support, I am keeping it
   in the code. But I must warn that this strong auth should be
   considered working only for simple configurations, i.e. instances
   not part of a sync_group. I worked on adding sync_group support,
   but there are some scenarios that introduce latency during
   IPSEC sequence number synchronization (cf: draft on the Keepalived
   website).
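
   Coming back to the "Configuring an interface down" case: the socket
   re-creation and multicast re-join performed after leaving FAULT
   could be sketched as below. This is an illustration only, not the
   Keepalived code; membership_request and open_vrrp_socket are
   hypothetical names, and opening the raw socket needs root.

```python
import socket

VRRP_MCAST_GROUP = "224.0.0.18"  # VRRP multicast group address
IPPROTO_VRRP = 112               # VRRP IP protocol number


def membership_request(local_ip):
    """Pack a struct ip_mreq: group address, then local interface address."""
    return socket.inet_aton(VRRP_MCAST_GROUP) + socket.inet_aton(local_ip)


def open_vrrp_socket(local_ip):
    """Create a fresh raw socket and join the VRRP group on local_ip."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_RAW, IPPROTO_VRRP)
    try:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                        membership_request(local_ip))
    except OSError:
        sock.close()  # do not leak the descriptor if the join fails
        raise
    return sock
```

   If the join succeeds but the group still does not show up on the
   interface, the problem is below the daemon, which is what makes me
   suspect the kernel side.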


Best regards,
Alexandre
