(sorry for the repost)
Hi All, Hi Vic,
While browsing through the Redbooks search index I found that
Vic Cross has published a new redbook focused on Keepalived/VRRP:
IBM Form Number : REDP-3657-00
Title : "Linux on IBM zSeries and S/390: Virtual Router
Redundancy Protocol on VM Guest LANs"
URL :
http://publib-b.boulder.ibm.com/Redbooks.nsf/RedpaperAbstracts/redp3657.html?Open
Since no announcement has been sent to the Keepalived mailing
list, I am just popping up to let you know about it. This paper
is very good, with in-depth configurations and valuable
information on some case studies. So I just want to say here:
good work, Vic, to you and to the people who helped write it.
I would just like to give the community some comments and
complements on some parts of the document.
1) Page.8/Dependencies/ETHTOOL_GLINK : Link media failure detection
As a routing protocol, VRRP needs to take special care of the
media link it runs on in order to detect low-level outages. This
state is not present in the current RFC (we talked about it on
the VRRP IETF WG mailing list, but no change has been made). But
considering coding reality, this state is needed. This is why
the Keepalived/VRRP code introduces and uses an extended VRRP
FSM with a state called "FAULT". This state simply reflects to
the routing daemon the fact that the low level is no longer
working, which forces a VRRP protocol state transition to this
FAULT state until the low level is back in service.
To sum up, the FSM present in the Keepalived/VRRP code is:
                    +---------------+
                    |               |
   +--------------->|     Fault     |<---------------+
   |                |               |                |
   |                +---------------+                |
   |                    |       ^                    |
   |                    v       |                    |
   |                +---------------+                |
   |    +---------->|               |<----------+    |
   |    |           |  Initialize   |           |    |
   |    |    +------|               |------+    |    |
   |    |    |      +---------------+      |    |    |
   |    |    |                             |    |    |
   |    |    V                             V    |    |
+---------------+                       +---------------+
|               |---------------------->|               |
|    Master     |                       |    Backup     |
|               |<----------------------|               |
+---------------+                       +---------------+
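
To make the diagram easier to check, here is a minimal Python sketch of the same state machine. It is only an illustration: the transition table is my reading of the arrows in the diagram, and the names `State`, `TRANSITIONS` and `next_state` are hypothetical, not identifiers from the Keepalived source.

```python
from enum import Enum


class State(Enum):
    INIT = "Initialize"
    BACKUP = "Backup"
    MASTER = "Master"
    FAULT = "Fault"


# Allowed transitions as read off the diagram (an interpretation,
# not a copy of Keepalived's internal tables).
TRANSITIONS = {
    State.INIT: {State.MASTER, State.BACKUP, State.FAULT},
    State.MASTER: {State.BACKUP, State.INIT, State.FAULT},
    State.BACKUP: {State.MASTER, State.INIT, State.FAULT},
    State.FAULT: {State.INIT},
}


def next_state(current, wanted, link_up):
    """Pick the next state from the wanted state and the link status."""
    if not link_up:
        # A dead link overrides everything and parks the instance in FAULT.
        return State.FAULT
    if current is State.FAULT:
        # Leaving FAULT always goes back through Initialize first.
        return State.INIT
    if wanted in TRANSITIONS[current]:
        return wanted
    # Unknown or same-state request: stay where we are.
    return current
```

For example, `next_state(State.MASTER, State.MASTER, link_up=False)` yields `State.FAULT`, and once the link is back the instance passes through `State.INIT` before it can become MASTER or BACKUP again.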
To detect link media failure, the Linux kernel provides 3
different channels to test for link activity, depending on how
the NIC driver has been coded. To provide full support for those
methods, the strategy present in Keepalived is a timer thread
checking the link media status on each NIC every second. To test
it, it uses:
o MII probe : test whether the MII registers are accessible
through the SIOCGMIIPHY interface; if yes, probe the BMSR
register (cf: Donald Becker's driver homepage for more info).
o ETHTOOL_GLINK : if the MII probe fails, then test whether
ethtool is supported. If yes, test with ETHTOOL_GLINK.
o ioctl reflection : if both previous probes fail, then simply
register a SIOCGIFFLAGS to reflect IFF_RUNNING, IFF_UP, ...
(because the kernel netlink code doesn't broadcast ifflags
updates to userspace, or some drivers do and others don't, ...)
The code here assumes that Jeff Garzik's ETHTOOL is part of the
kernel and doesn't test for it during the autoconf stage. So,
reading the redpaper, it seems that the Linux kernel for zSeries
doesn't support the GLINK ethtool API, which is why compilation
fails... I have put this on my todo list: bypass the
ETHTOOL_GLINK probe if it is not supported by the kernel.
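
The probe order described above (MII first, then ETHTOOL_GLINK, then interface flags) can be sketched in Python as a simple fallback chain. This is an illustration under assumptions, not the Keepalived implementation: the three probe callables are hypothetical stand-ins for the real ioctl calls, and each one is assumed to return `None` when the driver does not support that channel. The constant values are the ones used by the Linux headers.

```python
IFF_UP = 0x1            # interface is administratively up
IFF_RUNNING = 0x40      # driver reports the interface as operational
BMSR_LSTATUS = 0x0004   # link-status bit of the MII BMSR register


def link_is_up(mii_probe, ethtool_probe, ifflags_probe):
    """Try each channel in order and report whether the link is up.

    mii_probe and ethtool_probe return None when unsupported
    (hypothetical convention); ifflags_probe always returns the flags.
    """
    bmsr = mii_probe()
    if bmsr is not None:
        return bool(bmsr & BMSR_LSTATUS)

    glink = ethtool_probe()
    if glink is not None:
        return bool(glink)

    # Last resort: both IFF_UP and IFF_RUNNING must be set.
    wanted = IFF_UP | IFF_RUNNING
    return (ifflags_probe() & wanted) == wanted
```

With this shape, a kernel without the GLINK API would simply have its `ethtool_probe` return `None`, and the check would fall through to the interface flags.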
Also I can read: "the keepalived code compiles cleanly and works
well but experiences a startup delay problem". Please elaborate
on this startup delay problem.
2) Page.24/VRRP experiences on Guest LANs/Virtual MAC address
Since the Linux kernel doesn't support multiple MACs per NIC (I
mean as an API abstraction), the Keepalived/VRRP code doesn't
support the VRRP VMAC. We had a thread on the netdev list some
time ago
(cf: http://oss.sgi.com/projects/netdev/archive/netdev.2003-01,
search for "Re: SIOCADDMULTI for unicast broken"). Jamal proposed
a solution based on Traffic Eng at the ingress stage... Work
needs to be done. BTW, Keepalived/VRRP uses a gratuitous ARP
strategy during IP takeover, which is quite enough for most
environments.
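
For readers not familiar with gratuitous ARP, here is a small Python sketch that builds such a frame. It only shows the packet layout; it is not the Keepalived code. The function name `gratuitous_arp` is hypothetical, and the choice of the ARP request opcode with a broadcast target MAC is an assumption (implementations differ on those two fields).

```python
import socket
import struct


def gratuitous_arp(mac: bytes, ip: str) -> bytes:
    """Build a broadcast Ethernet frame carrying a gratuitous ARP."""
    broadcast = b"\xff" * 6
    addr = socket.inet_aton(ip)
    # Ethernet header: destination, source, EtherType 0x0806 (ARP).
    eth_header = broadcast + mac + struct.pack("!H", 0x0806)
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,          # hardware type: Ethernet
        0x0800,     # protocol type: IPv4
        6, 4,       # hardware / protocol address lengths
        1,          # opcode: request (assumption, see above)
        mac, addr,  # sender MAC and IP: the new owner of the address
        broadcast,  # target MAC (assumption, see above)
        addr,       # target IP == sender IP is what makes it gratuitous
    )
    return eth_header + arp
```

The resulting 42-byte frame announces "this IP now lives at this MAC", so neighbours refresh their ARP caches after a takeover.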
3) Page.28/Failure modes
=> "only a dropped packet or two between...state transition" :
because there is no VMAC support, the gratuitous ARP strategy
updates the remote routing equipment with the new MAC, which
just interrupts connections that are already established.
=> "Configuring an interface down" : if ifdown is performed, the
down state (IFF_UP cleared) is reflected to the Keepalived/VRRP
code, which drives a state transition to the FAULT state.
Afterwards, when ifup is called, IFF_UP is reflected and there
is a transition to the init state. After each fallback
transition from FAULT to MASTER or BACKUP, the Keepalived/VRRP
code creates a new socket and performs a new multicast
ADD_MEMBERSHIP. If the mcast address is not registered, it
looks like a bug in the zSeries Linux kernel, an IGMP
issue ...??...
=> "Unplugging a simulated NIC" : a cable pull event is sent,
then no traffic passes. "Linux still believes the interface is
up": if this pull event is not reflected to the Linux kernel,
and if the Linux kernel doesn't update ifflags or ethtool_glink,
then this breaks VRRP coherence, as explained in this paragraph,
since the link media failure is not detected by the kernel. And
if the kernel doesn't detect it, it simply cannot reflect this
event to any userspace routing daemon, so the same side effect
will show up with any of them.
=> "AH troubleshooting" : There was a meeting in SF during
Q1-2003 and the VRRP IETF WG ruled on this part. They simply
decided to remove IPSEC-AH auth from the RFC. In the near future
this will no longer be part of the RFC. OTOH, since a lot of
work has been done in Keepalived for this support, I am keeping
it in the code. But I must warn that this strong auth must be
considered working only for simple configurations, i.e.
instances that are not part of a sync_group. I worked on adding
support for sync_group, but there are some scenarios that
introduce latency during IPSEC sequence number synchronization
(cf: draft on the keepalived website).
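
Coming back to the "Configuring an interface down" point: the "new socket plus new ADD_MEMBERSHIP" step can be sketched in Python as below. This is a rough illustration, not the Keepalived code. The function names are hypothetical; the group address 224.0.0.18 and IP protocol number 112 are the values assigned to VRRP. Opening the raw socket needs root.

```python
import socket

VRRP_GROUP = "224.0.0.18"   # multicast address assigned to VRRP
VRRP_PROTO = 112            # IP protocol number assigned to VRRP


def membership_request(local_ip: str) -> bytes:
    """Pack an ip_mreq: group address, then local interface address."""
    return socket.inet_aton(VRRP_GROUP) + socket.inet_aton(local_ip)


def rejoin(local_ip: str) -> socket.socket:
    """Open a fresh raw socket and join the VRRP group (needs root)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_RAW, VRRP_PROTO)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(local_ip))
    return sock
```

If the group does not show up as registered on the interface after a call like `rejoin("10.0.0.1")`, that would point at the kernel or driver side rather than at the daemon.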
Best regards,
Alexandre