Re: VRRP and the kernel

To: Alexandre CASSEN <alexandre.cassen@xxxxxxxxxxxxxx>
Subject: Re: VRRP and the kernel
Cc: <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
From: Julian Anastasov <ja@xxxxxx>
Date: Fri, 23 Nov 2001 13:53:53 +0200 (EET)

On Fri, 23 Nov 2001, Alexandre CASSEN wrote:

> The problem: the Linux kernel doesn't allow several MACs on a single
> NIC (only one at a time).
> For the moment we have a few ways to solve this problem:
> 1. Do not handle the VMAC; just send gratuitous ARP during VRRP VIP takeover

        So, it is not recommended for use because we don't reply to
valid ARP probes with our VMAC?

> 2. Use a userspace ARP daemon to reply to ARP requests => At this point we
> think that userspace is probably not the right place to handle this
> issue...
> 3. Apply a kernel-space patch to reply to ARP requests for the VRRP VIP => so
> we will be able to handle as many VMACs as VRRP Virtual Routers => That way
> we will be RFC compliant.
> 4....

        Is VRRP really working with LVS? I ask because I see that LVS
works only with PACKET_HOST packets. What I understand from the kernel
sources is that by using VRRP we will receive packets to the VIP marked
as skb->pkt_type = PACKET_MULTICAST (due to the VRRP VMAC prefix) but
with rt_type = RTN_LOCAL (the VIP is local). I don't see how LVS will
forward PACKET_MULTICAST packets. Can you comment? Maybe it is because
we still don't ARP reply with the VMAC? Maybe we have to explicitly allow
LVS to forward PACKET_MULTICAST but not to forward MULTICAST(VIP)?

        As for ARP replying with the VRRP MAC, if the above is true, then
we can add one check in arp.c:arp_rcv for PACKET_MULTICAST, but then we
have to decide how to keep the VIP->VRRP_MAC association for each input
device and how to look up the VMAC in arp_rcv. The table can be global
because many devices attached to the same hub may receive the remote ARP
probe. With some rp_filter/arp_filter protection we will avoid multiple
identical ARP replies "VIP1 is-at VMAC1", but this is a different issue.
I prefer that we keep the VIP->VMAC table device independent; otherwise we
have to duplicate the VIP->VMAC list for each device, because we can
receive the ARP probe "who-has VIP1 tell UNIQUE_IP" on many devices and we
have to send the same reply through all of them. This is different from
the normal behavior for PACKET_HOST IPs, where we always send the ARP
replies with the device's MAC. But can replying with the same VMAC
through all devices cause problems with the switch?
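
        To make the device-independent table concrete, here is a minimal
userspace sketch in Python, not kernel code. The VMAC format
00-00-5E-00-01-{VRID} is the one VRRP defines for IPv4; everything else
(the names VIP_TO_VMAC and arp_reply_mac, the example addresses and device
names) is hypothetical and only illustrates the lookup order: a VIP gets
the same VMAC on every input device, any other local IP gets the MAC of
the device that received the probe.

```python
def vrrp_vmac(vrid: int) -> str:
    """Virtual router MAC for IPv4 VRRP: 00-00-5E-00-01-{VRID}."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be in 1..255")
    return "00:00:5e:00:01:%02x" % vrid


# Hypothetical global table: VIP -> VMAC, shared by all input devices.
VIP_TO_VMAC = {
    "192.168.200.10": vrrp_vmac(1),
    "192.168.200.11": vrrp_vmac(2),
}


def arp_reply_mac(target_ip: str, in_dev: str, dev_mac: dict) -> str:
    """Pick the MAC to advertise in an ARP reply for target_ip.

    VIPs are answered with their VMAC regardless of in_dev; all other
    addresses fall back to the MAC of the receiving device.
    """
    vmac = VIP_TO_VMAC.get(target_ip)
    if vmac is not None:
        return vmac
    return dev_mac[in_dev]


# Example devices (assumed values, for illustration only).
DEV_MAC = {"eth0": "00:10:4b:aa:bb:01", "eth1": "00:10:4b:aa:bb:02"}
```

With this layout a probe for 192.168.200.10 seen on eth0 and on eth1
yields the same answer, which is exactly the case where the switch
question above becomes relevant.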

        If we reply through each device with a different VMAC for the
same IP then we can end up with an "ARP race": the remote hosts will
see each VIP change its MAC frequently, because all these devices
in our VRRP router are attached to the same hub. That is no different
from the current behavior. So it seems each device needs its own VMAC
and then a global VIP->VMAC table, or maybe this is not possible.
What is the current state in keepalived? I remember something about
different instances per device, but how is that related to ARP?
Do we know with what MAC we should send our ARP reply, considering
the requested IP and the input device where the ARP probe was
received?

> >> Yes ARP replies... In fact currently I use simple gratuitous ARP to
> >> update remote caches... but during IP takeover using this technique I
> >> have a TTL expiration... Do you think that gratuitous ARP using the
> >> real NIC and
> >
> >    Which TTL expires?
> I am probing the current VRRP implementation during IP takeover by leaving
> a ping running against a VRRP VIP (from a third-party workstation). When
> the takeover occurs, I get a TTL expiration (I do not really understand
> why...)... no packets are lost but IP takeover introduces this strange TTL
> expiration (probably due to the gratuitous ARP updating the cache, or my
> switch, ...). If I use a VMAC (one at a time), no TTL expiration... because
> the MAC address stays the same across the VIP takeover.

        I assume the case is that you receive ICMP_TIME_EXCEEDED with
ICMP_EXC_TTL from some host? Then it seems nobody wants to accept this
packet locally and it loops between two routers? Is that the case
considering your routing topology, i.e. the TTL reaches 1 and one of
the routers replies to you with ICMP?
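
        A toy model of that hypothesis, as a hedged Python sketch: two
routers each believe the other one owns the VIP, so the packet bounces
until its TTL runs out. The router names and the next_hop map are made
up for illustration and say nothing about your real topology.

```python
def forward_until_expired(ttl: int, next_hop: dict, first_router: str):
    """Bounce a packet along next_hop until a router sees TTL <= 1.

    Returns (router_that_sends_time_exceeded, number_of_forwards).
    """
    router = first_router
    hops = 0
    while True:
        if ttl <= 1:
            # This router cannot forward any further; it would answer
            # the sender with ICMP_TIME_EXCEEDED / ICMP_EXC_TTL.
            return router, hops
        ttl -= 1                   # each forwarding step costs one TTL
        hops += 1
        router = next_hop[router]  # hand the packet to the peer


# During takeover each router (hypothetically) points at the other.
LOOP = {"r1": "r2", "r2": "r1"}
expired_at, forwards = forward_until_expired(64, LOOP, "r1")
```

If the ICMP error comes from one of the two VRRP routers, that would
support the loop explanation.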

> => Gratuitous ARPs are not really needed during takeover... only if we are
> using a switch... we need to update the switch CAM table (VMAC1 moves from
> switch port1 to switch port2, for example).

        Agreed. But maybe it is useful to refresh the expiration timers
for the remote hosts' ARP entries (I don't know how long the failover
takes; maybe they will mark the VIP as stale, which can be bad
for setups with passive dead gateway detection).
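
        For reference, a gratuitous ARP is just an ARP request in which
the sender IP and the target IP are both the VIP, sent to the Ethernet
broadcast address. The Python sketch below only builds the 42-byte frame
and does not send it; the function name and the example addresses are
assumptions, and filling the target hardware address with zeros is one
common choice, not the only one.

```python
import socket
import struct


def gratuitous_arp(mac: str, vip: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request."""
    hw = bytes.fromhex(mac.replace(":", ""))
    ip = socket.inet_aton(vip)
    # Ethernet header: broadcast destination, our source, EtherType ARP.
    eth = b"\xff" * 6 + hw + struct.pack("!H", 0x0806)
    # ARP header: Ethernet (1), IPv4 (0x0800), hlen 6, plen 4, request (1).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    # sender MAC, sender IP = VIP, target MAC (zeros), target IP = VIP
    arp += hw + ip + b"\x00" * 6 + ip
    return eth + arp


frame = gratuitous_arp("00:00:5e:00:01:01", "192.168.200.10")
```

Because the source MAC of the frame is the VMAC, the same frame also
moves the switch CAM entry to the new port.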

> Best regards,
> Alexandre


Julian Anastasov <ja@xxxxxx>
