(CC'ing my coworker)
On Thu, 23 Feb 2006, Joseph Mack NA3T wrote:
> >1) It seems like Linux does not ARP as often as I expect it to.
> by watching tcpdump, my machines send arp requests about
> every 90sec. TCP/IP illustrated (Stevens), v1, section 4.5
> says that Berkeley derived implementations (which Linux used
> to be) have timeouts of 20mins. I guess it's implementation
I was under the same impression, which is the reason why I was so
puzzled initially. But I've learned a bit more about the
situation (see below).
> >I thought no matter how much communication happens between
> >two hosts, there usually is an ARP request (who-has) every
> >five minutes. But in the case of keepalived, which checks
> >the Realservers, I see no ARP requests at all, just normal
> >communication. If I flush the arp cache for an IP, there's
> >one ARP request (and reply), then no more.
> presumably it gets replies? (and then is happy)
It doesn't, but things work out differently than I initially
thought (again, see below).
> >Does normal TCP communication "refresh" the ARP table
> no it's a separate layer.
That's what I expected.
> >There are no hardcoded ARP entries in my setup.
> >The logical network setup is like this:
> >[[RSs]] <-------- [LB] <-------- [client]
> > | VLAN A VLAN B ^
> > | |
> > `--------------------------------'
> >I see no ARP requests on VLAN A. I.e. the OS on LB learns the MAC
> >address of the RSs once, then keeps that knowledge forever.
> >I realize that this *probably* isn't an IPVS problem in and of
> >itself, but haven't seen it anywhere else.
> I assume you're asking if what you see is OK. It looks OK to
Trouble is: it leads to unicast floods in our switch/router
Let me explain our setup in more detail.
LB has one administrative IP on VLAN B (say: 126.96.36.199). Our
routers are set up to forward all traffic for the service IP(s) to
the LB's interface on that VLAN. On the interface with VLAN A, the
LB has an IP of 192.168.0.1. The realservers share no IP net with
the LB, i.e. they have non-RFC1918-IPs. Keepalived is configured
to do its checks with the IP from VLAN B (188.8.131.52) but send them
via the interface in VLAN A. The Realservers answer these packets
via their default gateway (not via the LB's interface in VLAN A).
As we deactivated rp_filter, this works out okay. This has the
huge advantage of not wasting an IP for every Farm/RS-VLAN the LB
is responsible for. Also, we would have to renumber about 330
machines as their IP nets have no room for more IPs (initially we
used a different kind of load balancing and now we have to live
with this limitation). This is due to the fact that we'd like to
be able to mix and redistribute farms over our ten LBs as easily
as possible. Thus, we'd need ten IPs in every RS-Net for the LBs.
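
Since this setup only works with rp_filter switched off, a quick sanity
check of the current values could look like the sketch below. It assumes
a Linux box that exposes the usual
/proc/sys/net/ipv4/conf/<interface>/rp_filter files; the function name
read_rp_filter is made up for illustration.

```python
from pathlib import Path
from typing import Dict


def read_rp_filter(conf_dir: str = "/proc/sys/net/ipv4/conf") -> Dict[str, int]:
    """Map each interface name to its rp_filter value (0 = no source validation)."""
    result: Dict[str, int] = {}
    for entry in sorted(Path(conf_dir).iterdir()):
        value_file = entry / "rp_filter"
        # Skip anything that does not carry an rp_filter setting.
        if value_file.is_file():
            result[entry.name] = int(value_file.read_text().strip())
    return result
```

Calling read_rp_filter() on the LB and looking for non-zero entries is
enough to spot an interface where the setting was forgotten.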
The disadvantage is of course that the RSs never ARP for the LB's
MAC because they reach it via a router (their default gateway).
This way, the switch the RS is connected to never learns the MAC
for the LB's interface in VLAN A.
If we can't change it, we'll have to either set up an arping
solution (very hackish), waste a lot of IP space, or we won't be
able to use asymmetric load balancing (as badly diddled above).
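
For what it's worth, the arping workaround could be sketched roughly like
this in Python. It builds a gratuitous ARP request (sender and target IP
are the same address), so that switches on the segment see a frame with
the LB's source MAC. The function names, the MAC, the IP and the
interface name are all placeholders, and actually sending the frame
needs root on Linux (AF_PACKET).

```python
import socket
import struct

ETH_P_ARP = 0x0806
BROADCAST = b"\xff" * 6


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' to 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_gratuitous_arp(mac: str, ip: str) -> bytes:
    """Return an Ethernet frame carrying a gratuitous ARP request.

    Sender and target IP are both `ip`, so no reply is expected; the
    point is only that switches see `mac` as a source address.
    """
    src = mac_to_bytes(mac)
    addr = socket.inet_aton(ip)
    eth_header = BROADCAST + src + struct.pack("!H", ETH_P_ARP)
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,            # hardware type: Ethernet
        0x0800,       # protocol type: IPv4
        6,            # hardware address length
        4,            # protocol address length
        1,            # opcode: request
        src, addr,    # sender MAC / sender IP
        b"\x00" * 6,  # target MAC (unknown)
        addr,         # target IP, same as sender IP
    )
    return eth_header + arp


def send_frame(frame: bytes, ifname: str) -> None:
    """Send a raw Ethernet frame on `ifname` (Linux only, needs root)."""
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW) as s:
        s.bind((ifname, 0))
        s.send(frame)


# Example with placeholder values, to be run periodically (e.g. from cron):
#   send_frame(build_gratuitous_arp("02:00:00:00:00:01", "192.0.2.1"), "eth1")
```

It would have to run more often than the switches' MAC aging time to be
of any use, which is exactly why I consider it hackish.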
If anyone has a better solution (no IP wasting and no unicast
storms), I'd be glad to hear it. And no, static ARP entries on the
routers are *not* a solution.
> >2) When I last tested IPVS on 2.6, I sometimes saw "stuck"
> >connections. The LB had connections in its counters (and probably
> >in the IPVS conntrack table, too) that expired ages ago. Even
> >days or weeks after the last packet for such a farm was sent to
> >the LB, they were still there. Sometimes, even deleting the
> >entire farm and adding it again didn't help. While the farm is in
> >use, the number of those connections steadily increases. Is this
> >problem known (and/or fixed)?
> are you using persistence? If so, this can be part of the
> behaviour - look at the lengthy section in the HOWTO on
> persistence and how to kill connections.
We use no persistence on these machines at all (we have a
separate load balancer setup for farms that need it).
And if it were persistence, the counters should drop to zero if I
completely remove a farm and re-create it, right? They don't.
As I see it, those connections should be reaped either after the
balancer sees half of the TCP teardown (it can't see the whole
teardown) _or_ expire after an amount of time.
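
To make that expectation concrete, here is a toy model in Python of the
reaping behaviour I have in mind. It is not how IPVS is implemented; the
class name, the state names and the timeout values are made up for
illustration.

```python
import time
from typing import Dict, Optional, Tuple

# Made-up timeouts in seconds; these are not IPVS defaults.
TIMEOUTS = {"ESTABLISHED": 900.0, "FIN_SEEN": 120.0}

# client IP, client port, virtual IP, virtual port
ConnKey = Tuple[str, int, str, int]


class ConnTable:
    """Toy connection table in which every entry carries an expiry deadline."""

    def __init__(self) -> None:
        self._entries: Dict[ConnKey, Tuple[str, float]] = {}

    def packet(self, key: ConnKey, fin: bool = False,
               now: Optional[float] = None) -> None:
        """Record a client packet; a FIN moves the entry to the short timeout."""
        now = time.monotonic() if now is None else now
        old = self._entries.get(key)
        # Once half of the teardown was seen, the entry stays in FIN_SEEN.
        if fin or (old is not None and old[0] == "FIN_SEEN"):
            state = "FIN_SEEN"
        else:
            state = "ESTABLISHED"
        self._entries[key] = (state, now + TIMEOUTS[state])

    def reap(self, now: Optional[float] = None) -> int:
        """Drop entries whose deadline has passed; return how many were dropped."""
        now = time.monotonic() if now is None else now
        expired = [k for k, (_, deadline) in self._entries.items()
                   if deadline <= now]
        for k in expired:
            del self._entries[k]
        return len(expired)

    def __len__(self) -> int:
        return len(self._entries)
```

In this model no entry can outlive its timeout, with or without a FIN,
which is what I would expect the counters on the LB to reflect.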