Re: IPVS Health Checking Best Practices

To: Alex Gartrell <agartrell@xxxxxx>, lvs-devel@xxxxxxxxxxxxxxx
Subject: Re: IPVS Health Checking Best Practices
Cc: dsp@xxxxxx, kernel-team <Kernel-team@xxxxxx>, ps@xxxxxx
From: Alexey Andriyanov <alan@xxxxxxxxxx>
Date: Fri, 19 Sep 2014 08:31:05 +0400

Hi, Alex.

We use tunnel checks to the host itself. The encapsulated packet looks like 
CHK_SRC -> RS1 [ (IPIP) CHK_SRC -> VIP1 ].

This can be done without maintaining iptables rules or tunnel interfaces.
You simply set the fwmark corresponding to the target RS on the checker
socket via SO_MARK. Then direct all marked packets in OUTPUT to an NFQUEUE
and encapsulate each packet in user space, selecting the tunnel endpoint
based on the fwmark.
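
A minimal sketch of the two user-space pieces, assuming Python on Linux. It
is not the tool mentioned below; the FWMARK_TO_RS table, the function names
and the addresses are hypothetical and only illustrate the idea. Reading
packets from the NFQUEUE and re-sending the encapsulated result (for example
through a raw socket) is left out, since it needs a queue binding library and
root privileges.

```python
import socket
import struct

# Hypothetical table: fwmark -> real server (tunnel endpoint) address.
FWMARK_TO_RS = {1: "192.0.2.11", 2: "192.0.2.12"}

# socket.SO_MARK is only defined on Linux; 36 is its value there.
SO_MARK = getattr(socket, "SO_MARK", 36)

# One possible rule to queue marked packets (an assumption, adjust as needed):
#   iptables -t mangle -A OUTPUT -m mark ! --mark 0 -j NFQUEUE --queue-num 0


def make_checker_socket(fwmark):
    """Return a TCP socket whose packets carry fwmark (needs CAP_NET_ADMIN)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, SO_MARK, fwmark)
    return s


def ip_checksum(header):
    """Standard 16-bit one's-complement checksum over an IPv4 header."""
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def encapsulate_ipip(inner_packet, fwmark):
    """Wrap an IPv4 packet in an outer IPv4 header with protocol 4 (IPIP).

    The outer source is copied from the inner packet (CHK_SRC); the outer
    destination is the real server selected by fwmark.
    """
    src = inner_packet[12:16]
    dst = socket.inet_aton(FWMARK_TO_RS[fwmark])
    total_len = 20 + len(inner_packet)
    # version/IHL, TOS, total length, id, flags/frag, TTL, proto, csum, src, dst
    header = struct.pack("!BBHHHBBH4s4s",
                         0x45, 0, total_len, 0, 0, 64, 4, 0, src, dst)
    csum = ip_checksum(header)
    header = header[:10] + struct.pack("!H", csum) + header[12:]
    return header + inner_packet
```

The checker would call make_checker_socket(mark) and connect to the VIP as
usual; the queue handler would pass each queued packet and its mark to
encapsulate_ipip() and emit the result instead of the original packet.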

We use keepalived + this little tool for that:

19.09.2014 01:26, Alex Gartrell wrote:
> Hello All,
> Today, we run IPVS on a number of hosts.  Each of these hosts has a python 
> process responsible for ensuring the health of pool members and then updating 
> their weights as necessary.
> We do these health checks via IPVS for two reasons:
> 1) Different VIPs have different listeners on our real servers, so we can't 
> just use the regular host address
> 2) We want to ensure that decapsulation is happening appropriately.
> The way we do this today is a giant hack.  We have a scheduler that we've not 
> (yet) open sourced that does consistent hashing, and someone just wired in a 
> couple additional sysctls that will allow you to do the following:
> If a request is from $MAGIC_IP and the source port is >= $MAGIC_PORT, then 
> send it to pool->members[($SRC_PORT - $MAGIC_PORT) % $N].
> I'd like to solve this problem more generally.
> The other solution I've heard of is using fwmarks, but that kind of sucks 
> from a configuration perspective (because you have to add in all of the 
> persistent vips and everything).
> Here are some other ideas:
> 1) Map the socket itself to a particular pool with a netlink invocation or 
> something
> 2) Provide a way to bind specific src addr, port tuples to specific 
> destination (though this is a bummer because you have to reserve port space)
> But I'm completely open to ideas and I think we're willing to do the work to 
> make this happen.
> Thanks,
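
For reference, the source-port rule from the quoted message can be sketched
as below. This only illustrates the arithmetic in user space; the function
names are hypothetical and nothing here reflects the actual scheduler code.

```python
def pick_member(members, src_port, magic_port):
    """Select members[(src_port - magic_port) % N], as in the quoted rule."""
    if src_port < magic_port:
        raise ValueError("source port is below the magic port")
    return members[(src_port - magic_port) % len(members)]


def source_port_for(index, magic_port):
    """Smallest source port that the rule maps to members[index]."""
    return magic_port + index
```

A checker that wants to reach members[i] binds its socket to
source_port_for(i, magic_port) before connecting, which is why this approach
has to reserve a range of source ports.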

Best regards,