Re: keepalived (was Re: News contrib to LVS)

To: ja@xxxxxx
Subject: Re: keepalived (was Re: News contrib to LVS)
Cc: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
From: Alexandre Cassen <Alexandre.Cassen@xxxxxxxxxx>
Date: Tue, 26 Dec 2000 23:11:36 +0100

> the project homepage is :

        Some thoughts on this topic:

1. If the checks are moved from the director to the real servers you
can relax the CPU activity in the director. If this does not sound like
a problem for small clusters, consider a setup with many virtual services
and many real servers. The director is going to waste CPU cycles
only for checks. Not fatal for LVS (the network always has CPU cycles)
but for the checks.

Sure, but in my mind people who run a big LVS infrastructure can run the whole solution on a director with an appropriate CPU. Big director machines are cheap today. It is true, though, that the checks can weaken network performance (multiple tests such as SSL checks also eat CPU...). So we can imagine a solution where the director is a cluster of two servers:
1. One server for virtual server management, using the ipvs kernel module.
2. A second server that performs the keepalived checks. This server communicates via a socket with the first one to add or remove real servers from the pool.

=> In this solution we need to implement a communication component that listens on the ipvs director.
=> We can also imagine that when the ipvs director breaks down, a daemon like heartbeat moves the ipvs director functionality to the keepalived server.
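
To make the two-server idea a bit more concrete, here is a minimal sketch of what the checker could send over the socket and how the director side could translate it. Nothing here is keepalived code: the line format "ADD|DEL VIP:VPORT RIP:RPORT", the function names and the addresses are assumptions for the illustration, and the sketch only builds the ipvsadm command line (TCP service, default forwarding method) without running it.

```python
import socket

# Hypothetical wire format, one request per line:
#   "<ADD|DEL> <VIP:VPORT> <RIP:RPORT>\n"
IPVSADM_FLAG = {"ADD": "-a", "DEL": "-d"}  # ipvsadm: add / delete a real server


def encode_request(verb, virtual, real):
    """Checker side: build the request line sent to the ipvs director."""
    return f"{verb} {virtual} {real}\n".encode("ascii")


def to_ipvsadm(line):
    """Director side: turn one request line into an ipvsadm argument list."""
    verb, virtual, real = line.decode("ascii").split()
    if verb not in IPVSADM_FLAG:
        raise ValueError(f"unknown request: {verb}")
    # "-t" assumes a TCP virtual service; "-r" names the real server.
    return ["ipvsadm", IPVSADM_FLAG[verb], "-t", virtual, "-r", real]


def demo():
    # A connected socket pair stands in for the checker <-> director link.
    checker, director = socket.socketpair()
    with checker, director:
        checker.sendall(encode_request("DEL", "10.0.0.1:80", "192.168.0.2:80"))
        print(to_ipvsadm(director.recv(1024)))


if __name__ == "__main__":
    demo()
```

On a real director the returned list would be handed to something like subprocess.run(), or replaced by direct calls into the ipvs kernel interface.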

I am using Arrowpoint load balancers at work (CS50), and they perform checks like this on each load balancer. For administrators, I think it is a good design to locate the keepalived functionality there. If the CPU is not strong enough, we can also create, using LVS, a virtual server backed by a cluster of keepalived servers. This can be a good
design too, I think.

2. Don't copy the same work from the hardware solutions, most of them
can't run agents in the real servers and implement checks in different
director to set the weight based on expression from these parameters:
one expression for FTP, another for HTTP, etc

What you describe here is the way BMC BEST/1, PATROL and other monitoring platforms work. For me, adding an agent on each server multiplies the administration tasks and introduces security vulnerabilities (I am probably mistaken... :) ).

If we do not want to depend on the platform the real server services run on, we need to centralize the checks on the load balancer or on a single check point. A monitoring environment based on collector/monitoring console pairs is extremely OS dependent. In a very early release of keepalived I used monitoring agents based on a simple protocol frame to communicate with a centralized monitoring tool. But my environment is really heterogeneous (Oracle OAS, IIS, Netscape, Apache in the same real server pool), so to factor out and limit the OS-dependent development I implemented a design centralized on a single point, using network scanning techniques to perform the checks.

3. Of course, there are other ways to set the weights - when they
are evaluated in the director. This can include decisions based on
response times (from the L7/4/3 checks), etc. Not sure how well they
are working. I've never implemented such risky tricks.

Yes !!! :) Response time and the ability to check application performance is a great and VERY interesting functionality that we can add to such a daemon. We can use a dynamic structure registering statistics about each server's response time... if the response time gets worse or changes, we can adjust the cluster and make it fully dynamic with respect to application performance. We can then define a "weighted performance" variable like the LVS weight. We can also use the fair queueing functionality present in the advanced routing code to adjust IP streams using kernel calls to the QoS framework.... really a good thing to do here :)
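
As a rough illustration of such a "weighted performance" variable (a sketch only, not something keepalived does): keep a smoothed response time per real server and scale weights so the fastest server gets the maximum. The class name, the smoothing factor 0.3 and the 1..100 weight scale are all assumptions made for the example.

```python
class ResponseTimeWeights:
    """Track smoothed response times and derive LVS-style integer weights."""

    def __init__(self, alpha=0.3, max_weight=100):
        self.alpha = alpha            # smoothing factor (assumed value)
        self.max_weight = max_weight  # weight given to the fastest server
        self.average = {}             # server -> smoothed response time (s)

    def record(self, server, seconds):
        """Fold one measured response time into the moving average."""
        previous = self.average.get(server)
        if previous is None:
            self.average[server] = seconds
        else:
            self.average[server] = (
                self.alpha * seconds + (1 - self.alpha) * previous
            )

    def weights(self):
        """Return server -> weight, inversely proportional to response time."""
        if not self.average:
            return {}
        floor = 1e-6  # avoid dividing by zero on a 0.0 s measurement
        fastest = max(min(self.average.values()), floor)
        return {
            server: max(1, round(self.max_weight * (fastest / max(avg, floor))))
            for server, avg in self.average.items()
        }


if __name__ == "__main__":
    stats = ResponseTimeWeights()
    stats.record("192.168.0.2", 0.1)
    stats.record("192.168.0.3", 0.2)
    print(stats.weights())
```

The smoothing keeps one slow answer from swinging the weight too hard, which is one way to make this kind of tuning less risky.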

4. User defined checks: talk with the real service and analyze these

A macro language definition... a small language to define checks, using hardcoded primitives (tcpcheck, httpget, ...) and defining actions on the results...
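
A very small sketch of how such a language could sit on top of hardcoded primitives. Only the primitive name tcpcheck comes from the idea above; the rule syntax "tcpcheck HOST PORT", the run_rule helper and the "keep"/"remove" actions are hypothetical.

```python
import socket


def tcpcheck(host, port, timeout=2.0):
    """Primitive: succeed if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Table of hardcoded primitives; others (httpget, ...) would be added here.
PRIMITIVES = {"tcpcheck": tcpcheck}


def run_rule(rule):
    """Evaluate one rule like 'tcpcheck 192.168.0.2 80'.

    Returns the action to apply to the real server: "keep" or "remove".
    """
    name, host, port = rule.split()
    ok = PRIMITIVES[name](host, int(port))
    return "keep" if ok else "remove"


if __name__ == "__main__":
    print(run_rule("tcpcheck 127.0.0.1 80"))
```

Keeping the primitives in a table means the rule parser never has to change when a new check type is added.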

5. NAT is not the only used method. The DR and TUN methods don't allow
the director's checks properly to check the real services: the real
service listens to the same VIP and it is hard to generate packets
in the director with daddr=VIP that will avoid the routing and will
reach the real server. They don't leave the director. What this means:
we can't check exactly the VIP:VPORT in the real service, maybe only
RIP:VPORT? This problem does not exist when the checks are performed
from the real service, for example the L4 check can be simple bind()
to VIP:VPORT. Port busy means L4 succeeds. No problems to perform
L7 checks. Sometimes httpd can listen to many virtual domains with
bind to Why do we need to perform checks for all these VIPs
when we can simply check one of them? Many, many optimizations, User
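
The bind()-based L4 check described in this point can be sketched as follows. This is an illustration only: the function name is made up, and it assumes the check runs on the real server itself, where the VIP is configured locally.

```python
import errno
import socket


def l4_check_by_bind(ip, port):
    """Return True if something already holds ip:port (L4 check succeeds).

    The check tries to bind() to the service address. "Address already in
    use" means the real service has the port, so the service is considered
    up. A successful bind means nothing is listening there.
    """
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        probe.bind((ip, port))
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return True
        raise  # e.g. the address is not configured on this host
    finally:
        probe.close()
    return False


if __name__ == "__main__":
    # 127.0.0.1:80 stands in for VIP:VPORT in this example; ports below
    # 1024 need root when nothing is listening on them.
    print(l4_check_by_bind("127.0.0.1", 8080))
```

No packet is sent at all, so the check works the same way for NAT, DR and TUN.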

6. Some utilities or ioctls can be included in the game: ip,
ipchains or the ioctls they use. This allows complex virtual services
to be created and to support the fwmark-based virtual services.

Yes, it is in my focus: adding wrappers for multiple kernel functionalities... for fwmark, QoS, ...

7. Redundancy: keepalive probes to other directors, failover times,
takeover decisions

        This can be a very very long discussion :)

Of course yes !!! :))) I think there are many interesting things .... do not give me a point 8, otherwise I will not stop coding !!!! :))


