On Sat, 23 Dec 2000, Alexandre Cassen wrote:
> I have just published a little contribution to LVS, after negotiation with my
> employer. The solution described on the contrib homepage is used in a
> production environment.
> The project is named: Keepalived
> The main goal of the keepalived project is to add a strong & robust
> keepalive facility to the Linux Virtual Server project. This project is
> similar to the MON project, but it is written in C with multilayer TCP/IP
> stack checks. Keepalived implements a framework based on three check
> families: Layer3, Layer4 & Layer5. This framework gives the daemon the
> ability to check the state of an LVS server pool. When one of the servers
> of the LVS server pool is down, keepalived informs the Linux kernel via a
> setsockopt call to remove this server entry from the LVS topology.
> The project homepage is: http://keepalived.sourceforge.net
> Hope it will help,
Some thoughts on this topic:
1. If the checks are moved from the director to the real servers you
can relax the CPU activity in the director. If this does not sound like
a problem for small clusters, consider a setup with many virtual services
and many real servers: the director is going to waste CPU cycles
only on checks. Not fatal for LVS (the packet forwarding always gets its
CPU cycles) but fatal for the checks.
2. Don't copy the same design from the hardware solutions; most of them
can't run agents in the real servers, so they implement checks at the
different layers instead. What information does the director need? Only
whether the real service is working properly, and the weight to assign
to this real service. So, we can start an agent in the real server and
register for these checks there: "First: let me know ASAP when this
service is not working properly. Second: send me your weight every
3 seconds." Then the director only needs to wait for the information.
If the information is not received within the specified time, start some
checks, set the weight to 0, etc. For example, why do we need L3 checks
when we see that our agent feeds us with weights or keepalive probes in
time? The L3 checks can enter the game when we try to determine whether
the agent is running in the real server, maybe after performing an
L4/L7 check against our agent.
agent. When we enter the notifications and alerts in this game we
come to this conslusion: I don't wait pages for each real service
fail because when the kernel is crashed I need only this info, i.e.
only one wakeup with the L3 status information: "L3 failed for RS1".
I don't like "L7 failed for DNS in RS1" followed by "L7 failed for
FTP in RS1", ... So, we can run L4 or L7 checks and when they fail
we can try with lower layer checks to determine where is the problem.
If we are sure that our agents are working properly we are sure that the
real services are working properly. When the admin stops all real
services, for example, all httpd daemons, we prefer again only one
page (if the notifications are not blocked in this process of
real service management): "Cluster X is down" instead of 10 pages
for "httpd in WEB1 failed", "httpd in WEB2 failed", etc. May be
we need a way to block the notifications or the cluster in the director.
There are other options: instead of registering for the
weights, we can register for load information and use it in the
director to set the weight based on an expression over these parameters:
one expression for FTP, another for HTTP, etc.
3. Of course, there are other ways to set the weights - when they
are evaluated in the director. This can include decisions based on
response times (from the L7/L4/L3 checks), etc. Not sure how well they
work; I've never implemented such risky tricks.
4. User-defined checks: talk with the real service and analyze the replies.
5. NAT is not the only forwarding method in use. The DR and TUN methods
don't let the director's checks probe the real services properly: the real
server listens on the same VIP, and it is hard to generate packets
in the director with daddr=VIP that avoid the local routing and
reach the real server - they never leave the director. What this means:
we can't check exactly VIP:VPORT in the real server, maybe only
RIP:VPORT? This problem does not exist when the checks are performed
from the real server; for example, the L4 check can be a simple bind()
to VIP:VPORT - "port busy" means the L4 check succeeds. No problem to
perform L7 checks either. Sometimes httpd serves many virtual domains
while bound to 0.0.0.0, so why perform checks for all these VIPs
when we can simply check one of them? Many, many optimizations.
6. Some utilities or ioctls can be brought into the game: ip,
ipchains, or the ioctls they use. This allows complex virtual services
to be created and the fwmark-based virtual services to be supported.
7. Redundancy: keepalive probes to the other directors, failover times, ...
This can be a very, very long discussion :)
> Happy christmas and happy new year,
> Alexandre Cassen
Julian Anastasov <ja@xxxxxx>