To: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Subject: [lvs-users] Expired LVS connections cause hanging connections on real server
From: Erik van Pienbroek <erik@xxxxxxxxxxxxxxx>
Date: Fri, 09 Dec 2011 15:13:44 +0100

We are running an LVS environment based on RHEL 5.3 with two real
servers that provide LDAP services using Red Hat Directory Server.

We are currently having an issue where TCP connections forwarded from
the LVS server to the real servers remain stuck in the ESTABLISHED
state after some time and are never cleaned up. These hanging
connections are observed on the real servers. The connections are
cleaned up properly on the client (netstat) and on the LVS server
(ipvsadm --list). Because of this behavior we have to restart the LDAP
server periodically to make sure it doesn't run into a 'too many open
files' situation.

After investigation we've found that this issue seems to be caused by
the expire mechanism used by LVS. Once a connection from a client to a
real server (redirected by LVS) has been idle for 15 minutes, it is
automatically dropped from the LVS connection pool. However, LVS sends
no TCP FIN packets to properly close the connection.

As the clients frequently try to perform LDAP requests, a client will
automatically get a TCP RST packet once it tries to send data to the LVS
server on a connection which has just expired. In this situation the
client cleans up the connection on its side and tries again.

However, the LDAP server itself never sends any data on its own to a
connected client. It will only respond to client requests. So in that
situation the LDAP server will never find out that the connection was
expired by the LVS server and will happily assume that the connection is
still there. This will cause the LDAP server to run out of open file
descriptors after some time because expired connections are never
cleaned up.

Now I've seen there are various workarounds for this situation which can
be applied on the LDAP servers (like enabling an idle timeout in the
LDAP server itself or enabling TCP keepalive), but to me the real cause
is LVS as it is the component which decides to drop open connections
without proper notifications to both sides (the TCP FIN packet).
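
For reference, the TCP keepalive workaround mentioned above could look
roughly like the sketch below. This is an illustration only, not how
Red Hat Directory Server configures it: the helper name
enable_keepalive and the values (10 minutes idle, 60 second probe
interval, 3 probes) are assumptions, chosen so that the first probe is
sent before the 15 minute LVS expiry. The TCP_KEEP* options are
Linux-specific, so the code skips any that the platform does not expose.

```python
import socket

# The 15 minute idle expiry described above, in seconds.
LVS_IDLE_TIMEOUT = 15 * 60


def enable_keepalive(sock, idle=600, interval=60, count=3):
    """Enable TCP keepalive on sock (hypothetical helper, assumed values).

    idle:     seconds of inactivity before the first probe is sent
    interval: seconds between unanswered probes
    count:    unanswered probes before the kernel drops the connection
    """
    if idle >= LVS_IDLE_TIMEOUT:
        raise ValueError("idle must be shorter than the LVS expiry timeout")

    # Portable part: switch keepalive on for this socket.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Linux-specific tuning; silently skipped where an option is missing.
    for name, value in (
        ("TCP_KEEPIDLE", idle),
        ("TCP_KEEPINTVL", interval),
        ("TCP_KEEPCNT", count),
    ):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
    return sock


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print("keepalive enabled:", s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

With settings like these the real server either gets an answer to its
probe or, after the probes go unanswered or are reset, has the kernel
close the socket, which frees the file descriptor. It does not change
the fact that LVS expires the entry without notifying either side.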

Is it possible to have LVS automatically send TCP FIN packets (or
something similar) to both sides of a connection before dropping it from
the connection pool after an expire timeout?

Is it really required to enable TCP keepalive in services running on the
real servers to avoid having connections remain stuck in the ESTABLISHED
state?

Kind regards,

Erik van Pienbroek
