Re: Our LVS/DR backends freeze

To: Olle Östlund <olle@xxxxxxxxxxx>, Horms <horms@xxxxxxxxxxxx>
Subject: Re: Our LVS/DR backends freeze
Cc: " users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
From: Joseph Mack NA3T <jmack@xxxxxxxx>
Date: Tue, 28 Nov 2006 06:49:11 -0800 (PST)
On Tue, 28 Nov 2006, Olle Östlund wrote:

This is our default ulimit -a:

seems normal enough. Anyhow it would be the same whether it was a standalone server or a realserver in an LVS.

They shouldn't slowly drop.

All connections should be gone in the time the clients are
connected + FIN_WAIT. How long does a client hold a tomcat
connection? seconds, minutes, hours?

Hmmmm. This is an area I'm not very confident about. The fact is that
the ipvsadm-reported "active connections" do drop very slowly, and the
"inactive connections" seem never to drop. I thought this was a result
of having the ldirectord/ipvsadm "quiescent" attribute set to true.

      quiescent = [yes|no]

      If yes, then when real or failback servers are determined to be
      down, they are not actually removed from the kernel's LVS table.
      Rather, their weight is set to zero, which means that no new
      connections will be accepted. This has the side effect that if the
      real server has persistent connections, new connections from any
      existing clients will continue to be routed to the real server
      until the persistent timeout expires.
      See ipvsadm for more information on persistent connections.

      If no, then the real or failback servers will be removed from the
      kernel's LVS table. The default is yes.

Exactly what is meant by "the persistent timeout expires" I don't
know. What persistence?

You will have persistence if you used the -p option to ipvsadm when setting up the virtual service (check with `ipvsadm -Ln` - I don't see it in the output below).
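For illustration, here is a sketch of how a persistent virtual service would be set up with ipvsadm. The VIP 192.168.0.100:8080 and RIP 10.0.0.1:8080 are placeholders, not addresses from this thread:

```shell
# Hypothetical example: a virtual service WITH persistence.
# -p 300 keeps a client pinned to the same realserver for 300s.
ipvsadm -A -t 192.168.0.100:8080 -s wrr -p 300
# -g = direct routing (LVS-DR), -w 10 = weight 10
ipvsadm -a -t 192.168.0.100:8080 -r 10.0.0.1:8080 -g -w 10

# A persistent service shows "persistent 300" on its line in:
ipvsadm -Ln
```

A non-persistent service (no -p) shows no such flag, which is what the listing below suggests.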

I have tried to find a correspondence between the ipvsadm "active
connections" and "inactive connections" numbers and what the realservers
report, but things do not match up. My conclusion is that the
ipvsadm figures are strictly a memory of figures for the load-balancing
algorithms to work with, and that they do not correspond to
realtime/real-world figures of actual connections.

Yes, they are estimates based on a table of timeouts held by the director. In LVS-DR the director doesn't see the CLOSE from the realserver - it goes directly to the client. In LVS-NAT the ActConn/InActConn figures are accurate.
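The timeout table the director uses to age these entries can be inspected directly (a sketch; the values shown are the usual defaults, yours may differ):

```shell
# Print the TCP / TCP-FIN / UDP timeouts the director uses when
# estimating ActiveConn/InActConn (defaults are commonly 900 120 300):
ipvsadm -L --timeout
```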

If ipvsadm reports 700 "active connections" to a
realserver, netstat on the realserver typically reports less than half
the figure (netstat -t -n | wc -l ==> 227 connections).
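To make that comparison concrete, one could run something like the following pair of commands; port 8080 is a placeholder for the tomcat port, not taken from the thread:

```shell
# On the director: the per-realserver ActiveConn/InActConn estimates
ipvsadm -Ln

# On the realserver: actual established TCP connections to the
# service port (8080 is a hypothetical tomcat port)
netstat -t -n | grep ':8080' | grep -c ESTABLISHED
```

In LVS-DR a gap between the two is expected, since the director never sees the realserver's side of the teardown.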

This is a typical output from ipvsadm:

IP Virtual Server version 1.2.0 (size=4096)
Prot LocalAddress:Port Scheduler Flags
 -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP wrr
 ->             Route   10     722        32908
 ->             Route   10     793        34342
TCP wrr
 ->               Route   10     43         2660
 ->               Route   10     47         2510
TCP wrr
 ->              Route   10     0          4
 ->              Route   10     0          4

Where are you measuring the number of connections? with
ipvsadm on the director or with netstat on the realserver?

I was referring to the ipvsadm figures.

This seems a large number of InActConn for the number of ActiveConn. Once you change the weight of the realserver to 0, the InActConn should all go away within FIN_WAIT (about 2 mins). Does this happen?
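That test could be run roughly as follows (VIP/RIP are placeholders; -g assumes the LVS-DR forwarding used in this thread):

```shell
# Quiesce one realserver by setting its weight to 0, then watch
# the InActConn column: it should drain within ~2 minutes if the
# FIN_WAIT aging is working.
ipvsadm -e -t 192.168.0.100:8080 -r 10.0.0.1:8080 -g -w 0
watch -n 10 ipvsadm -Ln
```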

If the client/server holds the connection, then ActiveConn will not drop till they disconnect. If InActConn is behaving properly, the above numbers put the connection time for the client at 120s * (722/32908) ≈ 2.6 secs. This seems short: the server (I thought) held the connection open in case the client makes a subsequent request.

Is this a problem of the director not seeing the CLOSE from the client?


Joseph Mack NA3T EME(B,D), FM05lw North Carolina
jmack (at) wm7d (dot) net - azimuthal equidistant map
generator at
Homepage It's GNU/Linux!
