Re: PATCH: synchronisation and active/inactive connection count

To: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: PATCH: synchronisation and active/inactive connection count
From: Horms <horms@xxxxxxxxxxxx>
Date: Fri, 28 Jun 2002 15:53:51 +0900
Hi Wensong,

On Thu, Jun 27, 2002 at 10:41:48PM +0800, Wensong Zhang wrote:
> On Wed, 26 Jun 2002, Horms wrote:
> 
> > 
> > I believe that there is a minor bug in LVS 1.0.3 such that if stale
> > information is received by the synchronisation thread the
> > inactive and active connection counters may become inaccurate.
> > 
> > More specifically, a connection's entry in the hash table
> > may change from being marked inactive to active. However the
> > active and inactive connection counters for the connection's
> > destination are not incremented and decremented accordingly.
> > 
> > Later, when the connection's entry is removed from the hash table the
> > active connection counter will be decremented and the inactive
> > connection counter will be left unchanged. Thus the former becomes one
> > lower than it should be, and the latter remains one higher than it
> > should be.
> > 
> 
> The connection entries created by the synchronization mechanism always 
> have their dest server pointer NULL (i.e. cp->dest is NULL). When cp->dest 
> is NULL, it will not participate in server active/inactive connection 
> counting.

You are right, I forgot to check whether cp->dest is NULL, which
could cause very bad things to happen.
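
To make the drift described in my original report concrete, here is a small user-space sketch. The struct and function names are hypothetical stand-ins that I made up for illustration; they are not the real ip_vs structures or the code in ip_vs_sync.c. It only models the sequence: a connection is counted as inactive, a sync update flips it to active without touching the counters, and then the connection is removed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT the real ip_vs kernel types. */
struct dest { int activeconns; int inactconns; };
struct conn { struct dest *dest; int inactive; /* 1 = counted as inactive */ };

/* Bind a connection to a destination and count it once. */
static void conn_bind(struct conn *cp, struct dest *d, int inactive)
{
    assert(d != NULL);
    cp->dest = d;
    cp->inactive = inactive;
    if (inactive)
        d->inactconns++;
    else
        d->activeconns++;
}

/* On removal, decrement whichever counter the flag currently selects. */
static void conn_unbind(struct conn *cp)
{
    if (cp->dest == NULL)
        return;                 /* entry never took part in counting */
    if (cp->inactive)
        cp->dest->inactconns--;
    else
        cp->dest->activeconns--;
    cp->dest = NULL;
}

/* The problematic update: the flag changes, the counters do not. */
static void sync_update_unchecked(struct conn *cp, int inactive)
{
    cp->inactive = inactive;
}
```

Binding one connection as inactive, calling sync_update_unchecked() to mark it active, and then unbinding it leaves activeconns at -1 and inactconns at 1, even though no connections remain.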

> I just checked the ip_vs_sync.c code and found that it doesn't check
> cp->dest (a connection is a normal one if cp->dest is not NULL) before
> updating the state, which may cause the problem. For example, take two
> primary/backup load balancers (lb1 and lb2). At first lb1 is active;
> a connection is created, pointed at the selected server, and
> synchronized to lb2. Then lb1 fails and lb2 takes over, so the
> connection can continue through lb2; lb1 comes back and works as the
> backup. Just after the connection changes state (such as from
> ESTABLISHED/ACTIVE to INACTIVE), it is synchronized from lb2 to lb1.
> The connection at lb1 still points to the selected server, so changing
> its state directly makes the server's active/inactive connection
> counts incorrect.
> 
> I haven't set up an environment to reproduce this problem. Horms, have you 
> experienced the problem in this way?

More or less.

Your patch appears to work well. Thanks.
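
For anyone following the thread without the patch to hand, the kind of check being discussed might look roughly like the sketch below. This is not the patch itself, and the names are hypothetical user-space stand-ins rather than the real ip_vs structures. The assumption is simply that a state change arriving via synchronisation should move the destination's counters when cp->dest is set, and leave them alone when it is NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT the real ip_vs kernel types. */
struct dest { int activeconns; int inactconns; };
struct conn { struct dest *dest; int inactive; /* 1 = counted as inactive */ };

/*
 * Apply a state received from the sync peer.
 * If the entry is bound to a destination and the active/inactive
 * status really changes, move one count between the two counters
 * so that a later removal decrements the right one.
 */
static void sync_update_checked(struct conn *cp, int inactive)
{
    struct dest *d;

    assert(cp != NULL);
    d = cp->dest;
    if (d != NULL && cp->inactive != inactive) {
        if (inactive) {
            d->activeconns--;
            d->inactconns++;
        } else {
            d->inactconns--;
            d->activeconns++;
        }
    }
    /* Entries with a NULL dest only have their state updated. */
    cp->inactive = inactive;
}
```

An alternative under the same assumption would be to skip the update entirely when cp->dest is not NULL; which of the two is preferable depends on whether the receiving director should trust its own view of a normal connection over the peer's.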

Interestingly, I note that if a backup Linux director takes over a
connection, its connection counters are not updated accordingly.
This is not a big problem, and doesn't cause any negative counts to
crop up, but may be worth fixing at some stage. What are your
thoughts?


-- 
Horms
        

