Hi Wensong,
On Thu, Jun 27, 2002 at 10:41:48PM +0800, Wensong Zhang wrote:
> On Wed, 26 Jun 2002, Horms wrote:
>
> >
> > I believe that there is a minor bug in LVS 1.0.3 such that if stale
> > information is received by the synchronisation thread the
> > inactive and active connection counters may become inaccurate.
> >
> > More specifically, a connection's entry in the hash table
> > may change from being marked inactive to active. However the
> > active and inactive connection counters for the connection's
> > destination are not incremented and decremented accordingly.
> >
> > Later, when the connection's entry is removed from the hash table the
> > active connection counter will be decremented and the inactive
> > connection counter will be left unchanged. Thus the former becomes one
> > lower than it should be, and the latter remains one higher than it
> > should be.
> >
>
> The connection entries created by the synchronization mechanism always
> have their dest server pointer NULL (i.e. cp->dest is NULL). When cp->dest
> is NULL, it will not participate in server active/inactive connection
> counting.
You are right. I forgot to check whether cp->dest is NULL, and that
omission could cause very bad things to happen.
> I just checked the ip_vs_sync.c code, and found that it didn't check the
> cp->dest (it is a normal connection if cp->dest is not NULL) before
> updating the state, it may cause the problem. For example, there are two
> primary/backup load balancers (lb1 and lb2), first the lb1 is active,
> there is a connection created and pointed to the selected server, and the
> connection is synchronized to the lb2. Then, the lb1 fails and the lb2
> takes over, the connection can continue through the lb2; the lb1 comes
> back and works as the backup. Just after the time the connection changes
> its state (such as from ESTABLISHED/ACTIVE to INACTIVE), the connection is
> synchronized from the lb2 to the lb1. The connection at the lb1 still
> points to the selected server, the directly changing state of this
> connection will make the server active/inactive connection counting not
> correct.
>
> I haven't set up an environment to reproduce this problem. Horms, have you
> experienced the problem in this way?
More or less.
Your patch appears to work well. Thanks.
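For the archives, here is a minimal userspace sketch of the accounting
as I understand it. The struct and function names are hypothetical
stand-ins, not the real ip_vs definitions and not your actual patch.
It only illustrates one possible approach: when a sync message changes
the state of a connection whose cp->dest is not NULL, move it between
the destination's two counters, so that the later removal decrements
the right one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures (assumption). */
struct dest {
    int activeconns;   /* connections currently counted as active */
    int inactconns;    /* connections currently counted as inactive */
};

struct conn {
    struct dest *dest; /* NULL for entries created purely by sync */
    int inactive;      /* 1 = counted as inactive, 0 = counted as active */
};

/* Apply a state change that arrived in a sync message. */
static void sync_set_inactive(struct conn *cp, int inactive)
{
    /* Only normal connections (dest != NULL) take part in counting. */
    if (cp->dest != NULL && cp->inactive != inactive) {
        if (inactive) {
            assert(cp->dest->activeconns > 0);
            cp->dest->activeconns--;
            cp->dest->inactconns++;
        } else {
            assert(cp->dest->inactconns > 0);
            cp->dest->inactconns--;
            cp->dest->activeconns++;
        }
    }
    cp->inactive = inactive;
}

/* Remove an entry: decrement whichever counter it is counted in. */
static void conn_expire(struct conn *cp)
{
    if (cp->dest == NULL)
        return;
    if (cp->inactive)
        cp->dest->inactconns--;
    else
        cp->dest->activeconns--;
}
```

With the counters moved at the time of the state change, expiring the
entry leaves both counters where they started, rather than one too low
and one too high.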
Interestingly, I note that if a backup Linux director takes over a
connection, its connection counters are not updated accordingly.
This is not a big problem, and doesn't cause any negative counts to
crop up, but may be worth fixing at some stage. What are your
thoughts?
--
Horms