Re: PATCH: synchronisation and active/inactive connection count

To: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: PATCH: synchronisation and active/inactive connection count
From: Wensong Zhang <wensong@xxxxxxxxxxxx>
Date: Thu, 27 Jun 2002 22:41:48 +0800 (CST)

Hi Horms,

On Wed, 26 Jun 2002, Horms wrote:

> 
> I believe that there is a minor bug in LVS 1.0.3 such that if stale
> information is received by the synchronisation thread the
> inactive and active connection counters may become inaccurate.
> 
> More specifically, a connection's entry in the hash table
> may change from being marked inactive to active. However the
> active and inactive connection counters for the connection's
> destination are not incremented and decremented accordingly.
> 
> Later, when the connection's entry is removed from the hash table the
> active connection counter will be decremented and the inactive
> connection counter will be left unchanged. Thus the former becomes one
> lower than it should be, and the latter remains one higher than it
> should be.
> 

The connection entries created by the synchronization mechanism always
have a NULL dest server pointer (i.e. cp->dest is NULL). When cp->dest
is NULL, the entry does not participate in the server's active/inactive
connection counting.

I just checked the ip_vs_sync.c code and found that it does not check
cp->dest (a non-NULL cp->dest means a normal connection) before updating
the state, which may cause the problem. For example, take a primary/backup
pair of load balancers, lb1 and lb2. First lb1 is active; a connection is
created, bound to the selected server, and synchronized to lb2. Then lb1
fails and lb2 takes over, so the connection continues through lb2; lb1
comes back and works as the backup. Just after the connection changes its
state (such as from ESTABLISHED/ACTIVE to INACTIVE), it is synchronized
from lb2 to lb1. The connection entry at lb1 still points to the selected
server, so changing its state directly makes that server's
active/inactive connection counts incorrect.
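
The sequence above can be modelled in userspace roughly as follows. This is
only a sketch under stated assumptions: the model_* types and functions are
hypothetical stand-ins invented for illustration, not the kernel's structures
or the code in ip_vs_sync.c, and the counters are plain 32-bit unsigned
integers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not the kernel's definitions. */
struct model_dest {
    uint32_t activeconns;
    uint32_t inactconns;
};

struct model_conn {
    struct model_dest *dest; /* NULL for entries created by the sync code */
    int inactive;            /* 0 = counted as active, 1 = counted as inactive */
};

/* Bind a new connection to a server and count it as active. */
static void model_bind(struct model_conn *cp, struct model_dest *dest)
{
    assert(cp != NULL && dest != NULL);
    cp->dest = dest;
    cp->inactive = 0;
    dest->activeconns++;
}

/* Unguarded sync update: overwrites the state flag without moving
 * the connection from one counter to the other. */
static void model_sync_update_unguarded(struct model_conn *cp, int inactive)
{
    assert(cp != NULL);
    cp->inactive = inactive;
}

/* Expiry: decrement whichever counter the flag currently selects. */
static void model_expire(struct model_conn *cp)
{
    assert(cp != NULL);
    if (cp->dest == NULL)
        return; /* sync-created entries are not counted */
    if (cp->inactive)
        cp->dest->inactconns--;
    else
        cp->dest->activeconns--;
    cp->dest = NULL;
}
```

Calling model_bind(), then model_sync_update_unguarded(cp, 1), then
model_expire() on a server whose counters start at zero leaves activeconns
at 1 and wraps inactconns around to UINT32_MAX in this model.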

I haven't set up an environment to reproduce this problem. Horms, have you
experienced the problem in this way?

> This may lead to something along the lines of:
> 
> Prot LocalAddress:Port Scheduler Flags
>   -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
> TCP  vs0:http wlc
>     -> b1:http                      Route   1    4294967296 1
> 
> When in fact the last line should be:
> 
>     -> b1:http                      Route   1    0          0       
> 
> 
> Though this bug is non-fatal and unlikely to occur, I think that
> it is still worth applying a fix. I have attached a patch which
> should resolve this problem.
> 
> 

Your fix is probably not correct. It updates cp->dest's active/inactive
connection counters directly, but cp->dest may be NULL, which would lead
to a NULL pointer dereference.

I prefer that if cp->dest is not NULL, which means the entry was not
created by the sync daemon, we do not update its state. Please see/test
the attached fix.
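
To illustrate the idea only (this is not the attached diff), a userspace
sketch of such a guard might look like the following. The sketch_* names
are hypothetical and the structures are simplified stand-ins for the real
ones.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel's definitions. */
struct sketch_dest {
    unsigned int activeconns;
    unsigned int inactconns;
};

struct sketch_conn {
    struct sketch_dest *dest; /* NULL for entries created by the sync code */
    int inactive;             /* 0 = active, 1 = inactive */
};

/* Apply a synchronized state change only to sync-created entries.
 * Returns 1 if the state was updated, 0 if the update was skipped. */
static int sketch_sync_update(struct sketch_conn *cp, int inactive)
{
    assert(cp != NULL);
    if (cp->dest != NULL)
        return 0; /* normal connection: leave state and counters alone */
    cp->inactive = inactive;
    return 1;
}
```

Because a normal connection is skipped, cp->dest is never dereferenced
here and the server's counters stay consistent with the flag they were
last updated for.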

Thanks,

Wensong

Attachment: ip_vs_sync.diff
Description: Text document
