[lvs-devel] Sync connection timeout patch by Andy Gospodarek

Subject: [lvs-devel] Sync connection timeout patch by Andy Gospodarek
From: rumen at (Rumen Bogdanovski)
Date: Tue, 30 Oct 2007 08:56:36 +0200
On Tue, 2007-10-30 at 15:21 +0900, Simon Horman wrote:
> On Tue, Oct 30, 2007 at 02:22:10AM +0200, Rumen Bogdanovski wrote:
> > Hi all,
> > I just saw a patch that sets the proper timeout on the backup for
> > received connections. I want to share my opinion, since I have been
> > messing with the very same code for the last several days.
> > 
> > I think setting the timeout to 3 minutes for all received connections
> > has a very good reason.
> > 
> > IMHO setting a timeout of [IP_VS_TCP_S_ESTABLISHED] = 15*60*HZ is wrong
> > since AFAIK there is no way for the master to inform the backup if a
> > connection is closed or fin_wait or whatever. Connection sending is
> > based on packet count, isn't it? So imagine a TCP connection lasting 3
> > seconds which is going to hang on the backup for 15 more minutes. Now
> > imagine 1000 connections lasting several seconds on the master hanging
> > for 15 minutes on the backup. I think this timeout should be kept
> > reasonably low, to keep the number of hanging connections minimal, and
> > reasonably high, so that entries do not time out before the next update.
> > 
> > However if the backup takes over it will set the proper timeouts as
> > defined in "static int xxx_timeouts[IP_VS_XXX_S_LAST+1]" for all the
> > connections.
> > 
> > Well, I might be wrong, but I just wanted to draw the attention of the
> > people who know how everything works to this potential problem :)
> You are right that increasing the timeout will likely result in an
> increased number of connections on the backup linux-director. But I'm
> not entirely convinced that this is a problem as such. Not from a
> memory point of view anyway. The connection entries themselves are very
> small (~116 bytes on i386) and even if you have millions of them it's
> still not a lot of memory.
> In any case, this is just about changing the default value to something
> that I believe is a bit more sane, as people have found problems
> with the current default. I'm still open to the idea of the default
> being configurable on the backup linux-director and/or transmitted
> via the synchronisation protocol.
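
To put rough numbers on the trade-off discussed above, here is a minimal
sketch in plain userspace C (not kernel code). It assumes the ~116 figure
quoted above is in bytes, and it assumes a steady rate of new connections
synced to the backup, each held for the full timeout; the function names
and the example rate are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Approximate size of one connection entry on i386, as quoted in the
 * thread (assumed here to be in bytes). */
#define CONN_ENTRY_BYTES 116ULL

/* Entries held on the backup in steady state: new synced connections per
 * second multiplied by how long each entry lives on the backup.
 * Assumption: every entry stays for the full timeout, because the backup
 * is never told that the connection was closed on the master. */
static uint64_t backup_entries(uint64_t conns_per_sec, uint64_t timeout_sec)
{
    return conns_per_sec * timeout_sec;
}

/* Memory taken by that many entries, guarding against overflow. */
static uint64_t backup_bytes(uint64_t entries)
{
    assert(entries <= UINT64_MAX / CONN_ENTRY_BYTES);
    return entries * CONN_ENTRY_BYTES;
}
```

With a hypothetical 1000 new connections per second, a 3 minute timeout
(180 s) keeps 180000 entries on the backup and a 15 minute timeout (900 s)
keeps 900000, which is about 104 MB at 116 bytes per entry. So the memory
is modest, but the entry count matters if something limits connections on
the backup.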

Well, with this value and my patch restricting the number of connections
on the backup according to u_threshold, it is quite likely to end up with
15 minutes of connection refusal on the backup if a failover occurs, while
the old entries are still being handled. It depends, though: for some
services it may simply mean 15 minutes of downtime for no reason. Which
timeout is acceptable depends strongly on the service, so a
user-configurable timeout sounds reasonable. It should not be that
difficult to make it configurable at daemon start-up, just like the syncid
and the sync interface.
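
A minimal sketch of what such a start-up option could look like, in plain
userspace C rather than kernel code. The struct, its fields and the
function name are hypothetical and are not taken from the IPVS sources;
only the 3 minute default comes from this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Current behaviour described in the thread: 3 minutes for every
 * connection received by the backup. */
#define DEFAULT_SYNC_TIMEOUT_SEC (3 * 60)

/* Hypothetical start-up parameters of the backup sync daemon. */
struct sync_daemon_cfg {
    int syncid;                    /* sync ID given at daemon start */
    char ifname[16];               /* sync interface name */
    unsigned int conn_timeout_sec; /* 0 means "use the default" */
};

/* Timeout to apply to a connection entry received from the master. */
static unsigned int backup_conn_timeout(const struct sync_daemon_cfg *cfg)
{
    assert(cfg != NULL);
    return cfg->conn_timeout_sec ? cfg->conn_timeout_sec
                                 : DEFAULT_SYNC_TIMEOUT_SEC;
}
```

Leaving the value at 0 keeps today's 3 minutes, so existing setups would
not change unless the administrator asks for a different timeout.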

> If, however, there are more serious problems created by having connection
> entries lying around on the standby server that have been closed on
> the master server, then that's probably not a problem that can be fixed
> by twiddling the timeout. It's probably a more fundamental problem.

Well, is there any way to make a sync on any state change of the
