On Tue, Oct 30, 2007 at 08:56:36AM +0200, Rumen Bogdanovski wrote:
>
> On Tue, 2007-10-30 at 15:21 +0900, Simon Horman wrote:
> > On Tue, Oct 30, 2007 at 02:22:10AM +0200, Rumen Bogdanovski wrote:
> > > Hi all,
> > > I just saw a patch that sets the proper timeout on the backup for
> > > received connections. I want to share my opinion, since I have been
> > > working on the very same code for the last several days.
> > >
> > > I think setting the timeout to 3 minutes for all received connections
> > > has a very good reason.
> > >
> > > IMHO setting a timeout of [IP_VS_TCP_S_ESTABLISHED] = 15*60*HZ is wrong
> > > since AFAIK there is no way for the master to inform the backup if a
> > > connection is closed or fin_wait or whatever. Connection sending is
> > > based on packet count, isn't it? So imagine a TCP connection lasting 3
> > > seconds which is going to hang on the backup for 15 more minutes. Now
> > > imagine 1000 connections lasting several seconds on the master hanging
> > > for 15 minutes on the backup. I think this timeout should be kept
> > > reasonably low, to keep the number of hanging connections minimal, and
> > > reasonably high, so that entries do not time out before the next update.
> > >
> > > However if the backup takes over it will set the proper timeouts as
> > > defined in "static int xxx_timeouts[IP_VS_XXX_S_LAST+1]" for all the
> > > connections.
> > >
> > > Well, I might be wrong, but I just wanted to draw the attention of the
> > > people who know how everything works to this potential problem :)
> >
> > You are right that increasing the timeout will likely result in an
> > increased number of connections on the backup linux-director. But I'm
> > not entirely convinced that this is a problem as such. Not from a
> > memory point of view anyway. The connection entries themselves are very
> > small (~116 bytes on i386) and even if you have millions of them it's
> > still not a lot of memory.
> >
> > In any case, this is just about changing the default value to something
> > that I believe is a bit more sane, as people have found problems
> > with the current default. I'm still open to the idea of the default
> > being configurable on the backup linux-director and/or transmitted
> > via the synchronisation protocol.
>
> Well, with this value and my patch restricting the number of connections
> on the backup according to the u_threshold, a failover is quite likely to
> end in 15 minutes of connection refusal on the backup while the old
> ones are handled. It depends, but for some services it may simply
> mean 15 minutes of downtime for no reason. Which timeout is acceptable
> depends strongly on the service, so a user-configurable timeout sounds
> reasonable. It should not be that difficult to make it configurable at
> daemon start-up, just like the syncid and syncinterface.
Sure, let's make it so.
> > If, however, there are more serious problems created by having connection
> > entries lying around on the standby server that have been closed on
> > the master server, then that's probably not a problem that can be fixed
> > by twiddling the timeout. It's probably a more fundamental problem.
> >
>
> Well, is there any way to trigger a sync on any state change of the
> connection?
It is possible, though I'm not entirely convinced it would be a good
idea.
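
To put rough numbers on the memory argument quoted above, here is a
back-of-the-envelope sketch in C. It assumes the ~116-byte i386 entry
size mentioned earlier, and it assumes that every connection is synced
and then lingers on the backup for the full timeout. That is an upper
bound, since syncing is based on packet count. The connection rate is an
assumed input, and the helper names are made up for illustration; they
are not part of IPVS.

```c
#include <stdint.h>

/* Assumed figure from this thread: roughly 116 bytes per entry on i386. */
#define CONN_ENTRY_BYTES 116u

/*
 * Hypothetical helper: steady-state number of entries lingering on the
 * backup, taken as (new synced connections per second) * (backup timeout
 * in seconds). This is an upper bound: it assumes every connection is
 * synced and is only removed when its timeout expires.
 */
static uint64_t stale_entries(uint64_t conns_per_sec, uint64_t timeout_secs)
{
    return conns_per_sec * timeout_secs;
}

/* Hypothetical helper: memory held by those lingering entries, in bytes. */
static uint64_t stale_bytes(uint64_t conns_per_sec, uint64_t timeout_secs)
{
    return stale_entries(conns_per_sec, timeout_secs) * CONN_ENTRY_BYTES;
}
```

Under these assumptions, 1000 new connections per second with a 15 minute
timeout gives stale_entries(1000, 15 * 60) = 900000 entries, or roughly
100 MB. With a 3 minute timeout it is 180000 entries, roughly 21 MB.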
--
Horms
H: http://www.vergenet.net/~horms/
W: http://www.valinux.co.jp/en/