Re: load balancing trouble at a high load

To: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: load balancing trouble at a high load
From: Hideaki Kondo <kondo.hideaki@xxxxxxxxxxxxx>
Date: Fri, 26 May 2006 13:56:34 +0900

I am replying to my own earlier message.

> > (4) Then I intentionally bring the NIC (eth0) of RS2 back up by manually
> >    running "/etc/init.d/network restart".
> >    After a while, LB1 starts sending HTTP packets to both RS1 and RS2, even
> >    though the weight of RS2 is still 0. Moreover, LB1 sends far fewer
> >    packets to RS2 than to RS1.
> >   (This strange behavior continues permanently, so I don't think the cause
> >    is necessarily the retransmit process of the TCP layer.
> >    In fact, the strange behavior stops when I stop the high load from CL1.)
> Checking with "ipvsadm -Lc", I see many connections in the TIME_WAIT state,
> and the InActConn count seems to reflect them.
> By the way, referring to the ip_vs source code (ip_vs_proto_tcp.c),
> when I changed IP_VS_TCP_S_TIME_WAIT from 2*60*HZ to 10*HZ or so (much
> smaller than 2*60*HZ), the strange behavior seemed to improve.

I tried setting IP_VS_TCP_S_TIME_WAIT (default: 2*60*HZ) to the extreme
value of 1*HZ. As I expected, InActConn no longer climbed to its maximum,
and this seems to avoid the trouble.
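
As a rough sketch of why the shorter timeout helps: if every closed
connection stays in TIME_WAIT for the full timeout, the steady-state number
of such entries is roughly the connection rate multiplied by the timeout
(Little's law). The HZ value and the connection rate below are assumptions
for illustration, not measurements from my test, and the function names are
hypothetical.

```python
# Back-of-envelope estimate of TIME_WAIT entries held by the director.
# HZ is the kernel tick rate; 1000 is an assumed value and cancels out
# when jiffies are converted to seconds.
HZ = 1000


def timeout_seconds(timeout_jiffies, hz=HZ):
    """Convert a timeout expressed in jiffies (e.g. 2*60*HZ) to seconds."""
    return timeout_jiffies / hz


def steady_state_entries(conn_per_sec, timeout_jiffies, hz=HZ):
    """Little's law: entries in a state = arrival rate * time spent in it."""
    return conn_per_sec * timeout_seconds(timeout_jiffies, hz)


if __name__ == "__main__":
    rate = 1000  # hypothetical: new HTTP connections per second from CL1
    print(steady_state_entries(rate, 2 * 60 * HZ))  # default timeout
    print(steady_state_entries(rate, 1 * HZ))       # reduced timeout
```

Under these assumptions the default timeout keeps about 120 times as many
TIME_WAIT entries alive as the 1*HZ setting, which would match the drop in
InActConn that I observed.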

But I think 1*HZ is probably not a realistic value for
IP_VS_TCP_S_TIME_WAIT, so I would like to find a more realistic solution.
I wonder whether Malcolm's advice (using expire_nodest_conn) would help,
or whether a similar approach would be better.
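
For reference, expire_nodest_conn is an IPVS sysctl under
/proc/sys/net/ipv4/vs/. As far as I understand the IPVS sysctl
documentation, enabling it makes the director expire a connection entry
when a packet arrives for a destination that is no longer available. A
minimal sketch for reading and setting it follows, assuming the ip_vs
module is loaded and the script runs as root; the helper names are
hypothetical.

```python
from pathlib import Path

# Location of the IPVS sysctl; the file exists only while ip_vs is loaded.
EXPIRE_NODEST_CONN = Path("/proc/sys/net/ipv4/vs/expire_nodest_conn")


def read_flag(path=EXPIRE_NODEST_CONN):
    """Return the current value of the flag as an integer (0 or 1)."""
    return int(Path(path).read_text().strip())


def write_flag(value, path=EXPIRE_NODEST_CONN):
    """Set the flag to 0 or 1; writing the real sysctl requires root."""
    if value not in (0, 1):
        raise ValueError("expire_nodest_conn accepts only 0 or 1")
    Path(path).write_text(f"{value}\n")


if __name__ == "__main__":
    if EXPIRE_NODEST_CONN.exists():
        print("expire_nodest_conn =", read_flag())
    else:
        print("ip_vs does not appear to be loaded")
```

The path argument is there only so that the helpers can be tried against an
ordinary file before touching the real sysctl.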

By the way, which parameter (or value) in the LVS source code sets the
maximum of InActConn and ActiveConn?

> Is IP_VS_TCP_S_TIME_WAIT related to the cause of the trouble?
> I think some timers in LVS are related to this behavior ...?

I'm sorry for my many questions and comments.
Thanks a lot.

Hideaki Kondo
