Hello,
This is a follow-up to my own previous message.
> > (4) Then I intentionally recover the NIC (eth0) of RS2 by manually running
> > "/etc/init.d/network restart".
> > After a while, LB1 starts sending HTTP packets to both RS1 and RS2 even
> > though the weight of RS2 is still 0. However, LB1 sends far fewer packets
> > to RS2 than to RS1.
> > (This strange behavior continues indefinitely, so I don't think the
> > cause lies only in the retransmit process of the TCP layer.
> > In fact, the strange behavior stops when I stop the high load from CL1.)
>
> Checking with "ipvsadm -Lc", I see many connections in the TIME_WAIT state,
> and the InActConn number seems to reflect them.
> By the way, according to the ip_vs source code (ip_vs_proto_tcp.c),
> the timeout for IP_VS_TCP_S_TIME_WAIT is 2*60*HZ.
> When I changed IP_VS_TCP_S_TIME_WAIT from 2*60*HZ to 10*HZ or so (much
> smaller than 2*60*HZ), the strange behavior seemed to improve.
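To check how the TIME_WAIT entries are distributed across the real servers, the connection table can be tallied per destination and state. The following is only a minimal sketch: the sample text is made up, its column layout is an assumption about what "ipvsadm -Lc" prints, and count_states is a hypothetical helper name.

```python
from collections import Counter

# Made-up sample in the column layout assumed for "ipvsadm -Lc" output.
# The addresses and counts are illustrative only, not real measurements.
SAMPLE = """\
IPVS connection entries
pro expire state       source             virtual            destination
TCP 01:58  TIME_WAIT   10.0.0.5:40001     10.0.0.100:80      192.168.0.11:80
TCP 01:59  TIME_WAIT   10.0.0.5:40002     10.0.0.100:80      192.168.0.12:80
TCP 14:59  ESTABLISHED 10.0.0.5:40003     10.0.0.100:80      192.168.0.11:80
TCP 01:57  TIME_WAIT   10.0.0.5:40004     10.0.0.100:80      192.168.0.11:80
"""


def count_states(text):
    """Count (destination, state) pairs in connection-table text."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        # Skip the header lines and anything that is not a connection entry.
        if len(fields) != 6 or fields[0] not in ("TCP", "UDP"):
            continue
        _proto, _expire, state, _source, _virtual, dest = fields
        counts[(dest, state)] += 1
    return counts


counts = count_states(SAMPLE)
for (dest, state), n in sorted(counts.items()):
    print(dest, state, n)
```

On a real director the text would come from the output of "ipvsadm -Lc" instead of SAMPLE.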
I then tried an extreme setting, reducing IP_VS_TCP_S_TIME_WAIT from the
default 2*60*HZ to 1*HZ.
As I expected, InActConn no longer grew to its maximum,
and the trouble seems to be avoided.
But I don't think 1*HZ is a realistic value for IP_VS_TCP_S_TIME_WAIT.
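A rough way to see why the timeout matters so much: in steady state, the number of entries held in TIME_WAIT is about the rate at which connections close multiplied by how long each entry is kept (Little's law). The sketch below assumes a hypothetical rate of 500 closed connections per second, which is not a measured value, and expresses the three timeouts in seconds (2*60*HZ jiffies is 120 seconds).

```python
def steady_state_entries(conn_per_sec, timeout_sec):
    """Little's law: entries held = arrival rate * time each entry is kept."""
    return conn_per_sec * timeout_sec


RATE = 500  # hypothetical connections closed per second under high load

# 2*60*HZ, 10*HZ and 1*HZ expressed in seconds
for timeout_sec in (2 * 60, 10, 1):
    print(timeout_sec, "s ->", steady_state_entries(RATE, timeout_sec), "entries")
```

Under this assumption the default keeps on the order of a hundred times more inactive entries than the 1*HZ setting, which would match InActConn no longer growing after the change.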
So I'd like to find another, more realistic solution. I wonder whether
Malcolm's advice (using expire_nodest_conn) would help, or whether
a similar approach would be better.
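For reference, expire_nodest_conn is a sysctl under /proc/sys/net/ipv4/vs/ and can be switched on without rebuilding the kernel. The following is a minimal sketch; set_flag is a hypothetical helper name, and writing to the real path needs root on the director. As far as I understand the documentation, the setting acts when the destination server is no longer available (removed from the table), so it may not apply while RS2 is only set to weight 0.

```python
from pathlib import Path

# Sysctl path on a kernel with the ip_vs module loaded; writing needs root.
SYSCTL = Path("/proc/sys/net/ipv4/vs/expire_nodest_conn")


def set_flag(path, enabled):
    """Write 1 or 0 to a sysctl-style file and return the value read back."""
    path = Path(path)
    path.write_text("1\n" if enabled else "0\n")
    return path.read_text().strip()


# Usage on the director, as root:
#   set_flag(SYSCTL, True)
```

The same effect should be obtainable by echoing 1 into that file from a shell.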
By the way, which parameter (or value) in the LVS source code determines
the maximum of InActConn and ActiveConn?
>
> Is IP_VS_TCP_S_TIME_WAIT related to the cause of the trouble?
> I think some timers in LVS are related to this behavior ...?
I'm sorry for so many questions and comments.
Thanks a lot.
--
Hideaki Kondo