Hello Wensong,
> This is good. A RIP or a group of RIPs that acts as a cold standby pool
> for session overload. The schedulers would not select a member of the
> pool (flagged with IP_VS_OVERFLOW_SERVER or something similar) unless all
> members of a VIP have the IP_VS_DEST_F_OVERLOAD flag set. Then the
> schedulers will start using the IP_VS_OVERFLOW_SERVER pool.
Yes, this is the way to go. :)
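To make the idea concrete, here is a minimal sketch of that selection pass
in plain userspace C. Apart from the two flag names quoted above, every
struct and helper name here is my invention, not existing IPVS code:

#include <stddef.h>

#define IP_VS_OVERFLOW_SERVER  0x0001  /* RIP belongs to the standby pool */
#define IP_VS_DEST_F_OVERLOAD  0x0002  /* RIP has hit its upper threshold */

struct dest {
        unsigned int flags;
        struct dest *next;
};

/*
 * Pick a destination: normal RIPs are preferred; the overflow pool is
 * only considered once every normal RIP carries IP_VS_DEST_F_OVERLOAD.
 */
static struct dest *select_dest(struct dest *pool)
{
        struct dest *d;

        /* First pass: regular RIPs only. */
        for (d = pool; d; d = d->next) {
                if (d->flags & IP_VS_OVERFLOW_SERVER)
                        continue;
                if (!(d->flags & IP_VS_DEST_F_OVERLOAD))
                        return d;       /* a normal RIP is still usable */
        }

        /*
         * Every normal RIP is overloaded (or absent), so fall back to
         * the IP_VS_OVERFLOW_SERVER pool. As soon as one normal RIP
         * drops its overload flag, the first pass succeeds again and
         * this pool is not taken anymore.
         */
        for (d = pool; d; d = d->next)
                if (d->flags & IP_VS_OVERFLOW_SERVER)
                        return d;

        return NULL;    /* nothing usable at all */
}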
Forgot one thing: as soon as IP_VS_DEST_F_OVERLOAD is removed from one RIP,
the overflow pool will not be taken anymore. As I have such a setup running
with various customers, I can tell you from experience that you can run
into two problems here which I haven't solved yet:
1. Problem: RIP pool <---> overflow pool ping-pong under long-lasting peak
load.
You have a peak load which results in the overflow pool being activated.
Now, because the lower threshold is close to the upper threshold, one or
two of the RIPs come back quickly while the peak load is not over yet. You
end up in a RIP pool <---> overflow pool ping-pong until the peak flattens
out again.
2. Problem: persistence inheritance on overflow pools yields bad behaviour
when you monitor a site.
Imagine you set up a service which uses persistence. In the current design
the overflow pool servers inherit the persistence flag. This is of course
extremely bad when you monitor the site. Why? Because if you monitor with a
1 minute interval and the session template timeout is higher than 1 minute,
you will get a session overload page even though all RIPs are already back
and functional. This is my biggest problem in 2.2.x right now :)
1. Solution:
We can say that we don't care because this is a rather hypothetical case,
and do nothing about it. Or we include a means to set when, or better, how
many RS need to be back before we switch to the RIP pool again (see the
sketch right after this item).
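A minimal sketch of such a knob, reusing the hypothetical struct dest from
the sketch above; "min_back" and the helper are invented names, not
existing ipvsadm options:

/*
 * Leave overflow mode only once at least min_back normal RIPs have
 * dropped IP_VS_DEST_F_OVERLOAD. With min_back > 1 a single RIP that
 * briefly dips below its lower threshold no longer flips the service
 * out of the overflow pool, which damps the ping-pong during a peak.
 */
static int enough_rips_back(const struct dest *pool, int min_back)
{
        const struct dest *d;
        int back = 0;

        for (d = pool; d; d = d->next) {
                if (d->flags & IP_VS_OVERFLOW_SERVER)
                        continue;
                if (!(d->flags & IP_VS_DEST_F_OVERLOAD))
                        back++;
        }
        return back >= min_back;
}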
2. Solution:
I think the only solution to this is to have no inheritance of persistence
on the overflow pool servers. I would even go as far as not allowing
persistence for overflow servers at all.
Another possibility is that the persistence flag is set through inheritance
but is discarded when some settable flag is present (sketched below).
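Roughly like this, again reusing struct dest from above.
IP_VS_SVC_F_PERSISTENT is the real service flag; the discard flag and the
helper are only illustrative:

#define IP_VS_SVC_F_PERSISTENT     0x0001  /* existing service flag */
#define IP_VS_SVC_F_NOPERSIST_OVFL 0x8000  /* invented "discard" flag */

struct service {
        unsigned int flags;
};

/*
 * Decide whether a connection template may be created for this hit.
 * When the chosen destination sits in the overflow pool and the discard
 * flag is set, persistence is not inherited, so a monitoring probe that
 * arrives during the peak is not pinned to the overload page.
 */
static int may_use_persistence(const struct service *svc,
                               const struct dest *d)
{
        if (!(svc->flags & IP_VS_SVC_F_PERSISTENT))
                return 0;
        if ((d->flags & IP_VS_OVERFLOW_SERVER) &&
            (svc->flags & IP_VS_SVC_F_NOPERSIST_OVFL))
                return 0;
        return 1;
}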
What do you think?
> I see your method in 2.2.x, but it is not good to change weight directly
> inside the kernel. ;)
It works for me :), but I assume you're right. I had to do something to
please our customers or they would have switched back to a Cisco load
balancer. Now I have to solve the persistence inheritance problem in 2.2.x
for my scheduler. Unfortunately persistence is not scheduler-related, so I
need to fiddle with the core.
Take care,
Roberto Nibali, ratz
--
echo '[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq' | dc