
[PATCH] Current weight not resetting when updating

To: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Subject: [PATCH] Current weight not resetting when updating
Cc: Joseph Mack <mack.joseph@xxxxxxx>
From: Kees Hoekzema <kees@xxxxxxxxxxxx>
Date: Thu, 3 Feb 2005 22:25:10 +0100

Hi Joseph,

On Wednesday 02 February 2005 06:09, Joseph Mack wrote:
>  It seems you've found a bug here. Hopefully someone will look into it.
It seems like a bug to me too. I did some additional debugging:
the weighted round robin algorithm decides whether to use a server based 
on the 'current weight' (cw). But if you edit the weight of a 
server /or/ delete a server, the current weight is not changed. 

An example:
I have two servers, one with a weight of 100 and one with an initial 
weight of 25 (let's say one is a quad CPU system and the other is a single 
CPU system). Unfortunately the quad CPU system fails after the first 
request and is taken out of service, so we are stuck with just one server 
with a weight of 25.

The next request comes in; the current weight, still 100 from the first 
request, is lowered by 25 and is now 75. But the only server left has a 
weight of 25, so no destination is available and null is returned, 
resulting in a connection refused for the request. The next request 
lowers the current weight to 50, again connection refused. The request 
after that lowers it to 25 and finally succeeds. 
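
As a rough illustration of the arithmetic above (this is not the kernel 
code), the following C sketch counts how many requests are refused while 
a stale current weight steps down to the weight of the one remaining 
server. The function name refused_requests and its parameters are made up 
for this illustration; it assumes a single remaining server and a 
positive step.

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of ip_vs_wrr.c.
 * stale_cw: current weight left over from before the change
 * step:     amount cw is lowered per request (25 in the example)
 * weight:   weight of the only server still in the pool
 */
int refused_requests(int stale_cw, int step, int weight)
{
    int refused = 0;
    int cw = stale_cw;

    assert(step > 0);               /* a non-positive step would never end */

    for (;;) {
        cw -= step;                 /* each request lowers cw first */
        if (cw <= 0)
            cw = weight;            /* wrap to the max weight (one server) */
        if (weight >= cw)
            break;                  /* server is eligible, request served */
        refused++;                  /* nothing eligible: connection refused */
    }
    return refused;
}
```

With the numbers from the example, refused_requests(100, 25, 25) steps 
through 75 and 50 before reaching 25, so it reports two refused requests. 
Starting from a current weight of 0 it reports none.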

The same happens when deleting or editing a server (changing its weight). 
In the example, two out of four requests fail, and Mozilla gives the user 
a 'page contains no data' error, which is highly annoying. In a real 
situation it won't happen very often, but with millions of visitors (and, 
in my case, changing weights every 10-15 seconds) it will happen a couple 
of times a day, and people are going to complain.

With the patch below for ip_vs_wrr.c the current weight is reset to zero 
in case of an update. If a server is deleted or updated, the current weight 
is zero and will be set to the maximum weight (which /is/ updated at each 
update) on the first iteration by the following code:

ip_vs_wrr.c, lines 163-164:

    if (mark->cw <= 0) {
        mark->cw = mark->mw;

The impact on the round robin is that, as soon as a server is deleted or 
edited, the process of selecting a new destination starts over. That 
isn't very bad, because it happens all the time anyway when you have two 
servers whose weights aren't the same.

The attached patch will set the current weight to 0 in case of an update.
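
To make the effect of the reset concrete, here is a simplified, 
user-space model of the selection loop in C. It is only a sketch and not 
the attached patch or the real ip_vs_wrr.c: the struct wrr_mark layout, 
the array-based server list, the reset_cw flag and the function names 
wrr_update and wrr_schedule are all assumptions made for this 
illustration. Only the cw/mw handling follows the two lines quoted above. 
Weights are assumed to be non-negative, and wrr_update is assumed to be 
called whenever the list changes.

```c
#include <assert.h>

/* Simplified scheduler state (hypothetical layout). */
struct wrr_mark {
    int cl;   /* current index into the list, -1 means "at the head" */
    int cw;   /* current weight */
    int mw;   /* maximum weight of all servers */
    int di;   /* step for lowering cw: gcd of all weights */
};

static int gcd(int a, int b)
{
    while (b != 0) {
        int t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Recompute mw and di after a server was added, edited or deleted.
 * reset_cw != 0 models the patched behaviour: cw is forced to 0. */
void wrr_update(struct wrr_mark *m, const int *w, int n, int reset_cw)
{
    int i, g = 0;

    assert(n >= 0);
    m->cl = -1;
    m->mw = 0;
    for (i = 0; i < n; i++) {
        if (w[i] > m->mw)
            m->mw = w[i];
        g = gcd(g, w[i]);
    }
    m->di = g > 0 ? g : 1;
    if (reset_cw)
        m->cw = 0;
}

/* Pick a server index, or return -1 (connection refused). */
int wrr_schedule(struct wrr_mark *m, const int *w, int n)
{
    int start = m->cl;

    for (;;) {
        if (m->cl == -1) {
            if (n == 0)
                return -1;          /* no servers at all */
            m->cl = 0;
            m->cw -= m->di;
            if (m->cw <= 0) {
                m->cw = m->mw;      /* the wrap-around quoted above */
                if (m->cw == 0) {
                    m->cl = -1;
                    return -1;      /* every weight is zero */
                }
            }
        } else if (++m->cl == n) {
            m->cl = -1;             /* walked past the last server */
        }

        if (m->cl != -1 && w[m->cl] >= m->cw)
            return m->cl;           /* heavy enough for this round */
        if (m->cl == start)
            return -1;              /* full pass without a match */
    }
}
```

In this model, scheduling once over the weights {100, 25}, then calling 
wrr_update on {25} with reset_cw set to 0, makes the next two calls to 
wrr_schedule return -1 before the third one succeeds. With reset_cw set 
to 1, the first call after the update already returns the remaining 
server.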

>  I'm one of the people who think you shouldn't dynamically change the
> weights of your realservers unless you've got a real good reason. 
The main reason for this is to swap out servers that are getting 
overloaded with requests, for instance because a server administrator is 
compiling a new kernel locally ;-). Or, like we had recently, a server with 
some defective hardware that still served requests but with a very nasty 
response time. And of course to swap out servers that are totally dead.

> I know Jeremy Kerr did an honours project on this topic, so he's probably
> looked into the control and feedback theory on the matter and knows more
> about it than I do.
Any chance of Jeremy reading this and telling us whether we can find that 
project (or its conclusions) online somewhere? 

> If you're going to dynamically reweight your machines, then you should
> do it on a timescale that is long compared to the events that they are
> handling, ie if you're handling http hits, then the calculation of the
> new weight should sample the load no more than every few secs. If the
> load comes from https, then every few minutes at the most.
At the moment I'm reweighting the servers every 10 seconds; in 
production the interval will be somewhat longer, somewhere around 30 
seconds between updates.

> No-one (except perhaps Jeremy) has done a study of the benefits of
> dynamic weighting, so my statements are just theory at the moment.
I'm going to try both (dynamic and static) to see what's best in my 
situation. At least the server won't refuse my connections now and then 
when my weights are changed or when a server dies and is taken out of the 
pool by a monitor script.

>
> Joe
-kees



