Re: static load balancing patch

To: Julian Anastasov <ja@xxxxxx>
Subject: Re: static load balancing patch
Cc: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
From: "Brett E." <brettspamacct@xxxxxxxxxxxxx>
Date: Thu, 25 Mar 2004 14:40:57 -0800
(Expanding the audience)

Here is the preliminary patch so that others can comment:

http://www.parrotconsulting.com/lvs-patches/sth-patch-1


Julian Anastasov wrote:
>         At first look, this scheduler looks too complex.
> Also, I'm not sure what is the best way to propagate the
> weight changes to the scheduler's real server hash map while
> keeping all existing associations between hash key and real server.
> If it happens, changing the real server for a particular client just
> because of a weight change is not good. You can get some ideas from LBLC
> and LBLCR. The end goals should be:

I should have said that there was a discussion about this on the mailing list:

http://marc.theaimsgroup.com/?l=linux-virtual-server&m=107609187520235&w=2

Wensong didn't disagree with it; he did offer tips.

Currently, if you change the weight, all connections are potentially affected. This could be corrected in the code, but it would be a bit messy: say you decrease the weight of a real server by 2. In order to preserve existing connections, you would have to increase the weight of another real server by 2 so that the total number of buckets remains the same, but then the weighting reflected in the buckets no longer represents the weighting the user might expect. You can manually decrease the weight of one real server while increasing the weight of other real servers, preserving connections, but it isn't very foolproof.

If an RS fails, then I assumed we would delete it (ipvsadm -d -t), which would trigger building a new bucket map so that the previous buckets carry over into the new map. All previous connections should be unaffected except the connections which went to the deleted RS; those would be remapped to the available real servers.
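
To make that concrete, here is a rough sketch of the kind of bucket map I have in mind (simplified, not the actual code from the patch; names like sth_bucket, sth_build_map and sth_lookup are made up for illustration):

/*
 * Hypothetical sketch of the bucket map described above (not the
 * actual sth patch code).  Buckets are allocated in proportion to the
 * weights, so the total bucket count is the sum of the weights.
 */
struct sth_bucket {
        struct ip_vs_dest *dest;        /* real server this bucket maps to */
};

static struct sth_bucket *map;          /* map[0 .. total_buckets-1], allocation elided */
static unsigned int total_buckets;      /* sum of the real server weights */

/* Build the map: each RS gets one bucket per unit of weight. */
static void sth_build_map(struct ip_vs_dest **dests, int ndests)
{
        int d, w, i, b = 0;

        total_buckets = 0;
        for (d = 0; d < ndests; d++)
                total_buckets += atomic_read(&dests[d]->weight);

        for (d = 0; d < ndests; d++) {
                w = atomic_read(&dests[d]->weight);
                for (i = 0; i < w; i++)
                        map[b++].dest = dests[d];
        }
}

/*
 * Lookup: the same client address always lands in the same bucket.
 * Note that total_buckets is the modulus, so changing the sum of the
 * weights reshuffles (nearly) every client->RS association, which is
 * exactly the problem described above.
 */
static struct ip_vs_dest *sth_lookup(__u32 saddr)
{
        return map[ntohl(saddr) % total_buckets].dest;
}

The lookup is just an array index, which is why lookups stay cheap; the pain is all in rebuilding the map.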



> - weight change should not break existing associations (client->RS);
> the greatest common divisor can help here to build
> the hash map properly.

How would the GCD help? It seems like a weight change will break all connections no matter what, unless you offset the weight change by changing the number of buckets for other real servers. Otherwise, the GCD could let us decrease the number of buckets, if we divide the weights by the GCD to get the bucket counts, but I don't know that this would really save us enough memory.

If the weights were 10, 10 and 100, we could divide by the GCD of 10 and create 1, 1 and 10 buckets, but that doesn't save us all that much memory, so I don't know if it would be worth it.
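
Just to illustrate the reduction with those numbers (a standalone userspace sketch, nothing from the patch):

#include <stdio.h>

static unsigned int gcd(unsigned int a, unsigned int b)
{
        while (b) {
                unsigned int t = a % b;
                a = b;
                b = t;
        }
        return a;
}

int main(void)
{
        unsigned int w[3] = { 10, 10, 100 };
        unsigned int g = w[0], total = 0;
        int i;

        for (i = 1; i < 3; i++)
                g = gcd(g, w[i]);
        for (i = 0; i < 3; i++) {
                printf("weight %u -> %u bucket(s)\n", w[i], w[i] / g);
                total += w[i] / g;
        }
        printf("total: %u buckets instead of %u\n", total, 10 + 10 + 100);
        return 0;
}

12 buckets instead of 120, so the saving only matters when the weights share a large common factor.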


> - a failed RS (even one with weight 0, though maybe that case is not in the code)
> should be replaced with a mapping to other real server(s), preferably
> by keeping the relative weights

I didn't think about weight 0; I will add that. When an RS fails, it remaps to other real servers. I'm not too sure what you mean by relative weights, but the weights are preserved, just the number of buckets decreases.
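
Building on the map sketch above (again hypothetical, not the patch code), one way to get that behaviour is to touch only the buckets of the dead server and leave everything else alone:

/*
 * Hypothetical remap on RS failure/deletion, reusing map[] and
 * total_buckets from the sketch above.  Only buckets that pointed at
 * the dead server move; every other client->RS association survives.
 * Surviving servers are picked round-robin here for simplicity.
 */
static void sth_remap_dead(struct ip_vs_dest *dead,
                           struct ip_vs_dest **alive, int nalive)
{
        unsigned int i;
        int next = 0;

        for (i = 0; i < total_buckets; i++) {
                if (map[i].dest == dead) {
                        map[i].dest = alive[next];
                        next = (next + 1) % nalive;
                }
        }
}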


>         There are other questions that arise, for example:
>
> - this is not the way to implement persistence, if that is the end goal;
> read about AOL caches: the same client comes from different SADDRs (caches)
> for the different connections it makes

Yeah, I am very aware of AOL caches; the trick would be to adjust the weighting manually so you could dedicate fewer real servers to AOL IPs.

It's definitely a tradeoff, full of gotchas like this, which I should clearly document. Updating is slow and complex. Lookup should be fast and consume little memory.


> - if conns are not hashed => no ICMP notifications are forwarded.
> Horms and I implemented per-packet scheduling at different times,
> for example: http://www.ssi.bg/~ja/ops-0.9.5-4.diff
> As a result, the "hashing" is controlled per service, not per
> scheduler.
>
>         As a result, I do not know who uses the existing SH scheduler
> at all. If you prefer, you can start a discussion on the mailing
> list because the schedulers are not my best topic :)


Interesting, the code is very similar.

I thought about doing it that way, but I figured that if you use this new scheduler you will never want to use the connection hash, so you shouldn't give the user the option.
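
As a toy illustration of why (made-up names, not the IPVS API): with a purely static source hash, a connection entry adds nothing to destination selection, because the same source address always yields the same server anyway.

/*
 * Toy illustration, made-up names (this is NOT the IPVS API).  With a
 * purely static source hash, the connection table adds nothing to
 * destination selection: the same saddr always yields the same server,
 * so every packet can be scheduled statelessly.
 */
struct dest;                            /* opaque "real server" */

struct conn {
        struct dest *dest;              /* server chosen when the conn was created */
};

static struct dest *pick_dest(struct conn *cp, unsigned int saddr,
                              struct dest **map, unsigned int nbuckets)
{
        if (cp)                         /* normal LVS path: reuse the conn's server */
                return cp->dest;
        return map[saddr % nbuckets];   /* static hash path: stateless per packet */
}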


>> Instead of using atomic_read on dest->weight it just reads
>> dest->weight.counter directly, a hack for now. But it seems to work just fine
>> for me.  It would be nice to change weight so it's not an atomic type.


>         We can do it one day. The problem is that I only fix bugs;
> I do not have time for new features. Only Wensong is the maintainer, so
> you have to persuade him with good quality code :)
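
For reference, the non-hack version of that read would just use the kernel's atomic accessor (assuming weight stays an atomic_t):

        /* portable accessor, rather than poking at the internal .counter field */
        int w = atomic_read(&dest->weight);     /* instead of dest->weight.counter */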


Thanks

