Simon Horman wrote:
> On Mon, May 04, 2009 at 10:31:59AM +0200, Christian Frost wrote:
>
>> Hi,
>> We have a setup including two real servers each of which runs an
>> instance of MySql with the max_connections option set to 1000. In this
>> setup we have run some performance tests with mysqlslap two determine
>> the throughput of the setup. These tests involve simulating many
>> simultaneous users querying the database. Under these conditions we have
>> encountered some problems with the load balancer. Specifically, using
>> ipvsadm -L -n to monitor the connections during the performance test
>> there are intitially many connections represented as inactive. After a
>> few seconds the inactive connections are represented as active in the
>> respective real server. This causes a problem when the Least-Connection
>> Scheduling algorithm is used because the connections are not equally
>> between the two real hosts. The two real hosts are almost equal in terms
>> of processing capacities.
>>
>> The following output of ipvsadm -L -n probably explains the problem
>> better.
>>
>> ipvsadm -L -n a few seconds into the test simulating 200 MySQL
>> clients connecting simultaneously:
>>
>> IP Virtual Server version 1.2.1 (size=4096)
>> Prot LocalAddress:Port Scheduler Flags
>> -> RemoteAddress:Port Forward Weight ActiveConn InActConn
>> TCP 10.0.1.5:3306 lc
>> -> 10.0.1.2:3306 Route 1 71 0
>> -> 10.0.1.4:3306 Route 1 70 60
>>
>>
>> ipvsadm -L -n 30 seconds into the test simulating 200 MySQL clients
>> connecting simultaneously. Note that the load balancer uses the
>> Least-Connection scheduling algorithm.
>>
>> IP Virtual Server version 1.2.1 (size=4096)
>> Prot LocalAddress:Port Scheduler Flags
>> -> RemoteAddress:Port Forward Weight ActiveConn InActConn
>> TCP 10.0.1.5:3306 lc
>> -> 10.0.1.2:3306 Route 1 71 0
>> -> 10.0.1.4:3306 Route 1 130 0
>>
>>
>> The problem does not occur if the connections are made sequentially
>> and the total number of connections stays below about 100.
>>
>> Is there anything we can do to avoid these problems?
>>
>
> Hi Christian,
>
> I'm taking a bit of a stab in the dark, but I think that the problem
> you are seeing is the lc (and wlc) algorithms' interaction with bursts
> of connections.
>
> I think that the core of the problem is the way that lc calculates the
> overhead of a server. This is relevant because an incoming connection
> is allocated to whichever real server is deemed to have the lowest
> overhead at that time.
>
> In net/netfilter/ipvs/ip_vs_lc.c:ip_vs_lc_dest_overhead()
> overhead is calculated as:
>
> active_connections * 256 + inactive_connections
>
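> For reference, the function looks roughly like this (quoting from
> memory of the source, so treat it as a sketch rather than gospel):
>
>     static inline unsigned int
>     ip_vs_lc_dest_overhead(struct ip_vs_dest *dest)
>     {
>             /* Active connections are assumed to cost roughly 256
>              * times as much as inactive ones, hence the shift by
>              * 8 (x << 8 == x * 256). */
>             return (atomic_read(&dest->activeconns) << 8) +
>                     atomic_read(&dest->inactconns);
>     }
>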
> So suppose that things are in a more or less balanced state,
> real-server A has 71 connections and real-server B has 70.
>
> Then a big burst of 60 new connections comes in. The first of these
> new connections will go to real-server B, as expected. This connection
> will be in the inactive state until the 3-way handshake is complete.
> So far so good.
>
> Unfortunately, if the remaining 59 new connections come in before any
> of them complete the handshake and move into the active state, they
> will all be allocated to real-server B, because:
>
> 71 * 256 + 0 > 70 * 256 + n
> where n, the number of inactive connections on B, is less than 256.
>
> Assuming that I am correct, I can think of two methods of addressing
> this problem:
>
> 1) Simply change 256 to a smaller value. In this case 256 basically
> ends up being the granularity of balancing for bursts of connections.
> And in the case at hand, clearly 256 is too coarse. Perhaps 8, 2 or
> even 1 would be a better value.
>
> This should be a trivial change to the code, and if lc is a module
> you wouldn't even need to recompile the entire kernel - though you
> would need to track down the original kernel source and config.
>
> The main drawback of this is that if you have a lot of old, actually
> dead, connections in the inactive state, then it might cause imbalance.
>
> If that does help, it might be worth making this parameter
> configurable at run time, at least globally (see the sketch below).
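>
> As a concrete illustration, dropping the multiplier to 1 would look
> something like this (an untested sketch, not a tested patch):
>
>     static inline unsigned int
>     ip_vs_lc_dest_overhead(struct ip_vs_dest *dest)
>     {
>             /* Count active and inactive connections equally,
>              * i.e. a multiplier of 1 instead of 256. */
>             return atomic_read(&dest->activeconns) +
>                     atomic_read(&dest->inactconns);
>     }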
>
> 2) A more complex though arguably better approach would be to
> implement some kind of slow-start feature, that is, to assign some
> kind of weight to new connections. I had a stab at this in the past -
> it should be in the archives - though I think my solution only
> addressed the problem for active connections. But the idea seems
> reasonable to extend to this problem; a rough sketch follows.
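>
> For example (pure pseudocode - ip_vs_dest has no pendingconns
> counter today, it would be a new field counting connections that
> have been scheduled but have not yet completed the handshake):
>
>     static inline unsigned int
>     ip_vs_lc_dest_overhead_slowstart(struct ip_vs_dest *dest)
>     {
>             /* pendingconns is hypothetical: connections scheduled
>              * but not yet through the 3-way handshake. They sit in
>              * inactconns today, so charge them at the active rate
>              * instead, making a burst visible to the scheduler
>              * immediately. */
>             unsigned int pending = atomic_read(&dest->pendingconns);
>
>             return (atomic_read(&dest->activeconns) << 8) +
>                     (pending << 8) +
>                     (atomic_read(&dest->inactconns) - pending);
>     }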
>
>
Hi,
We tried method 1, which turned out to balance the connections
perfectly. We changed the multiplier from 256 to 1.
Thank you.
/Christian