Re: [lvs-users] LVS-DR performance not good as expected

To: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: [lvs-users] LVS-DR performance not good as expected
From: Anders Henke <anders.henke@xxxxxxxx>
Date: Thu, 21 Mar 2013 13:27:42 +0100
On 21.03.2013, Dongsheng Song wrote:
> I setup a testing LVS-DR cluster, use 1 director with 8 Windows real server.
> ALL server running as VMware ESX 5.1 guest.
> The director have 2 cores (X7550@2GHz), 2 GB memory, Debian 6.0 with
> kernel 2.6.32-5-amd64.
> The Windows server have 4 cores  (X7550@2GHz), 4GB memory, Windows
> 2008 Standard SP2.
> When I running load testing against VIP from 4 client machines, 2
> JMeter instance per client machine,
> i.e. 8 JMeter instance, I got about throughput is 8390/minute.
> Then I running one JMeter instance access one real server (of course,
> this is not cluster), the total
> 8 JMeter instances can get up to 9440/minute.
> That's said, when I use LVS-DR, I got 12.5% performance degrade.

In some areas the performance impact of virtualization is dramatically
high (especially in networking); in others (CPU-intensive tasks like
pure computation) it can be as low as 3-5%.
As a rule of thumb, most kinds of virtualization result in 10-20% of
overall performance loss, which fits your "12.5%" fairly well.
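For what it's worth, that overhead figure can be recomputed from the
throughput numbers you quoted (a back-of-the-envelope sketch; shell
integer division rounds down):

```shell
#!/bin/sh
# Throughput figures quoted in the thread (requests per minute).
direct=9440    # 8 JMeter instances hitting the realservers directly
via_lvs=8390   # the same load sent through the LVS-DR director
# Overhead relative to the balanced throughput, i.e. 9440/8390 - 1.
echo "overhead: $(( (direct - via_lvs) * 100 / via_lvs ))%"
```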

> Any suggestion ?

How many packets per second does the load balancer actually transport?
For a quick estimate, check the packet counters in the output of "ifconfig"
before and after testing. With that figure, one can roughly estimate the
overhead from network virtualization.
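A minimal sketch of that estimate, reading the RX packet counter from
/proc/net/dev (the interface defaults to "lo" purely so the snippet runs
anywhere; on the director you'd pass the VIP-facing interface, e.g.
eth0, and a window covering the whole JMeter run):

```shell
#!/bin/sh
# Sample the receive-packet counter before and after a measurement
# window and print a rough packets-per-second figure.
iface=${1:-lo}   # placeholder default; use your VIP-facing interface
window=${2:-5}   # seconds; use the full duration of a JMeter run
rx_packets() { awk -v i="$iface:" '$1 == i {print $3}' /proc/net/dev; }
before=$(rx_packets)
sleep "$window"               # generate load during this window
after=$(rx_packets)
echo "approx. $(( (after - before) / window )) packets/sec on $iface"
```

Note that "ipvsadm --rate" also reports packets per second per virtual
service, which saves the manual sampling.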

The virtualization layer adds some latency to your network. While the
absolute amount may look small, compare it to the usual latency on your
network and you may see an overall round-trip increase of 200%.

If you're putting a virtualized load balancer in front of this,
you add not only the "natural" latency but also the extra latency
of the virtualization layer - so in the end, the latency from a
hardware client via VM load balancer to VM realserver may be
3-4 times as high as from a hardware client accessing a hardware
realserver.

VMware essentially says "it only adds a few microseconds", but comparing
figures 2, 3 and 4 with each other, for ping you essentially see
around 20 microseconds "hardware to hardware", around 40 microseconds
"hardware to VM", and 30 microseconds "VM to VM on the same node".

If I take those figures, introducing a "VM to VM" load balancer in front
of a VM realserver adds at least 15 microseconds of extra network
latency, for a total of around 55 microseconds when accessing this setup
from a hardware client (vs. 20 microseconds without any virtualization
or load balancing).
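The arithmetic above, spelled out (all figures in microseconds, taken
from the rough ping numbers; they are estimates, not guarantees):

```shell
#!/bin/sh
# Per-hop latency estimates (microseconds) from the ping figures above.
hw_to_hw=20          # hardware client -> hardware server baseline
vm_termination=20    # extra cost of ending the path at a VM (40 - 20)
vm_to_vm_hop=15      # extra hop from VM balancer to VM realserver
total=$(( hw_to_hw + vm_termination + vm_to_vm_hop ))
echo "estimated round trip via VM balancer: ${total} microseconds"
```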
If your application is latency-sensitive, it's probably best to run
the balancer on dedicated hardware.

Another point is CPU overcommitment: adding more virtual cores to a
server doesn't necessarily increase performance; it may even make it
worse. You've assigned 8*4 cores to the realservers and 2 more cores
to the load balancer. The X7550 offers 8 real cores, and after setting
aside one core for the virtualization layer and management, your
hypervisor has around 7 cores to spend.

When you're accessing a single VM, the other VMs are idle: your
hypervisor can assign 4 of its 7 available cores to that single VM,
resulting in near-hardware CPU performance. If you're accessing
several VMs at the same time, the hypervisor effectively has to
schedule 34 virtual CPU cores competing for 7 real cores,
giving you much less overall performance. That's just a generic issue
of overcommitment in virtualization environments and not specific
to load balancing.
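The overcommitment ratio in this particular setup, worked out:

```shell
#!/bin/sh
# vCPU:pCPU ratio for the setup described in the thread: 8 realservers
# with 4 vCPUs each plus a 2-vCPU director, on one 8-core X7550 with
# roughly one core reserved for the hypervisor itself.
vcpus=$(( 8 * 4 + 2 ))   # 34 virtual cores
pcpus=$(( 8 - 1 ))       # about 7 usable physical cores
echo "${vcpus} vCPUs competing for ${pcpus} physical cores"
```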

My advice:
- Test the performance of accessing all VMs in parallel without the
  load balancer. If your application is not sensitive to network
  latency, the result shouldn't be far from your current numbers, and
  probably a little better (as you're removing one network hop).
- Assign fewer CPU cores to each VM to reduce the hypervisor's
  scheduling overhead, then test again. Performance should improve
  a lot.
- If network latency is an issue, put your balancer on "real" hardware.

1&1 Internet AG              Expert Systems Architect (IT Operations)
Brauerstrasse 50             v://49.721.91374.0
D-76135 Karlsruhe            f://49.721.91374.225

Amtsgericht Montabaur HRB 6484
Vorstände: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, 
Robert Hoffmann, Andreas Hofmann, Markus Huhn, Hans-Henning Kettler,
Dr. Oliver Mauss, Jan Oetjen, Martin Witt, Christian Würst
Aufsichtsratsvorsitzender: Michael Scheeren
