On Wed, 12 Jan 2000, Horms wrote:
> > Why do you think that NAT is so slow? Any benchmarks?
>
> Not yet, I am hoping to be able to get some numbers next week.
From mack@xxxxxxxxxxxx Sat Jan 22 17:31:02 2000
Date: Sat, 22 Jan 2000 17:30:02 -0500 (EST)
From: Joseph Mack <mack@xxxxxxxxxxxx>
To: mack@xxxxxxxxxxx
Subject: dr-nat.performance
Test Setup:
Hardware -
 Client:      133MHz Pentium, 64M memory, 2.0.36 libc5
 Director:    133MHz Pentium, 64M memory, 0.9.7-2.2.14 glibc_2.1.2
 Realservers: 75MHz Pentium, 64M memory, some 2.0.36 libc5,
              some 2.2.14 glibc_2.1.2
 Network:     NICs - Intel eepro100 (realservers), Netgear FA310TX
              (client, director); switch - Netgear FS308 (8-port)
Software -
 Netpipe ftp://ftp.scl.ameslab.gov/pub/netpipe/netpipe-2.3.tar.gz
Some comments:
Netpipe:
The netpipe test sends tcp packets of ever increasing size
(starting at 0 bytes of data) to the target, which the target
returns to the sender. The test stops when the return time
reaches a predetermined limit (default 1 sec). The return time
of the small packets gives the latency of the network. As the
packet size increases, the tcp/ip layer fragments the packets
at the source and defragments them at the destination. The
largest packets have sizes in the Mbyte range and test the
throughput of the system. After the peak throughput is
reached, the throughput decreases with larger packet sizes. As
it turns out, the throughput-v-returntime curve is not always
smooth, indicating software and/or hardware problems. Some of
the problems are partially understood (some patches are
available - http://www.icase.edu/~josip/). These patches were
not applied here.
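To make the measurement concrete, here is a minimal sketch of
the ping-pong logic a netpipe-style test performs. It is not
the NetPIPE source, just an illustration in Python: the port
number and the 4 Mbyte upper bound are arbitrary assumptions.
One side echoes every message straight back, the other times
the round trips; small messages give the latency, large ones
the throughput.

# Sketch of a NetPIPE-style ping-pong test (illustration only, not NPtcp).
import socket
import sys
import time

PORT = 5002                     # arbitrary test port (assumption)

def recv_exact(sock, n):
    """Read exactly n bytes from a stream socket."""
    data = b""
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise EOFError("connection closed")
        data += chunk
    return data

def echo_server():
    """Run on the target: bounce every message straight back."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("", PORT))
    srv.listen(1)
    conn, _ = srv.accept()
    while True:
        try:
            size = int(recv_exact(conn, 12).decode())  # fixed-width length header
        except EOFError:
            break
        conn.sendall(recv_exact(conn, size))

def timed_client(host):
    """Run on the sender: time round trips for ever larger messages."""
    sock = socket.create_connection((host, PORT))
    size = 1
    while size <= 4 * 1024 * 1024:      # stop at 4 Mbytes rather than on elapsed time
        payload = b"x" * size
        start = time.time()
        sock.sendall(b"%12d" % size)    # length header, then the payload
        sock.sendall(payload)
        recv_exact(sock, size)          # wait for the echo
        elapsed = time.time() - start
        # the payload crosses the wire twice per round trip
        mbps = (2 * size * 8) / (elapsed * 1e6)
        print("%8d bytes  rtt %9.3f ms  %7.1f Mbps" % (size, elapsed * 1e3, mbps))
        size *= 2

if __name__ == "__main__":
    if sys.argv[1] == "server":
        echo_server()
    else:
        timed_client(sys.argv[1])       # argument is the target hostname

Run the echo side on the target and the timing side on the
sender, giving the target's hostname as the argument. NetPIPE
itself repeats each size many times and perturbs the sizes
slightly; this sketch omits that.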
LVS is mostly run with protocols that have small request and
large reply packets (e.g. ftp, http). At first sight, then,
netpipe, which sends and receives packets of the same size,
would not seem to be a good test of LVS. However netpipe
confirms our prejudices about VS-NAT and does not load the
director when running the LVS under VS-DR, so perhaps it isn't
such a bad test after all. (Other tests are around; Julian has
sent me one - I'll try it next.)
Ethernet cards:
The EEPRO100 cards worked well in all the above machines. They
gave reasonable throughput, the loss of throughput with very
large packets was small, and the throughput-v-returntime curve
was smooth.
The FA310TX has a well-tested chip set and uses the well-tested
tulip driver. Surprisingly, this card did not work well with
the above hardware (all relatively old - chosen because it was
cheap). It was not possible to run these cards in the 75MHz
Pentium machines: communication was erratic (e.g. through a
telnet session), with some replies taking a long time - at
first I thought I had a bad cable connection. The netpipe tests
showed wildly different return times and throughput for
consecutive, nearly identical packets, and the netpipe run
would lock up before finishing. These cards were usable,
however, in the 133MHz machines, where they gave higher
throughput and lower latency than the EEPRO100 cards, although
the throughput-v-returntime curve was rough and the loss of
throughput at large packet sizes was dramatic and erratic.
It was unfortunate that the FA310TX had problems in the 75MHz
machines, as this NIC is about half the price of the EEPRO100
with almost twice the performance.
Here are the latency (msec), throughput (Mbps) figures for the
pairwise connections of cards (via a crossover cable):
               FA310TX      EEPRO100
  FA310TX      0.15, 70
  EEPRO100     0.25, 50     0.33, 45
The lesson here is to test NIC/mobo combinations, and not to
assume they are going to work just because they should.
Switch/Network:
The realservers (with EEPRO100 NICs) connected through the
switch to the director and client (both with FA310TX NICs).
The maximum throughput of a client-realserver connection would
therefore be 50Mbps. The connection through the switch had a
measured latency of 0.3msec with a throughput of 50Mbps,
indicating that the switch was not a rate-limiting step in the
connection. This 8-port switch is rated to handle 4x100Mbps
connections and so would not be running near its capacity.
Test setup:
The tests used 1 client, a director with 1 NIC and 1..6
realservers. The throughput was tested 3 ways, by connecting
from the client to the realserver(s) (the director setup for
the two LVS cases is sketched after the list):
1. directly
2. by VS-NAT
3. by VS-DR
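For reference, the director side of cases 2 and 3 amounts to
defining a virtual service and adding the realservers with
either the masquerading (VS-NAT) or the gateway/direct-routing
(VS-DR) forwarding method. The sketch below drives ipvsadm
from Python purely for illustration: the addresses, port and
weights are invented, and the options shown are the commonly
documented ipvsadm ones, which may differ slightly from the
0.9.7 release used in these tests.

# Hypothetical director setup for the VS-NAT and VS-DR tests; run as
# root on the director. Addresses, port and weights are invented, and
# the ipvsadm options should be checked against the release in use.
import subprocess

VIP = "192.168.1.110:5002"              # virtual service the client connects to
REALSERVERS = ["10.1.1.%d" % n for n in range(2, 8)]   # the 6 realservers

def ipvsadm(*args):
    cmd = ["ipvsadm"] + list(args)
    print(" ".join(cmd))                # show the command being run
    subprocess.run(cmd, check=True)

def setup(method):
    """method: '-m' = masquerading (VS-NAT), '-g' = gatewaying (VS-DR)."""
    ipvsadm("-C")                       # clear any existing virtual services
    ipvsadm("-A", "-t", VIP, "-s", "rr")            # round-robin tcp service
    for rs in REALSERVERS:
        ipvsadm("-a", "-t", VIP, "-r", rs, method, "-w", "1")

if __name__ == "__main__":
    setup("-g")                         # VS-DR run; use "-m" for the VS-NAT run

With -m the realservers route their replies back through the
director (it is their default gateway), so the director must
rewrite every packet in both directions; with -g the
realservers answer the client directly and the director only
sees the requests. This is consistent with the director
staying nearly idle under VS-DR below while saturating under
VS-NAT.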
All machines were accessed by telnet/xterm from the director,
and all ran top to monitor %CPU use and load average.
(Latencies were all about 0.3msec and are not listed.) To
connect to multiple realservers, multiple xterms from the
client were set up in a window on the director and the command
entered into each window without a carriage return. The
processes were then started by moving the mouse and hitting
<cr> in each window, starting all processes within 2-3 secs.
The throughput is the sum of all the connections. For large
numbers of simultaneous connections, the throughput in each
window would increase monotonically, and then, before it
flattened off, would start jumping about (at large packet
sizes eventually by a factor of 2). Presumably the throughput
in one window was going up while another was going down, due
to collisions somewhere. The throughput reported here is the
value just before it started behaving erratically. (At much
larger packet sizes again the throughput stabilises in each
window, with the total throughput reduced by 25% for VS-DR and
6 realservers.)
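If one wanted to automate the by-hand summation described
above, measuring several simultaneous bulk streams and summing
them might look like the sketch below. Everything in it is an
assumption for illustration: the realserver addresses, the
port, the transfer size, and the existence of a discard-style
sink listening on each realserver.

# Sketch: measure the aggregate throughput of several simultaneous bulk
# streams (one per realserver) and sum them, automating the by-hand
# summation described above. Hosts, port and transfer size are invented,
# and each realserver is assumed to run a discard-style sink on PORT.
import socket
import threading
import time

HOSTS = ["10.1.1.%d" % n for n in range(2, 8)]  # hypothetical realservers
PORT = 9                                        # assumed discard-style sink
NBYTES = 64 * 1024 * 1024                       # 64 Mbytes per stream

results = {}

def stream(host):
    sock = socket.create_connection((host, PORT))
    buf = b"x" * 65536
    sent = 0
    start = time.time()
    while sent < NBYTES:
        sock.sendall(buf)
        sent += len(buf)
    results[host] = (sent * 8) / ((time.time() - start) * 1e6)  # Mbps
    sock.close()

threads = [threading.Thread(target=stream, args=(h,)) for h in HOSTS]
for t in threads:
    t.start()           # all streams start within a fraction of a second
for t in threads:
    t.join()

for host, mbps in sorted(results.items()):
    print("%s  %7.1f Mbps" % (host, mbps))
print("total  %7.1f Mbps" % sum(results.values()))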
Results:
Throughput (Mbps)
 realservers    direct    VS-NAT    VS-DR
      1           50        60        50
      2           80        68        80
      3           95        66       100
      4          100        68       100
      5          125        75       125
      6          120        72       120
Load average:
Realservers
The load average was about 0.3, with CPU usage for NPtcp at
about 25%, in the three situations above. (The load does not
vary much with the size of the packets or the throughput;
presumably lots of small packets, which don't give much
throughput, use just as much CPU as the big packets do.)
Director
With VS-DR little work seemed to be done - the load average
stayed low (<0.5) and system CPU use was <20%.
With VS-NAT the director handled one realserver OK. With more
realservers, the load average shot up and the
monitor/kbd/mouse froze (even though packets were still being
delivered). The director could be recovered by logging into
the client and killing the netpipe jobs.
Client
Unlike the realservers, where the load average is independent
of the packet size, on the client the load average increases
throughout the netpipe run with increasing packet size. The
load on the client was a function of the throughput and was
independent of the system it was feeding packets to. The load
was highest for VS-DR and the direct connection (which were
nearly the same) and lower for VS-NAT.
For VS-NAT, the CPU usage for up to 4 realservers was only
60% (reaching 95% at 5 realservers), but the load average
never got very high (1.7 for 6 realservers at maximum
throughput).
For direct connection and VS-DR, the CPU usage was >90% when
feeding 4 or more realservers, and the load average was 4 for
6 realservers at maximum throughput.
Conclusion:
With VS-NAT, the director saturates on a 50Mbps link with 2
client processes, and throughput drops with further client
processes. VS-NAT shouldn't be used when large numbers of
realservers have to operate under high load. The real problem
is not the lowered throughput, but that you can't access your
director.
With VS-DR, the network and the client machine saturate at 4
client processes. Throughput is the same as having the client
connect directly to the realservers. There is little
detectable load on the director even when the NIC is
saturated.
Joe
--
Joseph Mack mack@xxxxxxxxxxx