On Mon, 15 May 2000, Julian Anastasov wrote:
> I think nobody has compared LVS with other commercial
> balancers.
Fair enough. It doesn't have to be a comparison with anything. There should
be some simple numbers which say: if you use a PII-333MHz, with an ASUS-x
m/board and 512MB of RAM, you should expect blah blah blah.
> At least I don't know for such benchmarks. But the facts
> are these:
>
> - we use a large hash table for connections, configured by default to 4096
> rows, with two pointers per row
>
> - each entry allocates ~128 bytes => 512MB holds 3,000,000 - 4,000,000
> entries. Note that the TCP states have different timeouts, which can be
> tuned in the latest LVS versions.
So the 128 bytes includes all the state maintenance details, i take it.
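For what it's worth, here is that arithmetic spelled out. The struct below is
only a ~128-byte stand-in for a connection entry, not the real LVS structure,
so treat it as a back-of-the-envelope sketch:

  #include <stdio.h>

  /* Rough stand-in for one connection entry; NOT the real LVS struct,
   * just something in the ~128 byte ballpark quoted above.            */
  struct conn_entry {
          unsigned int   caddr, vaddr, daddr;   /* client/virtual/dest IPs */
          unsigned short cport, vport, dport;   /* ports                   */
          unsigned short protocol;
          unsigned int   state;                 /* TCP state               */
          unsigned long  expires;               /* per-state timeout       */
          struct conn_entry *next, *prev;       /* hash chain pointers     */
          char pad[80];                         /* flags, counters, ...    */
  };

  int main(void)
  {
          unsigned long ram        = 512UL * 1024 * 1024;  /* 512MB        */
          unsigned long entry_size = 128;                  /* ~bytes/entry */

          printf("sketch entry size: %lu bytes\n",
                 (unsigned long)sizeof(struct conn_entry));
          printf("max entries in 512MB: ~%lu\n", ram / entry_size);
          return 0;
  }

512MB / 128 bytes is ~4.2 million entries in the ideal case, which lines up
with the 3-4 million figure once the rest of the system gets its share.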
>
> If we must say what the connections/sec rate will be, let's
> estimate it as 3,000,000/60 (60=FIN_WAIT) => 50,000 requests/sec.
> So, with 512MB RAM we have nearly 50,000 req/sec. We didn't consider
> the CPU overhead, DoS, or how many seconds one request is served for.
> We just estimated the max request rate based on the RAM and
> the minimum entry life, i.e. if the connections are just created
> and terminated. So, in normal cases the rate of 50,000 reqs/sec
> must be considered an upper limit based on the 512MB of RAM.
> So, if you don't include DoS attacks in your calculations,
> 512MB is enough. But! We don't know how many packets are
> transferred per connection. So, my answer is: it depends. The
> connection rate is not enough; to answer the question "Is
> 512MB enough?" we must also ask about the connection lifetime.
>
3-4M connections is a huge number. The largest bottleneck i see is your
search algorithm, not your RAM. [Your RAM bandwidth is probably a non-issue
these days; 100Mbytes/sec of RAM throughput is typical.]
So you can't ignore your search times.
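Just to put numbers on both points (the rate bound and the search cost), under
the assumptions quoted above -- 4096 buckets, ~3M entries when the table is
full, 60s minimum entry life; simple chaining is assumed on my part:

  #include <stdio.h>

  int main(void)
  {
          double entries  = 3000000.0; /* table full, per the RAM estimate */
          double fin_wait = 60.0;      /* minimum entry lifetime, seconds  */
          double buckets  = 4096.0;    /* default hash table rows          */

          /* Upper bound on the connection rate: 3M entries, each alive
           * for at least 60s, so at most 3M/60 new connections/sec
           * before the table (i.e. the RAM) is exhausted.                */
          printf("RAM-bound rate: ~%.0f conns/sec\n", entries / fin_wait);

          /* Average chain length if the table really fills up: every
           * lookup walks about half of this on average.                  */
          printf("avg chain length: ~%.0f entries/bucket\n",
                 entries / buckets);
          return 0;
  }

That is ~50,000 conns/sec but also ~730 entries per bucket, so at that point
the list walk per packet, not the memory, is what you are paying for.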
> In my tests with a web server configured with keepalive=off,
> the ratio of established to non-established connections in the LVS
> table is 1:10 to 1:15. With the default keepalive=15 seconds it is
> ~3:5. This is for a normal web server where the established
> time is not 0 seconds. The actual number of entries in
> the LVS table is in /proc/sys/net/ipv4/ip_always_defrag.
> When checking the RAM limits this value is very useful!
> If the counter reaches 3,000,000 we are in a dangerous area.
> The memory state can also be checked with "free".
>
With keepalive, are connections aggregated (as in HTTP 1.1)?
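As a side note, a crude way to watch that counter would be something like the
sketch below. The proc path is taken verbatim from the mail above -- i have
not verified it, so adjust it to wherever your LVS version exposes the
connection count:

  #include <stdio.h>

  int main(void)
  {
          const char *path = "/proc/sys/net/ipv4/ip_always_defrag";
          unsigned long count = 0;
          FILE *f = fopen(path, "r");

          if (!f) {
                  perror(path);
                  return 1;
          }
          if (fscanf(f, "%lu", &count) != 1) {
                  fprintf(stderr, "cannot parse %s\n", path);
                  fclose(f);
                  return 1;
          }
          fclose(f);

          /* 3,000,000 is the "dangerous area" threshold from the quote */
          if (count >= 3000000UL)
                  printf("WARNING: %lu entries, near the 512MB limit\n", count);
          else
                  printf("%lu entries in the LVS table\n", count);
          return 0;
  }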
> About the CPU overhead: yep, in 2.2 we can't be faster
> on SMP. Some people try to use "top" to show how free the
> CPU is, but that is not correct. By default we don't try to reach
> the CPU limits :) We can use many directors to handle the
> traffic. There are always solutions.
There are always solutions; however, let's take just a simple case: a
single box. Let's get some numbers for that.
>
> In 2.4 we will try to solve the problem with a new locking
> strategy in the LVS code, and LVS will work better on SMP. The main
> critical regions in LVS are the access to the connection table and the
> schedulers. This way we can better support multihomed SMP directors
> and directors which are also used as real servers.
>
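I don't know what the new locking strategy actually looks like; one obvious
shape (purely my assumption, not taken from the LVS code) is one lock per hash
bucket instead of one lock for the whole connection table, so lookups that
hash to different buckets never serialize on SMP. A userspace pthread sketch
of the idea:

  #include <pthread.h>
  #include <stdio.h>

  #define TAB_BITS 12                   /* 4096 buckets, as in the quote */
  #define TAB_SIZE (1 << TAB_BITS)

  struct conn {                         /* minimal stand-in entry */
          unsigned int caddr;
          unsigned short cport;
          struct conn *next;
  };

  /* One lock per bucket: two CPUs working on different buckets never
   * contend.  The real LVS code uses kernel spinlocks and may split
   * its critical regions differently; this only shows the idea.      */
  static struct bucket {
          pthread_mutex_t lock;
          struct conn *head;
  } table[TAB_SIZE];

  static void table_init(void)
  {
          for (int i = 0; i < TAB_SIZE; i++)
                  pthread_mutex_init(&table[i].lock, NULL);
  }

  static struct conn *lookup(unsigned int hash, unsigned int caddr,
                             unsigned short cport)
  {
          struct bucket *b = &table[hash & (TAB_SIZE - 1)];
          struct conn *c;

          pthread_mutex_lock(&b->lock); /* serializes one bucket only */
          for (c = b->head; c; c = c->next)
                  if (c->caddr == caddr && c->cport == cport)
                          break;
          pthread_mutex_unlock(&b->lock);
          return c;
  }

  int main(void)
  {
          table_init();
          printf("found: %p\n", (void *)lookup(1234u, 0x0a000001u, 80));
          return 0;
  }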
2.3 networking scales linearly with the number of processors. For example,
typical routing max capacities are around 30-40K packets/sec; the same in
2.3, but if you add another processor you end up somewhere around 60-70Kpps
with 2.3 (2.2 doesn't change).
Your connection setup, because it includes a little more overhead than
typical routing, should get less performance.
> Some time ago I found in the mailing list examples of LVS
> usage on very busy sites. I can't give examples now, but everyone can try
> searching the web site and the mailing list. In fact, not everyone
> shows how much traffic their LVS-powered site handles. We can only
> estimate. Maybe there are many happy users we don't know about :)
> I have never found complaints in the mailing list
> about the LVS code (the CPU overhead) being a bottleneck.
heh. that's a bad answer; almost political ;->
> Not on 100Mbps at least :) The problem is that with small number
100Mbps is peanuts. Given the games played in comparing load balancing
performance numbers, it may not mean much.
So why don't we get Ashok to do some tests on his h/ware and settle
this? I am actually shocked that nobody has done any performance tests and
all i see is hand-waving about how nobody is complaining.
Let's get some numbers.
cheers,
jamal
PS:- Load balancing h/ware is becoming a commodity. If this piece of h/ware
can give me more performance than Linux on a PC, and say it were much
cheaper than the PC (highly possible), then the question is: why should i
use LVS?