lvs-users
|
To: | lvs-users@xxxxxxxxxxxxxxxxxxxxxx |
---|---|
Subject: | why does my lvs/dr director stop at 65k connections? |
From: | Matthijs van der Klip <matthijs.van.der.klip@xxxxxx> |
Date: | Sat, 10 Nov 2001 14:57:53 +0100 |
Hi,

I've been playing with LVS for a few months now and have had an LVS/NAT solution in production for a few weeks. A few days ago that LVS/NAT setup was replaced by an LVS/DR solution.

First a description of the current DR setup:

- I have four 1 GHz machines running RedHat 7.1, kernel 2.4.9-6SGI_XFS_PR3, tux-2.1.0-2, apache-1.3.19-5, ipvsadm-1.17-2 and iptables-1.2.1a-1.
- All machines have two onboard ethernet interfaces. The eth0 interfaces are public, the eth1 interfaces are private. The private interfaces have been configured with the private IP addresses 192.168.0.1, .2, .3 and .4.
- I have created a script which does one of the following, depending on the function of the machine (director or realserver):
  a) Director: set up an alias eth0:0 on the VIP and set up an LVS using ipvsadm to route among 192.168.0.1, .2, .3 and .4. The director functions as a realserver too.
  b) Realserver (if not a director): instruct iptables to locally redirect packets destined for the VIP.

A typical session:

- A packet arrives at eth0 of the director (this can be any of the four machines; at the moment it is the machine we chose it to be). LVS picks (wlc) one of the four private addresses and redirects the packet. The packet is sent out on eth1 to the private network*.
- The packet arrives at eth1 of a realserver. It is caught by iptables and locally redirected. The webserver (first tux, then apache) processes the packet. A response packet is sent out on interface eth0 towards the default gateway of our server network.

* One exception: if the packet is load balanced to the local webserver (director == realserver), it is, of course, processed internally and not sent out on the private network.

Some background facts:

- The kernel I use has not been hand configured/compiled by me. It is a stock RedHat kernel modified by SGI to include XFS. It includes IPVS by default. It has a table size of 16 bits, i.e. 65536.
- I have a custom hit tester (run from an Origin 200) which can generate between 3000 and 3500 hits/connections per second.
- When I test a single server (by throwing more than 3000 hits/second at it), its maximum number of simultaneous connections (limited by ip_conntrack; currently configured at 32768) is quickly saturated. It takes approximately 10 seconds (32768/3000) until ip_conntrack starts dropping packets.

My problem/question:

- When I test the LVS (again by throwing more than 3000 hits/second at it), it tops out at about 16384 (*4=65536) connections (inactcon) per realserver. Packets are not being dropped by ip_conntrack at the realservers, so it looks like they're being dropped at the director. My question is: why are these packets being dropped? I expected a maximum of 4*32768 = 131072 connections before packets would be dropped (by ip_conntrack again).
- I have done a second test run where I removed the director as a realserver, so I had three realservers instead of four. This time the number of connections (inactcon) topped out at about 21000 (*3 is again roughly 65536) per realserver.

What is the limiting factor in this story? I have searched the mailing list archives, and it has been explained there several times that a table size of 65536 does _not_ mean a maximum of 65536 connections. I expected to be able to saturate the webservers (due to the TCP TIME_WAIT state timeout), but I did not expect any limitations (other than RAM/CPU etc.) in the LVS itself. The reason I switched from LVS/NAT to LVS/DR was exactly that I hit this limit of 65536 simultaneous connections (which at the time I blamed on the NAT tables).

I hope I have explained the situation/problem clearly enough. This setup has to be able to handle >3000 hits/s in the near future, so I hope you will be able to help me.

Best regards,
Matthijs van der Klip
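The figures quoted in the message above can be checked with a short sketch. It only reproduces the arithmetic from the post (fill time of a single conntrack table, the ceiling the author expected, and the even split of the observed total across realservers). All function and constant names are made up for illustration, and the model assumes a steady hit rate with no entries expiring, which is a simplification.

```python
# Figures quoted in the message; the names below are illustrative only.
HIT_RATE = 3000          # hits/connections per second from the hit tester
CONNTRACK_MAX = 32768    # ip_conntrack limit configured on each realserver
TABLE_BITS = 16          # IPVS table size reported for the stock kernel
OBSERVED_TOTAL = 2 ** TABLE_BITS  # total at which the tests levelled off


def seconds_until_full(limit: int, rate: int) -> float:
    """Time for a steady connection rate to fill `limit` entries,
    assuming no entry expires in the meantime."""
    return limit / rate


def expected_ceiling(realservers: int, per_server_limit: int = CONNTRACK_MAX) -> int:
    """Ceiling the author expected: each realserver fills its own conntrack table."""
    return realservers * per_server_limit


def per_server_share(total: int, realservers: int) -> int:
    """Even split of a fixed total across the realservers (integer division)."""
    return total // realservers


if __name__ == "__main__":
    print("single server full after (s):", seconds_until_full(CONNTRACK_MAX, HIT_RATE))
    print("expected ceiling, 4 realservers:", expected_ceiling(4))
    print("observed share, 4 realservers:", per_server_share(OBSERVED_TOTAL, 4))
    print("observed share, 3 realservers:", per_server_share(OBSERVED_TOTAL, 3))
```

The per-realserver shares that come out of the split (16384 with four realservers, 21845 with three) line up with the counts reported in both test runs. This fits a single limit on the total of about 65536 rather than a per-realserver limit, but the sketch does not say where that limit comes from.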