I've been using IPVS for almost two years now. I started out with 6
machines (1 director, 5 real servers) using LVS-NAT. During the first
year that I was running that email server, everything worked perfectly
with LVS-NAT. About a year ago, I decided to set up another email
server, this time with 5 machines (1 director, 4 real servers), and
decided it was time to get LVS-DR working, which I did successfully. I
then decided to switch my first email server (the one with 6 machines)
over to LVS-DR as well, since the other LVS-DR server was working great.
Both of my email servers have been working great with LVS-DR for the past
year, with one major exception (which has recently started getting worse
because of the large volume of connections coming into the servers). The
problem I am having is that my active/inactive connections are not being
reported properly. What I mean is that the counters for my active/inactive
connections just keep going up and up, and are constantly skewed. I read
through a good number of archived messages on this mailing list, and I
keep seeing everyone say "Those numbers ipvsadm is showing are just for
reference, they don't really mean anything, don't worry about them."
Well, I can tell you first hand that when you use wlc (weighted least
connections), those numbers obviously DO mean something. My machines are
no longer being balanced equally because my connection counts are off,
and this is really affecting the performance of my email servers. When
running "ipvsadm -lcn", I can see connections in the CLOSE state going
from 00:59 down to 00:01 and then magically jumping back to 00:59 for no
reason. The same holds true for ESTABLISHED connections: I see them go
from 29:59 to 00:01 and then back to 29:59, even though I know for a fact
that the connection from the client has ended.
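In case anyone wants to see this happen live, something along these lines
should show it (the client IP is just a placeholder; substitute the
address of a client you know has already disconnected):

watch -n 1 "ipvsadm -lcn | grep client.ip.here"

The expiration column for that client's entries is the value I see
jumping back up.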
I'm currently using "IP Virtual Server version 1.2.0", and I know that
there is a 1.2.1 version available, but my email servers are in a
production environment, and I really don't want to recompile a new kernel
with the latest IPVS if that isn't going to solve the problem. I'd hate
to cause other problems with my system because of a major kernel upgrade.
I can only hope that someone has some suggestions. I am a firm supporter
of IPVS, and as I said I've been using it for 2 years now; one of my
email servers handles over 30,000,000 emails a month (almost 1 million
emails a day), so we rely heavily on IPVS. There is another department in
our organization that spent thousands of dollars on FoundryNet load
balancing products, and I've been able to accomplish the same tasks (and
handle a higher load) using IPVS, so clearly IPVS is a solid product. I
just really need to figure out what is going on with these connection
count problems.
I'm not sure what information you guys need, but here's some info about my
setup. If you need any more details, feel free to ask.
6 Dell PowerEdge SC1425
Dual Xeon 3.06GHz processors
2GB DDR
160GB SATA
Running Debian Sarge
One machine is the director, and the other 5 are the real servers. All 6
machines are on the same subnet (with public IPs), and the director is
using LVS-DR for load balancing. Just to give you an idea of the kind of
connection numbers I'm getting:
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP vip.address.here:smtp wlc
-> realserver1.ip.here:smtp Route 50 648 2357
-> realserver2.ip.here:smtp Route 50 650 2231
-> realserver3.ip.here:smtp Route 50 648 2209
Whereas when using LVS-NAT (which was 100% perfect), my numbers would be
something like:
-> realserver1.ip.here:smtp Masq 50 16 56
-> realserver2.ip.here:smtp Masq 50 14 50
-> realserver3.ip.here:smtp Masq 50 15 48
I use keepalived to manage the director and to monitor the real servers.
The only "tweaking" I've done to IPVS is that I have to run this:
/sbin/ipvsadm --set 1800 0 0
before starting up keepalived, just so that active connections will stay
active for 30 minutes. In other words, we allow our users to idle their
connections for up to 30 minutes, and after that the connection should be
terminated. I put "0 0" there because, from what I've read, that tells
ipvsadm not to change the other two values (in other words, leave those
defaults as they are).
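As far as I can tell, you can double-check that the change took effect
with something like this (I believe ipvsadm can list the timeout values,
though the exact flag may vary with the version):

/sbin/ipvsadm -L --timeout

which should print the TCP, TCP FIN, and UDP timeouts in seconds. And in
case it's relevant, the virtual_server definition in my keepalived.conf
is along these lines (addresses changed and the health check simplified):

virtual_server vip.address.here 25 {
    delay_loop 10
    lb_algo wlc
    lb_kind DR
    protocol TCP

    real_server realserver1.ip.here 25 {
        weight 50
        TCP_CHECK {
            connect_port 25
            connect_timeout 5
        }
    }
    # ...one real_server block per real server
}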
That's about all I can think of. The only other weird thing I had to do
was tweak some networking settings on the real servers to work around the
pain-in-the-@$$ ARP issues that come with DR. But I doubt those changes
have anything to do with the director's load balancing problems. Those
tweaks were only done on the real servers, and all they do is keep the
real servers from advertising the MAC address for the VIP on their dummy0
interfaces.
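For anyone curious, the tweaks were along the lines of the standard 2.6
arp_ignore/arp_announce sysctls on each real server (eth0 here stands for
whatever NIC the ARP requests actually come in on):

# keep this real server from answering ARP for the VIP held on dummy0
sysctl -w net.ipv4.conf.all.arp_ignore=1
sysctl -w net.ipv4.conf.all.arp_announce=2
sysctl -w net.ipv4.conf.eth0.arp_ignore=1
sysctl -w net.ipv4.conf.eth0.arp_announce=2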
And for those interested, I switched from LVS-NAT to LVS-DR because I
really feel you can get much better network throughput with DR than with
NAT. I know I've read a bunch of messages on the mailing list saying that
NAT is just as good, but I think one major advantage of IPVS is that it
supports DR, whereas almost every other load balancing product I've seen
uses some type of NAT (in other words, all network traffic goes in and
out of the director). Having a setup like I do now, where only incoming
traffic has to go through the director, is absolutely fantastic, because
the cluster (for lack of a better word) can be easily expanded. With
LVS-NAT, when you add more real servers, all you get is more CPU power;
you don't get any more network throughput. With LVS-DR, when you add a
new real server, you expand the whole cluster, not just one part of it.
Sorry for the long email, but I really would appreciate any help that can
be provided.
Thanks!
Craig