[lvs-users] LVS RR becomes unbalanced after time

To: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Subject: [lvs-users] LVS RR becomes unbalanced after time
Cc: Luc van Donkersgoed <luc@xxxxxxxxxxxxxxxxxx>
From: Luc van Donkersgoed <luc@xxxxxxxxxxxxxxxxxx>
Date: Mon, 4 Jul 2011 15:25:19 +0200
Hi all,

I'm running a small apache cluster (2 loadbalancers in active-passive setup, 2 
realservers serving HTTP and HTTPS). 

All machines are Dell PowerEdge (R200 and R410 series), not older than 2 years, 
running Ubuntu 11.04 with all packages updated. 

My loadbalancers are configured with:

Heartbeat 3.0.4
Linux Director v1.186-ha
ipvsadm v1.25 2008/5/15 (compiled with popt and IPVS v1.2.1)

====  ldirectord.cf ====

checktimeout=10
checkinterval=2
autoreload=no
logfile="local0"
quiescent=no

virtual=x.y.z.141:80
        real=x.y.z.135:80 gate 50
        real=x.y.z.136:80 gate 50
        fallback=127.0.0.1:80 gate
        service=http
        request="ldirector.html"
        receive="Test Page"
        scheduler=wrr
        protocol=tcp
        checktype=negotiate

virtual=x.y.z.141:443
        real=x.y.z.135:443 gate 50
        real=x.y.z.136:443 gate 50
        fallback=127.0.0.1:80 gate
        service=https
        request="ldirector.html"
        receive="Test Page"
        scheduler=wrr
        protocol=tcp
        checktype=negotiate

====  ipvsadm -ln ====

IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  x.y.z.141:80 wrr
  -> x.y.z.135:80             Route   50     110        51        
  -> x.y.z.136:80             Route   50     103        59        
TCP  x.y.z.141:443 wrr
  -> x.y.z.135:443            Route   50     12         14        
  -> x.y.z.136:443            Route   50     12         6  

==== the problem ====

When I (re)start my loadbalancer, the load is evenly balanced over my two 
realservers, and the output of ipvsadm -ln at that moment is comparable to the 
output above. This is all as I would expect. Then, after somewhere between 30 
minutes and 2 hours, the output of ipvsadm -ln changes to something like this, 
while the load on the webservers does not change significantly:

IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  x.y.z.141:80 wrr
  -> x.y.z.135:80             Route   50     0        1        
  -> x.y.z.136:80             Route   50     1        1        
TCP  x.y.z.141:443 wrr
  -> x.y.z.135:443            Route   50     2         1        
  -> x.y.z.136:443            Route   50     1         0  

Around the same time, the loadbalancer becomes unbalanced, sending all 
requests to one server. This is not always the same server; it seems to be 
random. That server then becomes heavily loaded, while the other server is 
idling. After a while (perhaps 30 minutes) the loadbalancer starts to send 
packets to the unused realserver again. A little while after that, the balance 
tips again, often preferring the other server this time. 

The problem is always solved by restarting heartbeat, at which time the other 
loadbalancer takes over and starts to distribute the load evenly again. Then, 
after a while, this server starts to display the same issue.
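One thing I can check is whether the realservers' own view matches the 
director's: counting ESTABLISHED connections to :80/:443 on each realserver 
should roughly track the ActiveConn column above. In practice that would be 
`netstat -tn` piped into an awk filter; the sketch below runs the same filter 
on sample netstat output for illustration:

```shell
#!/bin/sh
# Run on a realserver: count ESTABLISHED connections to the web
# ports, to compare against the director's ActiveConn column.
# Real usage:  netstat -tn | awk '...'
# Sample netstat -tn output used here for illustration:
sample='tcp        0      0 x.y.z.135:80    a.b.c.d:51234   ESTABLISHED
tcp        0      0 x.y.z.135:443   a.b.c.d:51240   ESTABLISHED
tcp        0      0 x.y.z.135:22    a.b.c.d:51300   ESTABLISHED'

# $4 is the local address, $6 the TCP state; keep only web ports.
printf '%s\n' "$sample" |
    awk '$6 == "ESTABLISHED" && $4 ~ /:(80|443)$/ { n++ } END { print n+0 }'
```

On the sample above this counts 2 (the :22 session is excluded); a large gap 
between this number and ipvsadm's ActiveConn would point at stale entries in 
the director's connection table.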

A clue to my problem might be found in the output of ipvsadm -ln --stats, 
which displays the following:

IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port               Conns   InPkts  OutPkts  InBytes OutBytes
  -> RemoteAddress:Port
TCP  x.y.z.141:80                20899   198954        0 21196756        0
  -> x.y.z.135:80                10449   101877        0 10743318        0
  -> x.y.z.136:80                10450    97077        0 10453438        0
TCP  x.y.z.141:443                2852    46171        0  8971996        0
  -> x.y.z.135:443                1426    23411        0  4485847        0
  -> x.y.z.136:443                1426    22760        0  4486149        0

This would suggest that the number of connections is still evenly distributed 
over the two realservers, even though the load on the realservers themselves 
doesn't reflect that. 
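To catch the moment the counters start diverging, the ActiveConn column can 
be extracted from periodic ipvsadm -ln snapshots and diffed over time. The 
real pipeline would be `ipvsadm -ln | awk '$1 == "->" && $2 != 
"RemoteAddress:Port" { print $2, $5 }'`; the sketch below applies the same 
extraction to the sample output above:

```shell
#!/bin/sh
# Pull per-realserver ActiveConn out of `ipvsadm -ln` output so
# successive snapshots (e.g. one per minute from cron) can be diffed.
# Sample `ipvsadm -ln` lines used here for illustration:
snapshot='  -> x.y.z.135:80             Route   50     110        51
  -> x.y.z.136:80             Route   50     103        59'

# $2 is the realserver address:port, $5 the ActiveConn counter.
printf '%s\n' "$snapshot" | awk '$1 == "->" { print $2, $5 }'
```

This prints one "address:port ActiveConn" pair per realserver, which is easy 
to graph or grep through after the imbalance has occurred.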

Can anyone help me locate the reason for the round robin scheduler not 
distributing my requests evenly?

Thanks in advance,
Luc van Donkersgoed
_______________________________________________
Please read the documentation before posting - it's available at:
http://www.linuxvirtualserver.org/

LinuxVirtualServer.org mailing list - lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Send requests to lvs-users-request@xxxxxxxxxxxxxxxxxxxxxx
or go to http://lists.graemef.net/mailman/listinfo/lvs-users
