
RE: [Linux-HA] UDP / DHCP / LDIRECTORD

To: 'Simon Horman' <horms@xxxxxxxxxxxx>
Subject: RE: [Linux-HA] UDP / DHCP / LDIRECTORD
Cc: "'linux-ha@xxxxxxxxxxxxxxxxxx'" <linux-ha@xxxxxxxxxxxxxxxxxx>, 'lvs-devel' <lvs-devel@xxxxxxxxxxxxxxx>, 'Julian Anastasov' <ja@xxxxxx>
From: Brian Carpio <bcarpio@xxxxxxxxxxxx>
Date: Thu, 24 Feb 2011 14:46:48 -0800
All,

So this patch has been working flawlessly for us for the last 5 months or so.

Our infrastructure is 100% virtualized. The other day loadbalancer01 had a 
memory leak and crashed. Since we use ldirectord with heartbeat, loadbalancer02 
took over; however, ever since then the single-packet UDP scheduling seems to 
have stopped working. Even if I fail back over to the loadbalancer01 VM, I 
still see all the DHCP traffic going to only one backend server.

If I run ipvsadm -L -n, I can see that ipvsadm thinks both backend servers are 
up, since the weight is set to 1 for each. If I reboot the second backend 
server (the one that is not receiving any traffic) and then run 
ipvsadm -L -n, I can see its weight go to 0, and in the ldirectord log I can 
see that it is marked dead.

I have exported one of the load balancers and one of the backend servers (using 
VMware) and imported them into another ESXi server. Once I boot up the load 
balancer there, it works perfectly... I'm very stumped as to why this would 
happen. Is there any additional logging you can think of that I might want to 
enable to see exactly where the problem is?
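
Before enabling more logging, one cheap sanity check is to confirm that every 
UDP virtual service still carries the "ops" flag after a failover or an 
ldirectord reload. The following is only a sketch: missing_ops is a 
hypothetical helper name, and it assumes the "ipvsadm -L -n" layout shown in 
this thread (protocol, address, scheduler, then flags).

```shell
#!/bin/sh
# Hypothetical helper: read "ipvsadm -L -n" output on stdin and print the
# address of every UDP virtual service whose flag list lacks "ops".
missing_ops() {
    awk '$1 == "UDP" {
        has = 0
        # Field 3 is the scheduler; anything after it is treated as a flag.
        for (i = 4; i <= NF; i++) if ($i == "ops") has = 1
        if (!has) print $2
    }'
}
```

It would be run as "ipvsadm -L -n | missing_ops"; empty output means every 
UDP service listed has the flag.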

Here are my configs:

 
/etc/ha.d/ldirectord.conf

checktimeout=10
checkinterval=2
autoreload=yes
logfile="/var/log/ldirectord.log"
quiescent=yes
virtual=10.10.10.10:67
        real=backend_server01:67 masq
        real=backend_server02:67 masq
        protocol=udp
        checktype=ping
        scheduler=rr
virtual=10.10.10.10:68
        real=backend_server01:68 masq
        real=backend_server02:68 masq
        protocol=udp
        checktype=ping
        scheduler=rr


I had to rewrite the ldirectord start script and added the following lines in 
the start and restart sections:

        ipvsadm -E -u 10.10.10.10:67 -o -s rr
        ipvsadm -E -u 10.10.10.10:68 -o -s rr
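
Instead of hard-coding one line per service, the same edit commands could be 
derived from ldirectord.conf. This is a rough sketch, not what the start 
script above actually does: ops_commands is a hypothetical helper, and it 
assumes each "virtual=" line starts in column one with its indented 
"protocol=" and "scheduler=" lines following it, as in the config above.

```shell
#!/bin/sh
# Hypothetical helper: read an ldirectord.conf-style file on stdin and print
# one "ipvsadm -E ... -o" command for each virtual service with protocol=udp.
ops_commands() {
    awk '
        function emit(   cmd) {
            if (vip != "" && proto == "udp") {
                cmd = "ipvsadm -E -u " vip " -o"
                # Only pass -s when the config names a scheduler explicitly.
                if (sched != "") cmd = cmd " -s " sched
                print cmd
            }
        }
        /^virtual=/ {
            emit()                      # finish the previous service, if any
            vip = substr($0, 9)         # text after "virtual="
            sub(/[ \t]+$/, "", vip)
            proto = ""; sched = ""
            next
        }
        /^[ \t]+protocol=/  { sub(/^[ \t]+protocol=/, "");  proto = $0; next }
        /^[ \t]+scheduler=/ { sub(/^[ \t]+scheduler=/, ""); sched = $0; next }
        END { emit() }
    '
}
```

The output can be reviewed first and then piped to sh, for example 
"ops_commands < /etc/ha.d/ldirectord.conf | sh", once ldirectord has created 
the services (-E only edits an existing service).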


Here is the output of ipvsadm -L -n when both backend servers are up (working 
environment):


IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
UDP  10.10.10.10:67 rr ops
  -> backend_server01:67            Masq    1      0          16731     
  -> backend_server02:67            Masq    1      0          17447     
UDP  192.168.181.67:68 rr ops
  -> backend_server01:68            Masq    1      0          0         
  -> backend_server02:68            Masq    1      0          0         

Here is the output of ipvsadm -L -n when both backend servers are up 
(non-working environment):

[root@lb01 log]# ipvsadm -L -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
UDP  10.10.10.10:67 rr ops
  -> backend_server01:67                 Masq    1      0          1         
  -> backend_server02:67                 Masq    1      0          0         
UDP  10.10.10.10:68 rr ops
  -> backend_server01:68                 Masq    1      0          0         
  -> backend_server02:68                 Masq    1      0          0         


The only difference I see is that in my "working" environment the InActConn 
number increases as I send load through it, whereas in my "non-working" 
environment InActConn stays at 1 the entire time. Another difference is that 
in the "working" environment I am using a DHCP load-testing tool one of my 
developers wrote, whereas in the "non-working" environment we are getting real 
DHCP traffic from another network device.
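
To compare the two environments over time, the InActConn column can be 
reduced to one line per real server and sampled repeatedly. This is a minimal 
sketch under the assumption that InActConn is the last column of each "->" 
row, as in the listings above; inactconn_table is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read "ipvsadm -L -n" output on stdin and print
# "<proto> <virtual service> <real server> <InActConn>" for each real server.
inactconn_table() {
    awk '
        # A virtual service line starts with the protocol name.
        $1 == "TCP" || $1 == "UDP" { svc = $1 " " $2; next }
        # Real server rows start with "->"; skip the column header row.
        $1 == "->" && $2 != "RemoteAddress:Port" { print svc, $2, $NF }
    '
}
```

Running "ipvsadm -L -n | inactconn_table" a few seconds apart and diffing the 
two results shows which real servers are actually receiving new packets.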





Brian Carpio 
Senior Systems Engineer

Office: +1.303.962.7242
Mobile: +1.720.319.8617
Email: bcarpio@xxxxxxxxxxxx


-----Original Message-----
From: Brian Carpio 
Sent: Thursday, April 15, 2010 1:57 PM
To: Simon Horman
Cc: linux-ha@xxxxxxxxxxxxxxxxxx; lvs-devel; Julian Anastasov
Subject: RE: [Linux-HA] UDP / DHCP / LDIRECTORD

Simon,

Thanks again for all of your hard work. I have sent over a million UDP DHCP 
packets at the new kernel/ipvsadm with the patches applied, and currently the 
only issue (which you know about already) is that ldirectord doesn't know about 
the -o option, which causes a slight issue with heartbeat (but I just put a 
cheap fix in my ldirectord start script to edit the services created by 
ldirectord).

So not only have I sent over 1,000,000 packets to this setup, but I have also 
sent them as fast as 10 packets every 3 milliseconds. I plan to do a 
week-long test, but I don't foresee any issues.

Let me know if there is any other testing you would like us to do, or if you 
would like me to send out the kernel-2.6.18-128 with the patch and the 
ipvsadm-1.24-10 rpm with the patch.

Thanks again Simon you are the man!!

Brian Carpio



-----Original Message-----
From: Simon Horman [mailto:horms@xxxxxxxxxxxx] 
Sent: Monday, April 12, 2010 8:56 PM
To: Brian Carpio
Cc: linux-ha@xxxxxxxxxxxxxxxxxx; lvs-devel; Julian Anastasov
Subject: Re: [Linux-HA] UDP / DHCP / LDIRECTORD

Hi Brian,

here are some patches to test.
I have only lightly tested them to the extent that they compile and appear to 
configure a valid service.

You can enable one packet scheduling (OPS) by passing the -o option to ipvsadm 
when creating a virtual service.

        e.g.

        # ipvsadm -A -u 172.17.60.211:80 -o
        # ipvsadm -L -n
        IP Virtual Server version 1.2.1 (size=4096)
        Prot LocalAddress:Port Scheduler Flags
          -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
        UDP  172.17.60.211:80 wlc ops

There are three patches:

ops-kernel-2.6.18-128.el5.patch: Patch against CentOS-5.3's 2.6.18-128 kernel.
ops-ipvsadm-1.24-10: Patch against CentOS-5.3's ipvsadm 1.24-10.
ops-ipvsadm-1.24: Patch against upstream ipvsadm 1.24

I have not up-ported the code to the 2.6.33 kernel and ipvsadm 1.25 yet.

