RE: [Linux-HA] UDP / DHCP / LDIRECTORD

To: General Linux-HA mailing list <linux-ha@xxxxxxxxxxxxxxxxxx>, 'Simon Horman' <horms@xxxxxxxxxxxx>
Subject: RE: [Linux-HA] UDP / DHCP / LDIRECTORD
Cc: 'lvs-devel' <lvs-devel@xxxxxxxxxxxxxxx>, 'Julian Anastasov' <ja@xxxxxx>
From: Brian Carpio <bcarpio@xxxxxxxxxxxx>
Date: Mon, 25 Apr 2011 03:30:21 -0700
Hi,

It looks like there might also be a memory leak in this patch. Over the last 
few months we have seen memory grow slowly, but lately the traffic has 
increased and the memory utilization of the Linux box is now growing faster. I 
put in a few scripts to try to detect where the leak was coming from, and while 
watching /proc/meminfo over the last few days I saw that Slab was growing. 

So I put in a new script to watch slabtop, and I can see that ip_vs_conn is 
growing. The number of SLABS just grows and grows, and so does the CACHE SIZE. 
Is there any chance you could look into this for us? Is there any additional 
information I can give you about this problem?
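
The slab numbers described above can also be read straight from /proc/slabinfo instead of slabtop. A minimal sketch, assuming the standard slabinfo 2.x column order (name, active_objs, num_objs, objsize, ...); the function name slab_usage is made up for illustration:

```shell
#!/bin/sh
# Print "<active_objs> <num_objs> <approx_bytes>" for one slab cache.
# Reads /proc/slabinfo-format text on stdin; approx_bytes is
# num_objs * objsize and ignores per-slab overhead.
slab_usage() {
    awk -v cache="$1" '$1 == cache { print $2, $3, $3 * $4 }'
}
```

Calling `slab_usage ip_vs_conn < /proc/slabinfo` (usually as root) from a loop or cron job and logging the result with a timestamp gives a growth trend that can be compared against traffic volume.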

Thanks a lot,
Brian Carpio

-----Original Message-----
From: linux-ha-bounces@xxxxxxxxxxxxxxxxxx 
[mailto:linux-ha-bounces@xxxxxxxxxxxxxxxxxx] On Behalf Of Brian Carpio
Sent: Friday, February 25, 2011 12:14 PM
To: General Linux-HA mailing list; 'Simon Horman'
Cc: 'lvs-devel'; 'Julian Anastasov'
Subject: Re: [Linux-HA] UDP / DHCP / LDIRECTORD

Apparently this is related to some sort of race condition (possibly a problem 
with my ldirectord start script, which edits the ipvsadm config after 
ldirectord has started). If ldirectord starts to receive traffic on port 67/68 
before the following commands are run:

        ipvsadm -E -u 10.10.10.10:67 -o -s rr
        ipvsadm -E -u 10.10.10.10:68 -o -s rr

Then it will be stuck sending traffic to the first server in the list. 
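
One way to shorten that window is to have the start script wait for each virtual service to appear and re-apply it with -o immediately. This is only a sketch: it narrows the race rather than closing it, the -o option needs the patched ipvsadm from this thread, and the function name enable_ops and the retry count are assumptions made for illustration.

```shell
#!/bin/sh
# Wait until a UDP virtual service shows up in `ipvsadm -L -n`, then
# re-apply it with one-packet scheduling (-o) and the rr scheduler.
# Usage: enable_ops <vip:port> [max_tries]
# Returns non-zero if the service never appears or the edit fails.
enable_ops() {
    vip_port=$1
    tries=${2:-10}
    while [ "$tries" -gt 0 ]; do
        # Match lines such as "UDP  10.10.10.10:67 rr" exactly on field 2.
        if ipvsadm -L -n |
            awk -v s="$vip_port" '$1 == "UDP" && $2 == s { found = 1 }
                                  END { exit !found }'
        then
            ipvsadm -E -u "$vip_port" -o -s rr
            return $?
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```

In the start and restart sections this would be called once per service, for example `enable_ops 10.10.10.10:67` followed by `enable_ops 10.10.10.10:68`.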



Brian Carpio 
Senior Systems Engineer

Office: +1.303.962.7242
Mobile: +1.720.319.8617
Email: bcarpio@xxxxxxxxxxxx


-----Original Message-----
From: linux-ha-bounces@xxxxxxxxxxxxxxxxxx 
[mailto:linux-ha-bounces@xxxxxxxxxxxxxxxxxx] On Behalf Of Brian Carpio
Sent: Thursday, February 24, 2011 3:47 PM
To: 'Simon Horman'
Cc: 'lvs-devel'; 'Julian Anastasov'; 'linux-ha@xxxxxxxxxxxxxxxxxx'
Subject: Re: [Linux-HA] UDP / DHCP / LDIRECTORD

All,

So this patch has been working for us flawlessly for the last 5 months or so. 

Our infrastructure is 100% virtualized. The other day our loadbalancer01 had a 
memory leak and crashed. Since we use ldirectord with heartbeat, loadbalancer02 
took over; however, ever since then it seems like the single-packet UDP 
scheduling has stopped working. Even if I fail back over to the loadbalancer01 
VM, I still see all the DHCP traffic going to only one backend server. 

If I run ipvsadm -L -n I can see that ipvsadm thinks both of the backend 
servers are up, since the weight is set to 1 for each server. If I reboot the 
second backend server (the one which is not receiving any traffic) and then run 
ipvsadm -L -n, I can see its weight go to 0, and in the ldirectord log I can 
see that it's marked dead. 

I have exported one of the load balancers and one of the backend servers (using 
VMware) and imported them into another ESXi server. Once I boot up the load 
balancer there, it works perfectly... I'm very stumped as to why this would 
happen. Is there any additional logging you can think of that I might want to 
enable to see where the exact problem is?

Here are my configs:

 
/etc/ha.d/ldirectord.conf

checktimeout=10
checkinterval=2
autoreload=yes
logfile="/var/log/ldirectord.log"
quiescent=yes
virtual=10.10.10.10:67
        real=backend_server01:67 masq
        real=backend_server02:67 masq
        protocol=udp
        checktype=ping
        scheduler=rr
virtual=10.10.10.10:68
        real=backend_server01:68 masq
        real=backend_server02:68 masq
        protocol=udp
        checktype=ping
        scheduler=rr


I had to rewrite the ldirectord start script and added the following lines in 
the start and restart sections:

        ipvsadm -E -u 10.10.10.10:67 -o -s rr
        ipvsadm -E -u 10.10.10.10:68 -o -s rr


Here is the output of ipvsadm -L -n when both backend servers are up (working 
environment):


IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
UDP  10.10.10.10:67 rr ops
  -> backend_server01:67            Masq    1      0          16731     
  -> backend_server02:67            Masq    1      0          17447     
UDP  192.168.181.67:68 rr ops
  -> backend_server01:68            Masq    1      0          0         
  -> backend_server02:68            Masq    1      0          0         

Here is the output of ipvsadm -L -n when both backend servers are up 
(non-working environment):

[root@lb01 log]# ipvsadm -L -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
UDP  10.10.10.10:67 rr ops
  -> backend_server01:67                 Masq    1      0          1         
  -> backend_server02:67                 Masq    1      0          0         
UDP  10.10.10.10:68 rr ops
  -> backend_server01:68                 Masq    1      0          0         
  -> backend_server02:68                 Masq    1      0          0         


The only difference I see is that in my "working" environment the InActConn 
number increases as I send load through it, while in my "non-working" 
environment InActConn stays at 1 the entire time. Another difference is that in 
the "working" environment I am using a DHCP load testing tool one of my 
developers wrote, whereas in the "non-working" environment we are actually 
getting DHCP traffic from another network device. 
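
To check whether InActConn is actually moving on each real server, the ipvsadm -L -n listing can be reduced to one line per real server and sampled twice. A rough sketch, assuming the column layout shown in the listings above; inact_per_server is a hypothetical helper name:

```shell
#!/bin/sh
# Read `ipvsadm -L -n` output on stdin and print one line per real server:
#   <proto> <virtual service> <real server> <InActConn>
inact_per_server() {
    awk '
        # Virtual service line, e.g. "UDP  10.10.10.10:67 rr ops"
        $1 == "TCP" || $1 == "UDP" { vs = $1 " " $2; next }
        # Real server line; skip the "-> RemoteAddress:Port" column header
        $1 == "->" && $2 != "RemoteAddress:Port" { print vs, $2, $6 }
    '
}
```

Running `ipvsadm -L -n | inact_per_server` a few seconds apart and diffing the two outputs shows at a glance which real servers are receiving new one-packet connections and which are not.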





Brian Carpio
Senior Systems Engineer

Office: +1.303.962.7242
Mobile: +1.720.319.8617
Email: bcarpio@xxxxxxxxxxxx


-----Original Message-----
From: Brian Carpio
Sent: Thursday, April 15, 2010 1:57 PM
To: Simon Horman
Cc: linux-ha@xxxxxxxxxxxxxxxxxx; lvs-devel; Julian Anastasov
Subject: RE: [Linux-HA] UDP / DHCP / LDIRECTORD

Simon,

Thanks again for all of your hard work. I have sent over a million UDP DHCP 
packets at the new kernel/ipvsadm with the patches applied, and currently the 
only issue (which you know about already) is that ldirectord doesn't know about 
the -o option, which causes a slight issue with heartbeat (but I just put in a 
cheap fix in my ldirectord start script to edit the services created by 
ldirectord). 

So not only have I sent over 1,000,000 packets to this setup, but I have also 
sent them as fast as 10 packets every 3 milliseconds. I plan to do a long-term, 
week-long test, but I don't foresee any issues. 

Let me know if there is any other testing you would like us to do, or if you 
would like me to send out the kernel-2.6.18-128 with the patch and the 
ipvsadm-1.24-10 rpm with the patch. 

Thanks again Simon you are the man!!

Brian Carpio



-----Original Message-----
From: Simon Horman [mailto:horms@xxxxxxxxxxxx]
Sent: Monday, April 12, 2010 8:56 PM
To: Brian Carpio
Cc: linux-ha@xxxxxxxxxxxxxxxxxx; lvs-devel; Julian Anastasov
Subject: Re: [Linux-HA] UDP / DHCP / LDIRECTORD

Hi Brian,

Here are some patches to test.
I have only lightly tested them, to the extent that they compile and appear to 
configure a valid service.

You can enable one packet scheduling (OPS) by passing the -o option to ipvsadm 
when creating a virtual service.

        e.g.

        # ipvsadm -A -u 172.17.60.211:80 -o
        # ipvsadm -L -n
        IP Virtual Server version 1.2.1 (size=4096)
        Prot LocalAddress:Port Scheduler Flags
          -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
        UDP  172.17.60.211:80 wlc ops

There are three patches:

ops-kernel-2.6.18-128.el5.patch: Patch against CentOS-5.3's 2.6.18-128 kernel.
ops-ipvsadm-1.24-10: Patch against CentOS-5.3's ipvsadm 1.24-10.
ops-ipvsadm-1.24: Patch against upstream ipvsadm 1.24

I have not up-ported the code to the 2.6.33 kernel and ipvsadm 1.25 yet.


_______________________________________________
Linux-HA mailing list
Linux-HA@xxxxxxxxxxxxxxxxxx
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
