Re: [lvs-users] IPVS sync behavior

To: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: [lvs-users] IPVS sync behavior
From: Morgan Fainberg <morgan@xxxxxxxxxxxxxxx>
Date: Mon, 28 Apr 2008 18:08:27 -0700
The issue that you're seeing (load of 1, specifically) is due to some
oddness in the ssleep() code on some systems (I've not been able to
duplicate it on all systems with similar specs).  I typically see the
issue on Xeon-based multi-core systems.  The simplest solution is to
revert the kernel code for IP_VS_SYNC to use the older
'__set_current_state(TASK_INTERRUPTIBLE);' followed by
'schedule_timeout(HZ);' instead of 'ssleep(1)'.

Technically speaking, from what I've observed the reported load is
erroneous and doesn't cause significant (if any) degradation of
performance (again, on a multi-CPU system).  I have used a 'patched'
kernel to make the load appear more "normal"; with the IP_VS_SYNC code
as is, the load stays pegged at 1.

Essentially the patch simply replaces all instances of 'ssleep(1);' in  
net/ipv4/ipvs/ip_vs_sync.c with the following:

        __set_current_state(TASK_INTERRUPTIBLE);
        schedule_timeout(HZ);
        __set_current_state(TASK_RUNNING);


Your mileage may vary in performance.  I cannot make any claims as to
whether the use of ssleep(1); in this manner is in fact an "incorrect
usage", just that changing this out does make the load appear more
normal.
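
For anyone who wants to check this on their own director before patching, here is a minimal userspace sketch (not part of the patch above). It assumes the usual Linux behaviour that load average counts tasks in the runnable (R) and uninterruptible (D) states, and that ssleep() sleeps uninterruptibly while the TASK_INTERRUPTIBLE version shows up as S. The function names are made up for illustration; the only real interface used is the /proc/<pid>/stat file.

```c
#include <stdio.h>
#include <string.h>

/* Return the state letter from one /proc/<pid>/stat line,
 * or '?' if the line cannot be parsed. */
char stat_line_state(const char *line)
{
    /* The command name is wrapped in parentheses and may itself
     * contain spaces or ')', so search for the LAST ')' in the line. */
    const char *close = strrchr(line, ')');

    if (close == NULL || close[1] != ' ' || close[2] == '\0')
        return '?';
    return close[2];
}

/* Assumption stated above: runnable (R) and uninterruptible (D)
 * tasks count toward load average; interruptible sleep (S) does not. */
int state_counts_toward_load(char state)
{
    return state == 'R' || state == 'D';
}

/* Read /proc/<pid>/stat for a given pid and return its state letter,
 * or '?' if the file is missing or unreadable. */
char pid_state(long pid)
{
    char path[64];
    char line[512];
    char state = '?';
    FILE *fp;

    snprintf(path, sizeof path, "/proc/%ld/stat", pid);
    fp = fopen(path, "r");
    if (fp == NULL)
        return '?';
    if (fgets(line, sizeof line, fp) != NULL)
        state = stat_line_state(line);
    fclose(fp);
    return state;
}
```

Calling pid_state() with the pid of the sync daemon thread (found with ps) should return 'D' most of the time on an unpatched kernel if the explanation above holds, and 'S' once the ssleep(1) calls are replaced.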

--
Morgan Fainberg
Systems Architect
(mt) Media Temple, Inc
http://www.mediatemple.net/

On Apr 28, 2008, at 7:46 AM, David Black wrote:

> Last week I decided to move forward with testing, then implementing
> connection sync in my IPVS production setup.  The configuration is
> LVS-NAT with keepalived and two (failover pair) directors.  Kernel is
> vanilla 2.6.17.7 and keepalived 1.1.13, the latter a custom build of an
> RPMforge source RPM.  This setup has been completely stable over
> approximately the past two years.
>
> I'd like to share some observations with the list and see if anyone has
> comments:
>
> . When initially bringing up the sync daemon on the standby director, as
> expected the daemon comes up as ipvs-syncmaster.   When I next brought
> it up on the primary director using a service keepalived reload or
> restart, the standby indefinitely remained in master mode.  It appears
> to take a VRRP failover/failback event (e.g. a sleep 2 between
> keepalived stop/start on the primary) for keepalived to make
> ipvs-syncmaster on the standby switch to ipvs-syncbackup.  Is this
> normal behavior?
>
> . When ipvs-sync* is running, the box maintains a load average of 1.0.
> I read this has something to do with the kernel syncdaemon code using the
> wrong sleep function.  Anyone know in which kernel this might be fixed
> (beyond 2.6.17.7)?   I might just patch the existing kernel since it was
> built from source anyway, and has proven otherwise quite stable.  But I
> have some other IPVS boxes running 2.6.18-xen (as a dom0).
>
> . For an FTP server behind the director, IPVS data connection sync info
> appears in the standby if passive mode is used (which we do), but at
> least in initial testing I couldn't get the data connection to keep
> going through a failover event.   The control connection fails over
> normally.  Was this an anomaly in my testing, or *should* it work?  I
> wonder if it might have to do with some portion of the needed state being
> in the ipfilter connection table and hence not replicated.
>
> Cheers,
> Dave


