Re: [lvs-users] IPVS sync behavior

To: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: [lvs-users] IPVS sync behavior
From: Dan Yocum <yocum@xxxxxxxx>
Date: Tue, 29 Apr 2008 11:12:25 -0500

David Black wrote:
> Yes, I have several IPVS pairs running, some on vanilla 2.6.17.7, others
> on 2.6.16-xen (Xen 3.0), 2.6.18-xen (Xen 3.1) and 2.6.9-57.EL (CentOS
> 4).  All show the same +1.0 load av behavior when
> ipvs-syncmaster/-syncbackup are running.

We're using 2.6.18-xen, and the softlockup_tick errors occur only when 
running LVS on dom0.  LVS works fine in a guest domain (domU).  Just 
be forewarned.

The stack trace always starts with the following few lines before diverging:

kernel: <IRQ> [<ffffffff80258269>] softlockup_tick+0xcc/0xde
kernel: [<ffffffff8020e84d>] timer_interrupt+0x3a3/0x401
kernel: [<ffffffff80258898>] handle_IRQ_event+0x4b/0x93
kernel: [<ffffffff8025897e>] __do_IRQ+0x9e/0x100
kernel: [<ffffffff8020cc97>] do_IRQ+0x63/0x71
kernel: [<ffffffff8034b347>] evtchn_do_upcall+0xee/0x165
kernel: [<ffffffff8020abca>] do_hypervisor_callback+0x1e/0x2c
...
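
For anyone who wants to check whether their own directors are hitting this, 
here is a minimal sketch that counts softlockup_tick frames in kernel log 
lines.  The function name and the sample input are illustrative only, and 
the regex assumes the frame format shown in the trace above.

```python
import re

# Matches the softlockup_tick frame from the trace above. The offsets after
# "+" differ between kernel builds, so they are matched loosely.
SOFTLOCKUP = re.compile(r"softlockup_tick\+0x[0-9a-f]+/0x[0-9a-f]+")


def count_softlockups(lines):
    """Return how many of the given log lines contain a softlockup_tick frame."""
    return sum(1 for line in lines if SOFTLOCKUP.search(line))


# Sample input copied from the trace above (hypothetical usage).
sample = [
    "kernel: <IRQ> [<ffffffff80258269>] softlockup_tick+0xcc/0xde",
    "kernel: [<ffffffff8020e84d>] timer_interrupt+0x3a3/0x401",
]
print(count_softlockups(sample))
```

In practice you would pass it the lines of whatever file your syslog 
writes kernel messages to; that path depends on your distribution.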


> 
> Since you mention it, I did have problems with heartbeat on Xen - no
> network lockups but just heartbeat being fussy about timing, and decided
> to try keepalived (VRRP)/IPVS, which solved at least the timing issues. 
> No kernel issues as you describe with piranha either.

Potentially on topic, we've seen problems with ntp running on the domUs, 
too.  The dom0 will have the correct time, but the domUs drift and 
won't come back.  'tis strange, and I haven't found a solution for 
this, yet.

IIRC heartbeat from linux-ha.org sends a timestamp, which can cause havoc 
if the two HA servers are out of sync, time-wise.  I haven't seen this 
issue with the heartbeat that is used in piranha's pulse - maybe it's 
not so picky wrt timestamps - it's happy as long as it has received a ping 
within the last 6 seconds.  Maybe keepalived isn't so picky, either.
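
To illustrate why a "ping within the last N seconds" check would not care 
about clock skew between the two servers, here is a rough sketch.  It is 
not taken from pulse or keepalived; the class and method names are made 
up, and the 6-second window is just the figure mentioned above.

```python
import time

# Window from the text above; an assumption, not a value read from pulse.
DEAD_AFTER = 6.0


class PeerMonitor:
    """Hypothetical liveness check driven only by the local clock."""

    def __init__(self, dead_after=DEAD_AFTER, clock=time.monotonic):
        self.dead_after = dead_after
        self.clock = clock
        self.last_seen = None  # local time of the most recent ping

    def on_ping(self, peer_timestamp=None):
        # The peer's timestamp is deliberately ignored, so it does not
        # matter whether the peer's clock agrees with ours.
        self.last_seen = self.clock()

    def is_alive(self):
        # No ping seen yet counts as not alive.
        if self.last_seen is None:
            return False
        return self.clock() - self.last_seen <= self.dead_after
```

Because only the local monotonic clock is consulted, a peer whose wall 
clock has drifted by minutes still looks alive as long as its pings keep 
arriving.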

Cheers,
Dan

> 
> Dave
> 
> Dan Yocum wrote:
>> Hi Dave,
>>
>>
>> Hopefully you don't have ipvs or lvs running on your dom0?  Before I 
>> knew any better I put the LVS directors on 2 dom0s and ended up with 
>> lots of softlockup_tick kernel "panics" which would invariably bring the 
>> network to a screeching halt on domUs for several seconds - long enough 
>> for nanny (I'm using piranha) to mark a server as offline.
>>
>> Moving the LVS directors to their own xen VM solved these kernel lockups 
>> and network problems.
>>
>> I'm wondering if your first point may have something to do with this 
>> problem.
>>
>> Cheers,
>> Dan
>>

-- 
Dan Yocum
Fermilab  630.840.6509
yocum@xxxxxxxx, http://fermigrid.fnal.gov
Fermilab.  Just zeros and ones.

