I don't remember exactly where, but I think there is a deadlock fix in
2.6.12 that you might want to backport to your kernel and try.
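
For checking the ssleep() theory quoted further down: as far as I know,
ssleep() sleeps uninterruptibly, and Linux counts tasks in uninterruptible
sleep (state D in ps) towards the load average. So if the sync threads sit
in D state, that alone could show up as load even with the CPU idle. Below
is a rough sketch, not a tested tool, that lists D-state tasks by reading
/proc/<pid>/stat. The function names are made up for illustration, and the
thread name you should look for (ipvs_syncmaster, per Luca's mail) is an
assumption about how your kernel names it.

```python
import os


def parse_stat(line):
    """Return (pid, comm, state) from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    pid = int(line[:open_paren])
    comm = line[open_paren + 1:close_paren]
    state = line[close_paren + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc_root="/proc"):
    """List (pid, comm) for tasks currently in state D."""
    found = []
    if not os.path.isdir(proc_root):
        return found  # no procfs here, nothing to report
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # task exited meanwhile, or stat was unreadable
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in uninterruptible_tasks():
        print(pid, comm)
```

A single run is only a snapshot, so run it a few times. If the sync threads
show up on every run while the box is otherwise idle, that would be
consistent with the ssleep() explanation, though it doesn't prove it.
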
Roger
On 9/9/05, Luca Maranzano <liuk001@xxxxxxxxx> wrote:
>
> Hi again, here is a report on some more tests.
>
> I've tried 2.6.8-smp and CPU load is 0.0, synchronization seems to
> work, but when I do a "heartbeat stop" to force the failover to backup
> node node2, node2 suddenly locked up! No messages on the console; a hard
> reset was needed.
>
> So I've disabled Hyperthreading on both nodes from the BIOS (they are
> HP DL380G4), installed 2.6.11 UP (again a Debian kernel) and the load
> is still as high as with the SMP kernel setup (2.00 or more), but
> failover and failback work fine.
>
> One note about hardware: the boxes have 2 Intel Dual-Gbit PCI NICs
> while the onboard NICs are 2 Broadcom NetXtreme BCM5704. With the 2.6.8
> kernel the driver was tg3, while with 2.6.11 the driver is bcm5700. I
> don't know if this could be an issue.
>
> I'm a bit worried about this behaviour.
>
> Is there someone with a similar setup who can report their hardware
> and software versions?
>
> TIA.
> Kind regards,
> Luca
>
> On 09/09/05, Luca Maranzano <liuk001@xxxxxxxxx> wrote:
> > Thanks to all for the replies.
> >
> > For Bruce Rosenthal: I cannot strace the ipvs_syncmaster since it is a
> > kernel thread and strace, IMVHO, doesn't work on it.
> >
> > I'm doing some tests with Debian kernel 2.6.8-2-686-smp and the
> > daemons start and CPU load is very low (near 0.0) as it should be.
> >
> > For Horms: I've also tried non-SMP kernel 2.6.11 (also Debian) and
> > the result is the same (100% CPU hog). Maybe I should try to disable
> > Hyperthreading to be sure that it is not the cause. Are you using a
> > Debian kernel 2.6.11 or some other version? (Debian kernels are
> > slightly different from stock Linus kernels.)
> >
> > More later.
> > Regards,
> > Luca
> >
> >
> >
> >
> > On 09/09/05, Horms <horms@xxxxxxxxxxxx> wrote:
> > > On Thu, Sep 08, 2005 at 07:05:23PM -0400, Roger Tsang wrote:
> > > > > It has to do with ssleep() waiting in IO. You can tell with ps long
> > > > > format output. You'd have to switch over to schedule_timeout() like
> > > > > they used to do it in kernel-2.4 ipvs. That's what I found on Fedora
> > > > > with kernel-2.6 ipvs and think you're hitting the same problem.
> > >
> > > > That sounds bad. Does anyone know if this is specific to 2.6.11,
> > > > or specific to hyperthreading? I certainly don't see it on my UP
> > > > 2.6.11 box. I would like to investigate further, but as always, time
> > > > is against me.
> > >
> > > --
> > > Horms
> > > _______________________________________________
> > > LinuxVirtualServer.org mailing list - lvs-users@xxxxxxxxxxxxxxxxxxxxxx
> > > Send requests to lvs-users-request@xxxxxxxxxxxxxxxxxxxxxx
> > > or go to http://www.in-addr.de/mailman/listinfo/lvs-users
> > >
> >