Re: directors hanging with master daemon

To: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: directors hanging with master daemon
From: kgaliy@xxxxxxxxxxx
Date: Thu, 4 Jul 2002 09:16:17 +0400
I have the same problem: without the daemon, everything works perfectly.
After the synchronisation daemon was enabled, the master node stayed up
for only one night.

RH 7.2 glibc-2.2.4-13, gcc-2.96-98
ipvsadm v1.20 2001/11/04 (compiled with popt and IPVS v1.0.0)




Chris Beauchamp <cb@xxxxxxxxxxxxxxxx>@LinuxVirtualServer.org on 03.07.2002
15:17:13

Please respond to lvs-users@xxxxxxxxxxxxxxxxxxxxxx

Sent by:    lvs-users-admin@xxxxxxxxxxxxxxxxxxxxxx



To:    lvs-users@xxxxxxxxxxxxxxxxxxxxxx
cc:
Subject:    directors hanging with master daemon




Dear all,

On Friday, I enabled the synchronisation daemon on our two-machine
director cluster. We're using heartbeat to fail over from the active
one to the standby.

At 2am on Saturday, the primary failed. The (serial) console wasn't
responding to anything but sysrq, and then only to reboot. (The
heartbeat didn't fail over properly, but that's another story.) The
primary was restarted, but failed again at 8:30am, at which point the
secondary took over... which then failed the same way at 2pm on Sunday.
Back to the primary, which failed again at 1:30am and 2:30am on Monday.

I was away over the weekend and realised something was wrong from all
the mon alerts to my mobile phone :-( It's all fairly new, but had
happily run for most of last week, so I realised it was the daemon
stuff that I'd put in on Friday...

So today, I'm back at work, with a large number of doughnuts for the guys
who were on call over the weekend, investigating what went wrong.

Running:

Kernel 2.4.18, with LVS kernel patch 1.0.2.
Debian Woody (up to date)
ipvsadm: 1.20release6-2 (from the debian package)
Using LVS-DR to route web (and mail, irc, and https) traffic to two
realservers.

Any thoughts? The logs show nothing interesting, the failures weren't
at highly loaded times (2am sees very little traffic), and one of the
failures was only an hour after one of the previous ones.

The failures also only occurred on the master daemon - the standby had
exactly the same rules, and was receiving the sync data, but stayed up.

Thanks in advance

Chris

_______________________________________________
LinuxVirtualServer.org mailing list - lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Send requests to lvs-users-request@xxxxxxxxxxxxxxxxxxxxxx
or go to http://www.in-addr.de/mailman/listinfo/lvs-users
