Hi!
On Thu, 05 Jul 2007, Gerry Reno wrote:
> Tobias Klausmann wrote:
> > We're currently using keepalived and vanilla 2.6 kernels (which
> > already have LVS, so no patching needed). We're also looking into
> > ldirectord since keepalived has given us some trouble.
> >
> Tobias,
> Are you still having the same catatonic problem? Or is this something new?
It's similar, yet different.
First, it seems it's no longer triggered by config reloads but
"just happens". Also, it happens very infrequently, maybe once a
month, probably even less often - that is, over the five[0]
production and one test LBs, so statistically, it probably
happens once or twice a year on a single LB.
[0] We have 10+1 servers: five pairs with one production, one
standby, plus one testing server. The way we switch things, a
catatonic test server will pretty much go unnoticed.
As such, it's pretty much impossible to reproduce. The symptoms
are slightly different, too: keepalived *looks* okay, but it just
doesn't see when a server disappears. Also, it eventually starts
ignoring HUP completely. It's not completely frozen though: it
keeps doing checks.
Another odd thing I've witnessed: if you tell keepalived to bind
to an IP (for the checks) that isn't configured, it will complain
a bit but still continue trying - and leave everything in
service. I think it should either complain more loudly or take
everything out of service, as not being able to check is about the
same as everything being down.
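To make the second option concrete, here is a minimal sketch in Python of
a "fail closed" check. It is not keepalived's code or configuration; the
names tcp_check and servers_in_service are made up for illustration. The
idea is simply that a failed bind to the check source IP counts the same
as a failed check, so nothing stays in service unverified.

```python
import socket


def tcp_check(bind_ip, host, port, timeout=3.0):
    """Return True only if a TCP connect from bind_ip to host:port succeeds."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            # Raises OSError if bind_ip is not configured on this machine.
            s.bind((bind_ip, 0))
            s.connect((host, port))
        return True
    except OSError:
        # Bind failure, connection refused and timeout are all treated
        # as "check failed": being unable to check equals being down.
        return False


def servers_in_service(bind_ip, servers, check=tcp_check):
    """Keep only the (host, port) pairs whose check currently passes."""
    return [(host, port) for (host, port) in servers if check(bind_ip, host, port)]
```

With an unconfigured bind_ip, every call to tcp_check returns False, so
servers_in_service returns an empty list instead of leaving the whole
pool in service.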
Regards,
Tobias
--
In the future, everyone will be anonymous for 15 minutes.