No great knowledge to impart, but I'm curious what hardware you use.
Our directors have experienced something similar and I've blamed either
a counter problem or an ethernet driver bug. Stopping keepalived &
lvs; unloading the netfilter, ipvs, & ethernet driver modules;
reloading all modules; then finally restarting services fixes
everything.
The directors are IBM xSeries 345s with Intel copper gigabit adapters
(e1000 driver).
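
In case it helps, here is a rough sketch of that reset sequence as a small
Python script. It is not the exact procedure we run: the `service` wrapper,
the netfilter module names (`iptable_filter`, `ip_tables`), and the helper
names `reset_plan` and `run_plan` are my assumptions for illustration, and
your init scripts and module list will likely differ. Only `keepalived`,
`e1000`, and ipvs (kernel module `ip_vs`) come from our setup. It defaults to
a dry run that just prints the commands.

```python
import subprocess


def reset_plan(services, modules):
    """Build the ordered command list for a full director reset.

    Order: stop services, unload modules, reload modules in reverse
    order, then start services in reverse order.
    """
    plan = []
    for svc in services:
        plan.append(["service", svc, "stop"])      # assumed init wrapper
    for mod in modules:
        plan.append(["modprobe", "-r", mod])       # unload
    for mod in reversed(modules):
        plan.append(["modprobe", mod])             # reload
    for svc in reversed(services):
        plan.append(["service", svc, "start"])
    return plan


def run_plan(plan, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in plan:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)        # stop at first failure


if __name__ == "__main__":
    # Hypothetical module list; add an lvs init script to the services
    # list if your distribution ships one.
    run_plan(reset_plan(["keepalived"],
                        ["ip_vs", "iptable_filter", "ip_tables", "e1000"]))
```

Unloading the ethernet driver takes the interface down, so anything like this
should be run from the console rather than over ssh.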
Thanks,
-jrr
On Mon, 2004-01-12 at 18:45, Micah Abrams wrote:
> Both the secondary and the master are doing the same thing intermittently.
> The strange thing is that the servers had been running fine for about 2 months.
> The master LB has a completely different hardware configuration than
> does the backup. The only common piece of hardware is the hub to which all
> the servers connect.
>
> When the cluster does fail, only the front (incoming) side of the cluster is
> failing. What I mean is that the real servers continue to pass the
> keepalived tcp checks and all the real servers remain in the cluster
> (ipvsadm -L lists all the real servers). Unfortunately, no incoming traffic is
> routed to the real servers. If I simply ssh in and restart keepalived, the
> cluster is brought back online right away and normal traffic resumes.