I have an LVS cluster up and running using keepalived. The cluster has two
load balancers and about 7 real servers configured for direct routing. The
cluster has been up and running for about 2 months with very little trouble.
All of a sudden, late Friday night, traffic to the cluster froze up
completely. Absolutely no traffic was being routed to the real servers. I
manually forced a failover by shutting down keepalived on the primary load
balancer and traffic resumed as expected. About 6 hours later the same
thing happened again. Again, I forced a failover back to the original load
balancer and traffic immediately resumed. Since then this has happened
about 5 more times.
What could be causing this to happen? Have any of you seen this before?
Any help would be much appreciated.
Thanks,
Micah