I've been battling with this for the last few days. We have the
following scenario going.
We have 2 physical nodes set up in an LVS configuration, which is running
OK. Pulse on the backend is piped through an SSH-based connection.
/etc/lvs.cf is correctly configured and the 2 nodes talk just fine. BUT,
every so often we lose the DNS running on the machine, along with routing
to the outside world. I have the DNS records set up to point at the LVS
node (the "fake" machine process) as the DNS server, while the DNS service
is really running on both of the actual physical nodes. (This should in
theory roll the DNS over to either machine if/when a node goes down.)
On top of that, for some reason SSH keeps losing the ability to talk
across the pulse (heartbeat?) link when a node goes down. More accurately,
it fails to reestablish communication when the downed node comes back
online. (It wants to reverify the password, from all I can tell.)
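To narrow down the SSH part, a check along these lines might show whether
one node can log in to the other without being asked for a password. This
is only a sketch and assumes an OpenSSH client on the path; the function
names and the host name in the usage note are placeholders, not anything
from our actual setup. BatchMode=yes makes ssh fail outright instead of
prompting, so a password request shows up as a non-zero exit.

```python
import subprocess


def build_ssh_check(host, timeout=5):
    """Build an ssh command that fails instead of prompting for a password.

    Assumes the OpenSSH client; `host` is whatever name the peer node has.
    """
    return [
        "ssh",
        "-o", "BatchMode=yes",              # never prompt interactively
        "-o", f"ConnectTimeout={timeout}",  # give up quickly if the peer is down
        host,
        "true",                             # trivial remote command
    ]


def can_login_without_prompt(host, timeout=5):
    """Return True if ssh to `host` succeeds with no interactive authentication."""
    try:
        result = subprocess.run(
            build_ssh_check(host, timeout),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout + 5,  # hard stop in case ssh itself hangs
        )
    except (OSError, subprocess.TimeoutExpired):
        # ssh is not installed, or the attempt hung past the hard stop
        return False
    return result.returncode == 0
```

Calling can_login_without_prompt("node2.example.com") (a made-up name) from
each node right after the other one comes back up would show whether the
link can be reestablished unattended or is stuck waiting on a password.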
Lastly, when it DOES work, for some reason it intermittently loses its
network connection. (I think this is occurring every few hundred
heartbeats.) I'm not really sure what's going on here. In case you haven't
guessed, this is my first time setting up an LVS in a commercial
environment, and I realize I seem to be missing a few clues. Anything that
will help me keep this stabilized in a production environment would be
immensely helpful.
David D.W. Downey RHCE, Internet Security Specialist
Sr. Linux System Administrator QIXO, Inc. San Jose, CA