On Mon, 27 Nov 2006, Olle Östlund wrote:
Our LVS backends typically freeze after 5-7 days of normal operation.
These freezes are not system crashes, but it seems that all new
TCP connections to the servers hang forever. It is impossible
to log on or perform an su (they hang), but existing sessions
function fine as long as you don't issue 'critical commands' (commands
which open a TCP connection?).
Or commands that cause disk I/O?
such as logging in remotely, but not from the console...?
Remote logon via ssh does not work (it hangs or gets a "connection closed by
remote server" type of message). Logging in via the console does not work
either (it hangs after the password has been typed in), and using su produces
the same result. Existing logged-in sessions continue to function, however.
Commands like the various system status commands (top, netstat, vmstat, ...)
work fine. Commands which are known to log via syslog (su, ssh, ...) hang.
This sounds exactly like problems I've had when SCSI controllers shit
themselves. Ever done a dmesg when it happened and you still had a session?
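
If a session is still alive the next time it happens, something like the following might help tell the two theories apart (syslog blocking vs. disk I/O blocking). It is only a rough sketch, not something from this thread: the helper name completes_within and both probe snippets are made up for illustration. It assumes a Python interpreter is still startable from cache on the frozen box, and that the default temp directory lives on the suspect disk rather than on tmpfs.

```python
import subprocess
import sys
import time

# Hypothetical probe: send one message through the local syslog socket.
SYSLOG_PROBE = "import syslog; syslog.syslog('hang probe')"

# Hypothetical probe: write a small temp file and force it to disk.
DISK_PROBE = (
    "import os, tempfile\n"
    "fd, path = tempfile.mkstemp()\n"
    "os.write(fd, b'probe')\n"
    "os.fsync(fd)\n"
    "os.close(fd)\n"
    "os.unlink(path)\n"
)


def completes_within(code, timeout=5.0):
    """Run `code` in a child interpreter; return True if it exits in time.

    The child is polled instead of waited on, so this function returns
    even if the child is stuck in uninterruptible I/O and ignores kill().
    """
    proc = subprocess.Popen(
        [sys.executable, "-c", code],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if proc.poll() is not None:
            return True  # child exited (whatever its exit status)
        time.sleep(0.05)
    proc.kill()  # best effort only; deliberately no wait() afterwards
    return False


if __name__ == "__main__":
    print("syslog write completes:", completes_within(SYSLOG_PROBE))
    print("fsync'd disk write completes:", completes_within(DISK_PROBE))
```

If the disk probe hangs as well as the syslog probe, that would point more towards the storage side than towards syslog itself, and dmesg output from that moment would be the next thing to look at.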