I've set up a test set of servers in our lab to experiment with LVS and
find out why we keep running into file-handle issues, and I've
discovered something really interesting (at least to me). Our lab setup
has only a primary LVS server, but it still carries a lot of services
configured the same way as at the client site, and there the LVS box has
no problem at all with using up file-handles. So the problem seems to be
related to the communication between the primary and backup LVS boxes.
Since we're using the RedHat HA code, my guess is that whatever 'pulse'
does to monitor the primary/backup LVS boxes has a leak somewhere. I'll
try to dig up another box to set up as a secondary LVS server and see if
I can recreate the situation exactly in our lab.

Also, we've been closely watching our deployed LVS servers and the
behaviour is:

 - pulse/LVS/nanny start up and allocate a huge number of file-handles
   on both the primary and backup servers (~20000 in our case)
 - over the course of a day or two, the primary server slowly releases
   handles until the number drops to something reasonable (in our case,
   it went from ~20000 to 47 in about 36 hours)
 - the backup still has far too many file-handles allocated
 - if the machines switch duties, the behaviour follows the role (i.e.
   the machine that's now primary slowly releases handles, etc.)
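
For anyone who wants to watch the same thing on their own primary and
backup boxes, here is a minimal sketch of a sampler. It assumes the
counts come from /proc/sys/fs/file-nr, which may not be how the numbers
above were collected. That file holds three fields: handles allocated,
handles allocated but unused, and the system maximum. The function names
are made up for illustration.

```python
#!/usr/bin/env python3
"""Sample system-wide file-handle usage from /proc/sys/fs/file-nr (Linux only)."""
import os
import time

# Assumption: counts are read from the standard Linux procfs file.
FILE_NR = "/proc/sys/fs/file-nr"


def parse_file_nr(text):
    """Return (allocated, unused, maximum) from the contents of file-nr."""
    allocated, unused, maximum = (int(field) for field in text.split()[:3])
    return allocated, unused, maximum


def sample(path=FILE_NR):
    """Read and parse the current counters."""
    with open(path) as fh:
        return parse_file_nr(fh.read())


def format_sample(allocated, unused, maximum):
    """Render one log line; in_use is allocated minus allocated-but-unused."""
    in_use = allocated - unused
    return "allocated=%d unused=%d in_use=%d max=%d" % (
        allocated, unused, in_use, maximum)


if __name__ == "__main__" and os.path.exists(FILE_NR):
    # Print a single timestamped sample; run it from cron or a shell loop
    # on both the primary and the backup to compare them over time.
    print(time.strftime("%Y-%m-%d %H:%M:%S"), format_sample(*sample()))
```

Logging this every few minutes on both machines should show whether the
allocated figure, the unused figure, or both differ between the primary
and the backup.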
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
With Microsoft products, failure is not     Derek Glidden
an option - it's a standard component.      http://3dlinux.org/
Choose your life.  Choose your              http://www.tbcpc.org/
future.  Choose Linux.                      http://www.illusionary.com/