Hello,
We've recently set up a couple of LVS boxes based on RedHat's "HA
Server" code (LVS kernel patches (they appear to be 0.9.14) + nanny for
server monitoring + pulse for primary/secondary LVS heartbeat) and have
noticed a couple of odd things about resource utilization. Basically,
the servers really suck down file-handles and inodes, to the point that
we've had to increase /proc/sys/fs/file-max and /proc/sys/fs/inode-max
to some incredibly high values to prevent the servers from crashing with
errors like:
VFS: file-max limit 4096 reached
kernel: grow_inodes: inode-max limit reached
kernel: socket: no more sockets
Increasing the values seems to "cure" the problem, but I'm wondering
whether this is a "known issue": does running an LVS box that also
carries a number of ipportfw configs require these values to be raised
into the hundreds-of-thousands range? I've looked through search engines
and searched the LVS archives, but only found generic references to the
"file-max limit reached" error, nothing specific to
LVS/ipchains/ipmasq/ipportfw applications.
Our configuration is currently ~50 ipportfw rules, MASQ for 4 class C
networks, and ~20 LVS configs doing "Failover" between a primary and a
backup server. (i.e. each VIP has a primary and a backup "real" server
that are weighted 100 and 1 respectively, doing WRR balancing.)
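
To make the 100/1 weighting concrete, here is a minimal sketch of weighted round-robin selection. It follows the gcd-based scheme described in the LVS scheduling documentation; whether the 0.9.14 kernel patches behave exactly this way is an assumption, and the names "primary" and "backup" are only labels for illustration.

```python
from collections import Counter
from functools import reduce
from math import gcd


def wrr_schedule(weights, picks):
    """Return a list of `picks` server names chosen by weighted round-robin.

    `weights` maps a server name to its integer weight. This is a sketch of
    the gcd-based scheme, not the actual kernel scheduler code.
    """
    names = list(weights)
    step = reduce(gcd, weights.values())   # how far the threshold drops per pass
    top = max(weights.values())            # threshold restarts from the max weight
    if top <= 0:
        return []                          # no server is eligible
    chosen = []
    i, threshold = -1, 0
    while len(chosen) < picks:
        i = (i + 1) % len(names)
        if i == 0:                         # starting a new pass over the servers
            threshold -= step
            if threshold <= 0:
                threshold = top
        if weights[names[i]] >= threshold:
            chosen.append(names[i])
    return chosen


# One full cycle for the configuration described above: 100 + 1 = 101 picks.
counts = Counter(wrr_schedule({"primary": 100, "backup": 1}, 101))
```

Under these assumptions the backup still receives 1 connection in every 101, so this setup behaves as heavily skewed balancing rather than strict failover.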
Is it crazy that we should see _used_ (not just "allocated")
file-handles on a config like this grow into the thousands and approach
ten thousand? Or does something somewhere, possibly one of the
RedHat-supplied packages, have a leak that I should try to track
down?
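
For reference, the used-versus-allocated distinction can be read from /proc/sys/fs/file-nr. The sketch below assumes the three fields are allocated handles, free (allocated but unused) handles, and the maximum, as described in proc(5); some older kernel documentation words the second field differently, so check it against the kernel in use. The function names and the sample numbers are made up for illustration.

```python
def parse_file_nr(text):
    """Parse the contents of /proc/sys/fs/file-nr into a dict.

    Assumes the field order: allocated, free, maximum.
    """
    allocated, free, maximum = (int(field) for field in text.split())
    return {
        "allocated": allocated,
        "free": free,
        "used": allocated - free,   # handles actually in use
        "max": maximum,
    }


def read_file_nr(path="/proc/sys/fs/file-nr"):
    """Read and parse the live counters (Linux only)."""
    with open(path) as handle:
        return parse_file_nr(handle.read())


# Illustrative sample line, not real measurements from the LVS boxes.
sample = parse_file_nr("9216 1024 65536\n")
```

Calling read_file_nr() periodically and logging the "used" figure would show whether usage climbs steadily (which would suggest a leak) or tracks connection load.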
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
With Microsoft products, failure is not            Derek Glidden
an option - it's a standard component.       http://3dlinux.org/
Choose your life.  Choose your             http://www.tbcpc.org/
future.  Choose Linux.               http://www.illusionary.com/