Hello,
On Tue, 5 Sep 2000, Derek Glidden wrote:
>
> Hello,
> We've recently set up a couple of LVS boxes based on RedHat's "HA
> Server" code (LVS kernel patches (they appear to be 0.9.14) + nanny for
> server monitoring + pulse for primary/secondary LVS heartbeat) and have
> noticed a couple of odd things about resource utilization. Basically,
> the servers really suck down file-handles and inodes, to the point that
> we've had to increase /proc/sys/fs/file-max and /proc/sys/fs/inode-max
> to some incredibly high values to prevent the servers from crashing with
> errors like:
>
> VFS: file-max limit 4096 reached
> kernel: grow_inodes: inode-max limit reached
> kernel: socket: no more sockets
This is possible on web servers, where many sockets are in use,
most of them in the TIME_WAIT/FIN_WAIT states. There are some
TCP problems in Linux 2.2.14; I don't know which kernel you are
using. You should upgrade to the latest kernel and packages.
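
To see whether most of the sockets really sit in TIME_WAIT/FIN_WAIT,
something like the following rough sketch can tally them by state. It
assumes the usual /proc/net/tcp layout (one header line, state as a hex
code in the fourth column); the function name count_tcp_states is only
for illustration.

```python
from collections import Counter

# Hex state codes as they appear in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}

def count_tcp_states(proc_net_tcp_text):
    """Count sockets per TCP state in the text of /proc/net/tcp."""
    counts = Counter()
    # Skip the header line; each remaining line describes one socket.
    for line in proc_net_tcp_text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore truncated or empty lines
        counts[TCP_STATES.get(fields[3].upper(), "UNKNOWN")] += 1
    return counts

if __name__ == "__main__":
    try:
        with open("/proc/net/tcp") as f:
            text = f.read()
    except OSError:
        text = ""  # not on Linux, or /proc is not mounted
    for state, n in count_tcp_states(text).most_common():
        print(state, n)
```

A large TIME_WAIT count on the web servers would fit the explanation
above; a large count on the director itself would point at some local
process instead.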
>
> increasing the values seems to "cure" the problem, but I'm wondering if
> this is a "known issue" that running an LVS box with also a number of
> ipportfw configs requires that these values be increased into the
> hundreds-of-thousands range? I've looked through search engines and
No. LVS, MASQ and other kernel code don't use sockets or
files at all. Sockets are used by processes. Changing the LVS
methods, etc. can't help. The problem is either in user space or
a kernel bug.
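
Since the sockets and files belong to processes, one way to look for
the culprit is to count the open descriptors of each process. This is
only a sketch: it assumes /proc/<pid>/fd is available and readable
(run it as root), and the names fd_counts and top_fd_users are made up
for the example.

```python
import os

def fd_counts(proc_root="/proc"):
    """Return {pid: number of open file descriptors} for readable processes."""
    result = {}
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return result  # no /proc here
    for name in entries:
        if not name.isdigit():
            continue  # only numeric directories are processes
        try:
            fds = os.listdir(os.path.join(proc_root, name, "fd"))
        except OSError:
            continue  # process exited or permission denied
        result[int(name)] = len(fds)
    return result

def top_fd_users(counts, n=10):
    """Return the n (pid, fd_count) pairs with the most descriptors."""
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:n]

if __name__ == "__main__":
    for pid, n in top_fd_users(fd_counts()):
        print(pid, n)
```

If one process (nanny or pulse, for example) shows a count that keeps
climbing, that is where to look for a leak.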
> searched the LVS archives, but only found generic references to the
> "file-max limit reached" error, nothing specific to
> LVS/ipchains/ipmasq/ipportfw applications.
I have never changed these values for the LVS director,
only for the web servers.
>
> Our configuration is currently ~50 ipportfw rules, MASQ for 4 class C
> networks, and ~20 LVS configs doing "Failover" between a primary and a
> backup server. (i.e. each VIP has a primary and a backup "real" server
> that are weighted 100 and 1 respectively, doing WRR balancing.)
>
> Is it crazy that we should see _used_ (not just "allocated")
> file-handles on a config like this grow into the thousands and nearing
> the ten-thousands mark? Or does something somewhere, possibly one of
> the RedHat-supplied packages, have a leak that I should try to track
> down?
When you increase these values, are the limits reached again?
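
To answer that, the allocated count in /proc/sys/fs/file-nr can be
sampled over time. The sketch below assumes the file holds three
integers with the allocated handles first and the maximum last; the
meaning of the middle field differs between kernel versions, so check
Documentation/sysctl/fs.txt for your kernel. The helper names and the
"strictly increasing" test are illustrative assumptions, not a precise
leak detector.

```python
def parse_file_nr(text):
    """Parse the contents of /proc/sys/fs/file-nr into three integers."""
    allocated, middle, maximum = (int(x) for x in text.split()[:3])
    return allocated, middle, maximum

def keeps_growing(allocated_samples):
    """True if every sample is larger than the one before it."""
    if len(allocated_samples) < 2:
        return False
    pairs = zip(allocated_samples, allocated_samples[1:])
    return all(later > earlier for earlier, later in pairs)

if __name__ == "__main__":
    try:
        with open("/proc/sys/fs/file-nr") as f:
            allocated, _, maximum = parse_file_nr(f.read())
        print("allocated", allocated, "of max", maximum)
    except (OSError, ValueError):
        print("file-nr not available on this system")
```

Running this from cron every few minutes and keeping the numbers
would show whether usage levels off or grows until the new limit is
hit as well.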
>
> --
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
> With Microsoft products, failure is not Derek Glidden
> an option - it's a standard component. http://3dlinux.org/
> Choose your life. Choose your http://www.tbcpc.org/
> future. Choose Linux. http://www.illusionary.com/
Regards
--
Julian Anastasov <ja@xxxxxx>