Julian Anastasov wrote:
>
> This is possible for web servers where many sockets are
> used, most of them in TIME_WAIT/FIN_WAIT state. There are
> some TCP problems in Linux 2.2.14, I don't know what is the
> kernel you use. You have to upgrade to the latest kernel and packages.
The kernel version on the LVS box is 2.2.16.
> No. LVS, MASQ and other kernel software don't use sockets and
> files at all. The sockets are used from processes. Changing the LVS
> methods, etc. can't help. There is a problem in the user space or
> kernel bug.
OK, that's good information to have. I didn't think the kernel-mode
stuff (LVS, ipchains, etc.) should be allocating handles at all, but I
wasn't sure.
> > searched the LVS archives, but only found generic references to the
> > "file-max limit reached" error, nothing specific to
> > LVS/ipchains/ipmasq/ipportfw applications.
>
> I have never changed these values for the LVS director,
> only for the web servers.
That's also good information. In that case we probably have a
descriptor leak somewhere in user space. I'll probably try some HA
tools other than the RedHat ones I'm using at the moment and see if
that clears up our problem.
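
For anyone chasing a similar leak, here is a minimal sketch of one way to see which process is holding the descriptors: count the entries under each /proc/<pid>/fd directory. It assumes a Linux-style /proc and a Python interpreter on the box. The function names and the proc_root parameter are made up for illustration, and it needs to run as root to see other users' processes.

```python
import os

def count_fds(proc_root="/proc"):
    """Map each PID to (command name, number of open descriptors)."""
    counts = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "sys" or "net"
        pid_dir = os.path.join(proc_root, entry)
        try:
            fd_count = len(os.listdir(os.path.join(pid_dir, "fd")))
        except OSError:
            continue  # process exited, or we lack permission to look
        try:
            with open(os.path.join(pid_dir, "cmdline"), "rb") as f:
                raw = f.read()
        except OSError:
            raw = b""
        # cmdline is NUL-separated; kernel threads have an empty one
        name = raw.split(b"\0")[0].decode(errors="replace") or "?"
        counts[int(entry)] = (name, fd_count)
    return counts

def top_fd_users(counts, n=10):
    """Return the n processes holding the most descriptors, largest first."""
    return sorted(counts.items(), key=lambda kv: kv[1][1], reverse=True)[:n]
```

Calling top_fd_users(count_fds()) every few minutes and comparing the results should show whether one daemon's count keeps climbing.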
> > Is it crazy that we should see _used_ (not just "allocated")
> > file-handles on a config like this grow into the thousands and nearing
> > the ten-thousands mark? Or does something somewhere, possibly one of
> > the RedHat-supplied packages, have a leak that I should try to track
> > down?
>
> When you increase these values are the limits reached again?
They were when I only increased the values to 16K, but not since we
increased them to the 256K range. It acts as if something leaks only up
to a certain point and then stops...
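
To make the "used" versus "allocated" distinction concrete, here is a rough sketch that reads /proc/sys/fs/file-nr. It assumes the three-field layout of allocated handles, free (allocated but unused) handles, and the file-max limit; the helper names are hypothetical.

```python
from collections import namedtuple

FileNr = namedtuple("FileNr", "allocated free maximum")

def parse_file_nr(text):
    """Parse the three whitespace-separated counters from file-nr."""
    allocated, free, maximum = (int(field) for field in text.split())
    return FileNr(allocated, free, maximum)

def in_use(nr):
    """Handles actually in use: allocated minus the free ones."""
    return nr.allocated - nr.free

def read_file_nr(path="/proc/sys/fs/file-nr"):
    """Read and parse the live counters (Linux only)."""
    with open(path) as f:
        return parse_file_nr(f.read())
```

Logging in_use(read_file_nr()) at intervals would show whether usage really levels off below the 256K limit or is still creeping upward.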
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Derek Glidden
http://3dlinux.org/
http://www.tbcpc.org/
http://www.illusionary.com/

With Microsoft products, failure is not an option - it's a standard
component.
Choose your life. Choose your future. Choose Linux.