Re: file-max/inode-max question

To: Derek Glidden <dglidden@xxxxxxxxxxxxxxx>
Subject: Re: file-max/inode-max question
Cc: Karl <karl.mueller@xxxxxxxxxxxxxx>, LVS List <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
From: "P.Copeland" <copeland@xxxxxxxxxx>
Date: Tue, 05 Sep 2000 16:02:04 -0500
Derek Glidden wrote:

> Karl wrote:
> >
> > Derek Glidden wrote:
> > >
> > > Is it crazy that we should see _used_ (not just "allocated")
> > > file-handles on a config like this grow into the thousands and nearing
> > > the ten-thousands mark?  Or does something somewhere, possibly one of
> > > the RedHat-supplied packages, have a leak that I should try to track
> > > down?

There could be a leak.
Could you tell me which version of the piranha kit you are using
(e.g. 0.4.16-7)? You can get it by running
    rpm -q piranha

There are updates available for the Red Hat kit. You'll find them at
    ftp://people.redhat.com/kbarrett/HA
for the current stable release, and at
    ftp://people.redhat.com/kbarrett/HA/experimental
for the current 'I'm pretty damned sure it works' kit.

>
> > >
> >
> > you might want to run lsof and see what process(es) are maintaining
> > large numbers of open files.  we had a problem similar to this that was
> > caused by some errant logging in our code.
>
> Here's the problem:
>
> cat /proc/sys/fs/file-nr
> 16793   16555   262144
>
> lsof | wc -l
> 394
>

That's a lot of files.
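
To see which processes hold the handles that lsof does report, the listing
can be grouped by command and PID. This is only a rough sketch: the function
name count_open_by_process is made up for illustration, and it assumes the
usual lsof layout where the first two columns are COMMAND and PID and the
first line is a header.

```shell
#!/bin/sh
# Count lsof entries per (command, PID) pair, largest first.
# Reads lsof output on stdin; the first line is assumed to be the header.
count_open_by_process() {
    awk 'NR > 1 { n[$1 " " $2]++ }
         END { for (k in n) print n[k], k }' | sort -rn
}
```

Something like `lsof | count_open_by_process | head` would then show the
heaviest holders. Handles held inside the kernel would not appear in lsof
output at all, so a short list here does not rule out a leak.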

> So there's 16161 handles that the kernel is reporting as actively used
> that don't show up from lsof.  From what I understand, the numbers from
> the proc filesystem are "file handles allocated, file handles in use,
> and file-max" respectively.  I may be wrong about these numbers, but
> after the first crash, we increased file-max from default of 4096 to
> 16384, whereupon it ran for three whole days before crashing again with
> "file-max" and "inode-max limit reached" errors, at which point we
> increased both values to 262144 as it currently stands, so I'm betting
> that we really are using 16K filehandles as it shows.  The numbers also
> seem to steadily increase over the life of the LVS, at least up to a
> point, although I haven't been able to keep a close enough eye on the
> box over a long enough period of time to see if it's based on number of
> connections, number of ipvs/ipchains rulesets, lifetime of the box, etc.
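
One way to keep an eye on the numbers over time is to log a sample at regular
intervals and compare it later against connection counts or rule changes. The
following is a sketch only: the function names and the /tmp/file-nr.log path
are hypothetical, and the second field of file-nr is labelled neutrally
because its meaning (used vs. free) is an assumption that depends on the
kernel version.

```shell
#!/bin/sh
# Print the three fields of file-nr with neutral labels.
# Takes an optional path so it can be pointed at a saved copy.
report_file_nr() {
    src="${1:-/proc/sys/fs/file-nr}"
    read -r allocated second max < "$src"
    echo "allocated=$allocated second=$second max=$max"
}

# Append one timestamped sample to a log file (hypothetical default path).
# The lsof line count includes lsof's header line.
log_file_nr_sample() {
    log="${1:-/tmp/file-nr.log}"
    lsof_lines=$(lsof 2>/dev/null | wc -l | tr -d ' ')
    echo "$(date '+%Y-%m-%d %H:%M:%S') $(report_file_nr) lsof_lines=$lsof_lines" >> "$log"
}
```

Calling log_file_nr_sample from cron every few minutes, or from a loop with
sleep, should give enough data points to tell whether the growth tracks
traffic or simply uptime.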

I'm rebuilding the test rack I use in the lab. I'll leave something nasty
running over it for the next few days to see if I can reproduce the problem.

Phil
=--=
