First, apologies for the cross-post, but ldirectord lives halfway
between LVS and Heartbeat, and this bug is probably affecting a
number of people. Thanks to the people who helped me to track this down.
A number of people have reported that the memory footprint of ldirectord
grows with time when https services are being monitored using a
negotiate check. It turns out that this is caused by three memory leaks,
two in Net::SSLeay Perl and one in Perl itself. So while ldirectord
itself does not have a memory leak, some of the code it relies on is
letting it down.
The details of the problems are as follows:
1. Memory leak in Perl's socket code.
As per Perl ticket #16306 there is a memory leak in the
socket call in the default I/O subsystem for Perl on Unix.
http://archive.develooper.com/perl5-porters@xxxxxxxx/msg85468.html
A work-around has been provided, which switches to the alternative
perlio subsystem. This does not exhibit the bug (though it may have
other, unknown problems).
http://archive.develooper.com/perl5-porters@xxxxxxxx/msg85496.html
http://dev.perl.org/perl5/list-summaries/p5p-200208-4.pod.html
I have modified ldirectord to make use of this work-around
until a more permanent fix is supplied by the Perl developers. This
change is in the attached patch and has been committed to CVS as
version 1.60.
It amuses me somewhat that this bug, which to my mind is serious, has
been around since August, yet so far only a work-around seems to be
available.
2. Memory leak in Net::SSLeay Perl's use of load_error_strings()
This leak is caused by repeated calls to load_error_strings().
Evidently the underlying OpenSSL call allocates memory each time it
is called. I resolved this problem by making a wrapper function which
will only call the underlying function once. This should be
sufficient, as load_error_strings() should only need to be called
once, but it is more convenient to call it more freely in the Perl
module.
This is fixed in the attached patch to Net::SSLeay Perl 1.21 which I
have submitted to the author. I have also successfully patched 1.20.
Other, earlier versions will probably also patch cleanly.
3. Memory leak caused by not freeing X509 structure
This leak is caused by not freeing the X509 structure returned by
get_peer_certificate(). In particular, both do_https2() and
do_https4() discard this structure when it is returned by
do_https3(). I have resolved this problem by calling X509_free() in
each of do_https2() and do_https4().
This is also fixed in the attached patch to Net::SSLeay Perl 1.21.
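For problem 1, the perlio work-around can be sketched roughly as
follows. This is an illustration only and is not taken from the
attached ldirectord patch. It assumes the I/O layer is selected through
the PERLIO environment variable (documented in perlrun), which Perl
reads only at startup, so the script sets the variable and re-executes
itself. The sub names perlio_requested() and ensure_perlio() are
hypothetical, and the sketch assumes $0 holds a usable path to the
running script.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the perlio layer has already been
# requested through the environment.
sub perlio_requested {
    return defined $ENV{PERLIO} && $ENV{PERLIO} eq 'perlio';
}

# Hypothetical helper: PERLIO is only read when the interpreter starts,
# so set it and re-exec the running script with its original arguments.
# Assumes $0 is a path that the new interpreter can open.
sub ensure_perlio {
    return 1 if perlio_requested();
    $ENV{PERLIO} = 'perlio';
    exec($^X, $0, @ARGV) or die "re-exec failed: $!\n";
}
```

A daemon would call ensure_perlio() as early as possible, before any
sockets are opened, so that the re-exec happens before any real work.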
Each of these three memory leaks contributed to ldirectord consuming
more and more memory over time.
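The call-once wrapper described for problem 2 might look roughly like
the following. This is a self-contained sketch and not the code from
the attached Net::SSLeay patch: costly_init() is a hypothetical
stand-in for the underlying load_error_strings() call, so the example
runs without Net::SSLeay installed, and load_error_strings_once() is
likewise a made-up name.

```perl
use strict;
use warnings;

my $init_calls = 0;    # counts real initialisations, for illustration only

# Hypothetical stand-in for the underlying call, which allocates
# memory every time it is invoked.
sub costly_init {
    $init_calls++;
    return;
}

{
    my $done = 0;      # private flag, kept alive by the closure below

    # Wrapper: safe to call freely, but the underlying function
    # is only ever run once.
    sub load_error_strings_once {
        return if $done;
        costly_init();
        $done = 1;
        return;
    }
}

# Callers can invoke the wrapper as often as is convenient.
load_error_strings_once() for 1 .. 5;
print "underlying calls: $init_calls\n";
```

The flag lives in its own block so that nothing outside the wrapper can
reset it by accident.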
ldirectord is a daemon to monitor and administer real servers
in an LVS cluster of load balanced virtual servers. ldirectord is
typically used as a resource for heartbeat, but can also be run from
the command line.
Information on obtaining the latest and greatest version of ldirectord
can be found at http://www.vergenet.net/linux/ldirectord/
--
Horms
ldirectord.1.59-1.60.patch
Description: Text document
Net_SSLeay.pm-1.21.leak.patch
Description: Text document