On Fri, Nov 10, 2006 at 07:30:20PM +0100, Roberto Nibali wrote:
> >Just migrated another customer environment from some older
> >Ldirectord 1.77.2.6 to a more recent version 1.77.2.45.
There have been a number of updates since 1.77.2.45.
Could you please try 1.186, which is available at
http://www.vergenet.net/linux/ldirectord/download/ldirectord.1.186
I have also put an updated explanation
of how to get the latest ldirectord (which is 1.186 today),
at http://www.vergenet.net/linux/ldirectord/download.shtml
>
> I've looked at the code for 5 minutes, simply to see what we're talking about
> ... crazy coding :).
>
> >I have a problem with the more recent Ldirectord 1.77.2.45. It crashes
> >unexpectedly without any message in the log. Sometimes this happens after a
> >few hours and sometimes a few weeks (2-3) after it has been started. Up to
> >now, I could not figure out the problem.
>
> What kind of crash is it?
>
> >But I assume it must have something to do with the high real server
> >fluctuation within the environment. That means the real servers are taken
> >out of the load balancing because of hanging processes on the real servers
> >very often. After restart of these processes on the real servers,
> >ldirectord cleanly puts them back into the virtual server. What I can
> >see is that ldirectord seems to crash when there is some kind of peak
> >in the fluctuation.
> >History:
> > To check/monitor HTTPS connections, version 1.77.2.6 works by forking a
> > child before checking the real servers via HTTPS. If I remember correctly,
> > this was implemented to prevent a memory leak in the SSL library.
> > Some time ago I proposed a patch [1] to replace this mechanism by plain
> > LWP for HTTP and HTTPS connections. The patch was implemented, as far as
> > I know.
> > [1] http://lists.linux-ha.org/pipermail/linux-ha/2005-July/015176.html
> >The problem might have something to do with the use of LWP for HTTPS
> >instead of the old behaviour by forking a child, as well.
>
> I cannot provide you with an answer because I don't know enough about
> ldirectord. What I'm wondering is why it was changed when the old code
> worked (don't tell me because forking was too heavy)? Also, why wasn't
> something like libcurl used for this? I'm not a seasoned perl-monger, so
> this could be an extremely stupid question.
The code was changed because the SSL library that is used seems to have
a memory leak in it somewhere. I have never been able to find it. But by
having a short-lived child process, the effects of the leak are negated.
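
To illustrate the short-lived-child idea, here is a minimal sketch in Python. It is not ldirectord's actual Perl code. The name check_in_child and the check callable are hypothetical, and the real HTTPS request is left to whatever callable is passed in. The child runs the check and reports the result through its exit status, so any memory leaked during the check is returned to the system when the child exits.

```python
import os


def check_in_child(check):
    """Run check() in a short-lived child process (hypothetical helper).

    Returns True only if the child exited normally with status 0.
    Anything the check leaks stays in the child and is freed when it exits.
    """
    pid = os.fork()
    if pid == 0:
        # Child: run the check and report the outcome via the exit status.
        status = 1
        try:
            status = 0 if check() else 1
        finally:
            # os._exit skips cleanup handlers inherited from the parent,
            # and also turns an exception in check() into a failure.
            os._exit(status)
    # Parent: wait for the child so that no zombie process is left behind.
    _, wait_status = os.waitpid(pid, 0)
    return os.WIFEXITED(wait_status) and os.WEXITSTATUS(wait_status) == 0
```

A monitoring loop would call check_in_child once per real server and per check interval. The cost is one fork per check, which is the overhead questioned above.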
--
Horms
H: http://www.vergenet.net/~horms/
W: http://www.valinux.co.jp/en/