
Re: Ldirectord - unexpected crashes

To: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: Ldirectord - unexpected crashes
From: Horms <horms@xxxxxxxxxxxx>
Date: Mon, 13 Nov 2006 10:41:29 +0900
On Fri, Nov 10, 2006 at 07:30:20PM +0100, Roberto Nibali wrote:
> >Just migrated another customer environment from some older
> >Ldirectord 1.77.2.6 to a more recent version 1.77.2.45.

There have been a number of updates since 1.77.2.45. 

Could you please try 1.186, which is available at
http://www.vergenet.net/linux/ldirectord/download/ldirectord.1.186

I have also put an updated explanation
of how to get the latest ldirectord (which is 1.186 today),
at http://www.vergenet.net/linux/ldirectord/download.shtml

> 
> I've looked at the code for 5 minutes, simply to see what we're talking about 
> ... crazy coding :).
> 
> >I have a problem with the more recent Ldirectord 1.77.2.45. It crashes
> >unexpectedly without any message in the log. Sometimes this happens after a
> >few hours and sometimes a few weeks (2-3) after it has been started. Up to
> >now, I could not figure out the problem.
> 
> What kind of crash is it?
> 
> >But I assume it must have something to do with the high real server
> >fluctuation within the environment. That means the real servers are taken
> >out of the load balancing because of hanging processes on the real servers
> >very often. After restart of these processes on the real servers,
> >ldirectord spotlessly put them back into the virtual server. What I can
> >see is, that ldirectord seems to crash, when there is some kind of a peak
> >in the fluctuation.
> >History:
> >  To check/monitor HTTPS connections, version 1.77.2.6 works by forking a
> >  child before checking the real servers via HTTPS. If I remember correctly,
> >  this was implemented to prevent a memory leak in the SSL library.
> >  Some time ago I proposed a patch [1] to replace this mechanism by plain
> >  LWP for HTTP and HTTPS connections. The patch was implemented, as far as
> >  I know.
> >    [1] http://lists.linux-ha.org/pipermail/linux-ha/2005-July/015176.html
> >The problem might have something to do with the use of LWP for HTTPS
> >instead of the old behaviour by forking a child, as well.
> 
> I cannot provide you with an answer because I don't know enough about
> ldirectord. What I'm wondering is why it was changed when the old code
> worked (don't tell me because forking was too heavy)? Also, why wasn't
> something like libcurl used for this? I'm not a seasoned perl-monger, so
> this could be an extremely stupid question.

The code was changed because the SSL library that is used seems to have
a memory leak in it somewhere. I have never been able to find it. But by
having a short-lived child process, the effects of the leak are negated.
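
For illustration only, here is a minimal sketch of the short-lived child
idea. It is written in Python rather than ldirectord's Perl and is not
taken from the ldirectord source; check_in_child and leaky_check are
hypothetical names. The child runs the check and exits, so whatever memory
the check leaks is returned to the system and the long-running parent never
grows. It assumes a Unix-like system where fork is available.

```python
import os

# Stand-in for memory that a leaky library never frees (hypothetical).
leaked = []


def leaky_check(state):
    """Hypothetical health check that leaks about 1 MB on every call."""
    leaked.append(bytearray(1024 * 1024))
    return state == "up"


def check_in_child(check, *args):
    """Run check(*args) in a forked child and report whether it succeeded.

    The result travels back through the child's exit status:
    0 means the check passed, anything else means it failed.
    """
    pid = os.fork()
    if pid == 0:
        # Child: run the check, then exit at once without running any
        # parent cleanup handlers. All memory is released on exit.
        try:
            ok = bool(check(*args))
        except Exception:
            ok = False
        os._exit(0 if ok else 1)
    # Parent: wait for the child and decode its exit status.
    _, status = os.waitpid(pid, 0)
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
```

Calling check_in_child(leaky_check, "up") in a loop leaves the parent's
copy of leaked empty, whereas calling leaky_check directly would grow it on
every check. The cost is one fork per check.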

-- 
Horms
  H: http://www.vergenet.net/~horms/
  W: http://www.valinux.co.jp/en/

