Re: Ldirectord does not load all real servers

To: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: Ldirectord does not load all real servers
Cc: horms@xxxxxxxxxxxx
From: Jonathan Trott <jtrott@xxxxxxxxxxx>
Date: Fri, 28 May 2004 12:11:13 +1000
On 27 May 2004, at 15:35, lvs-users-request@xxxxxxxxxxxxxxxxxxxxxx wrote:
On Thu, May 27, 2004 at 01:30:24PM +0900, Horms wrote:
On Thu, May 27, 2004 at 12:56:52PM +0900, Horms wrote:
On Thu, May 27, 2004 at 11:47:03AM +1000, Jonathan Trott wrote:

[snip]

Are there any known problems with 1.87 I should be wary of? Is there
another version with the fallback fix but without this new bug? Can
anyone supply a patch so that I can fix this problem in 1.87?
Unfortunately I don't have the time to come to grips with the
ldirectord perl code :(
Thanks,

I have tried other versions of the code in CVS and the last one that
worked for me was 1.86. After that I don't get all my pools containing
real servers. I will test 1.86 and see if the fallback problem has been
solved.
Who do I report bugs to regarding the 1.87 problem? BTW, the latest
stable, 1.77.2.2 has the same problem as 1.87.

[snip]

Hi,

I think that I have a fix for this. Basically the internal state
kept for a real server, to determine if it is up or down, was a bit
too simplistic. This has been the root of a number of problems over
the past while. The patch below introduces more sophisticated tracking
of the internal state of real servers. This should resolve your problem,
and hopefully not reintroduce other related problems that were resolved
in the past.

Feedback is more than welcome.

I tested this morning with 1.88 and it now loads all the real servers fine, thank you. We also tested the fallback operation and that is working fine too.

I did come across another problem when migrating the configuration from virtual IP based to fwmark based. I removed the two virtual IP based LVS services from my configuration file and replaced them with a single fwmark service for ports 80 and 443. When ldirectord reloaded the new configuration it added the fwmark service but didn't successfully remove the old virtual IP services. Here is the relevant bit from /var/log/messages:

May 28 11:49:46 osacon2 ldirectord[24451]: Configuration file '/etc/ha.d/conf/ldirectord.cf' has changed on disk
May 28 11:49:46 osacon2 ldirectord[24451]:  - reread new configuration
May 28 11:49:46 osacon2 ldirectord[24451]: Error [] reading file /etc/ha.d/conf/ldirectord.cf at line 46: protocol must be fwm if the virtual service is a fwmark (a number)
May 28 11:49:46 osacon2 ldirectord[24451]: system(/sbin/ipvsadm -D 192.168.100.1:80) failed
May 28 11:49:46 osacon2 ldirectord[24451]: Removed virtual server: 192.168.100.1:80
May 28 11:49:46 osacon2 ldirectord[24451]: system(/sbin/ipvsadm -D 1) failed
May 28 11:49:46 osacon2 ldirectord[24451]: Removed virtual server: 1
May 28 11:49:46 osacon2 ldirectord[24451]: Linux Director Daemon terminated on signal: Died at /etc/ha.d/resource.d/ldirectord line 1239, <CFGFILE> line 46.

After I fixed the config file and manually removed the old virtual IP service everything was fine. I can see why it failed: the syntax of the ipvsadm command is incorrect. According to the help on my version of ipvsadm (ipvsadm v1.21 2002/11/12 (compiled with popt and IPVS v1.0.10)) the syntax should be /sbin/ipvsadm -D -t 192.168.100.1:80, which did work correctly for me. I'm assuming that the delete command is built from syntax in the configuration file, which was no longer there, hence the extra space in the command and the lack of -t. Mind you, the command to remove the fwmark virtual service also failed, and that service was in the file. Normal add and delete of real servers has been working fine.
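
To make the point about the missing service-type flag concrete, here is a rough Python sketch of how a delete command could be assembled so that the flag is always present. This is not the ldirectord Perl code; the function names are made up for illustration, and falling back to TCP when no protocol is known is purely my assumption. The only things taken from ipvsadm itself are -D (delete service) and the -t / -u / -f service-type flags.

```python
# Hypothetical helper names; not part of ldirectord or ipvsadm.
IPVSADM = "/sbin/ipvsadm"

# ipvsadm service-type flags: -t (TCP), -u (UDP), -f (firewall mark).
SERVICE_FLAGS = {"tcp": "-t", "udp": "-u", "fwm": "-f"}


def service_flag(service, protocol=None):
    """Return the ipvsadm service-type flag for a virtual service.

    If the protocol is unknown (for example because the service has been
    removed from the config file), guess: a bare number is treated as a
    fwmark, anything else as TCP. That fallback is an assumption.
    """
    if protocol is None:
        protocol = "fwm" if service.isdigit() else "tcp"
    if protocol not in SERVICE_FLAGS:
        raise ValueError("unknown protocol: %r" % protocol)
    return SERVICE_FLAGS[protocol]


def delete_command(service, protocol=None):
    """Build the argument list for deleting a virtual service."""
    return [IPVSADM, "-D", service_flag(service, protocol), service]


if __name__ == "__main__":
    # The two services from the log excerpt above.
    print(" ".join(delete_command("192.168.100.1:80")))
    print(" ".join(delete_command("1")))
```

For the two services in my log this produces "/sbin/ipvsadm -D -t 192.168.100.1:80" and "/sbin/ipvsadm -D -f 1", which is the form that worked when I ran the first one by hand.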

[snip]


--
Horms
<snip>
Thanks,
JT
