On 27 May 2004, at 15:35, lvs-users-request@xxxxxxxxxxxxxxxxxxxxxx wrote:
On Thu, May 27, 2004 at 01:30:24PM +0900, Horms wrote:
On Thu, May 27, 2004 at 12:56:52PM +0900, Horms wrote:
On Thu, May 27, 2004 at 11:47:03AM +1000, Jonathan Trott wrote:
[snip]
Are there any known problems with 1.87 I should be wary of? Is there
another version with the fallback fix but without this new bug? Can
anyone supply a patch so that I can fix this problem in 1.87?
Unfortunately I don't have the time to come to grips with the
ldirectord perl code :(
Thanks,
I have tried other versions of the code in CVS and the last one that
worked for me was 1.86. After that I don't get all my pools containing
real servers. I will test 1.86 and see if the fallback problem has been
solved.
Who do I report bugs to regarding the 1.87 problem? BTW, the latest
stable, 1.77.2.2, has the same problem as 1.87.
[snip]
Hi,
I think that I have a fix for this. Basically the internal state
kept for a real server, to determine if it is up or down, was a bit
too simplistic. This has been the root of a number of problems over
the past while. The patch below introduces more sophisticated tracking
of the internal state of real servers. This should resolve your
problem, and hopefully not reintroduce other related problems that
were resolved in the past.
Feedback is more than welcome.
I tested this morning with 1.88 and it loads all the real servers fine
now, thank you. Additionally we tested the fallback operation and that
is also working fine. I did come across another problem when migrating
the configuration from virtual IP based to fwmark based. I changed my
configuration file, removing the two virtual IP based LVS services and
replacing them with a single fwmark service for ports 80 and 443. When
ldirectord reloaded the new configuration it added the fwmark service
but didn't successfully remove the old virtual IP services. Here is the
relevant bit from /var/log/messages:
May 28 11:49:46 osacon2 ldirectord[24451]: Configuration file '/etc/ha.d/conf/ldirectord.cf' has changed on disk
May 28 11:49:46 osacon2 ldirectord[24451]: - reread new configuration
May 28 11:49:46 osacon2 ldirectord[24451]: Error [] reading file /etc/ha.d/conf/ldirectord.cf at line 46: protocol must be fwm if the virtual service is a fwmark (a number)
May 28 11:49:46 osacon2 ldirectord[24451]: system(/sbin/ipvsadm -D 192.168.100.1:80) failed
May 28 11:49:46 osacon2 ldirectord[24451]: Removed virtual server: 192.168.100.1:80
May 28 11:49:46 osacon2 ldirectord[24451]: system(/sbin/ipvsadm -D 1) failed
May 28 11:49:46 osacon2 ldirectord[24451]: Removed virtual server: 1
May 28 11:49:46 osacon2 ldirectord[24451]: Linux Director Daemon terminated on signal: Died at /etc/ha.d/resource.d/ldirectord line 1239, <CFGFILE> line 46.
After I fixed the config file and manually removed the old virtual ip
service everything was fine.
I can see why it failed: the syntax for the ipvsadm command is
incorrect. According to the help on my version of ipvsadm (ipvsadm
v1.21 2002/11/12 (compiled with popt and IPVS v1.0.10)) the syntax
should be /sbin/ipvsadm -D -t 192.168.100.1:80, which did work
correctly for me. I'm assuming that the delete command is built from
syntax in the configuration file, which wasn't there, hence the extra
space in the command and the lack of -t. Mind you, the command to remove
the fwmark virtual service also failed, and that was in the file.
Normal add and delete of real servers has been working fine.
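To illustrate what I would expect the delete commands to look like, here
is a minimal sketch in shell. It only prints the command, it does not run
it. build_delete_cmd is a hypothetical helper of my own and not anything
in ldirectord; the -t, -u and -f flags are the TCP, UDP and fwmark service
options from the ipvsadm help, and I am assuming the protocol names tcp,
udp and fwm as used in ldirectord.cf.

```shell
#!/bin/sh
# Sketch only: print (do not execute) the ipvsadm command that would
# delete a virtual service. build_delete_cmd is a hypothetical helper,
# not part of ldirectord.
build_delete_cmd() {
    service=$1    # e.g. 192.168.100.1:80, or a firewall mark such as 1
    protocol=$2   # assumed to be one of: tcp, udp, fwm

    # Pick the ipvsadm service-type flag that matches the protocol.
    case "$protocol" in
        tcp) flag=-t ;;
        udp) flag=-u ;;
        fwm) flag=-f ;;
        *)
            # Without a known protocol the command would lack its flag,
            # which is the failure seen in the log above.
            echo "unknown protocol: '$protocol'" >&2
            return 1
            ;;
    esac

    echo "/sbin/ipvsadm -D $flag $service"
}

build_delete_cmd 192.168.100.1:80 tcp
build_delete_cmd 1 fwm
```

If the protocol is empty or unknown the helper refuses to build a command
rather than emitting one without a service-type flag.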
[snip]
--
Horms
<snip>
Thanks,
JT