On Tue, Aug 19, 2008 at 1:56 PM, Joseph Mack NA3T <jmack@xxxxxxxx> wrote:
> On Tue, 19 Aug 2008, Nicholas Guarracino wrote:
>
>> I have a cluster set up for load balancing a web based application
>> that requires persistent connections. I'm using the ipvs sync daemon
>> to keep the connection state information consistent between director
>> and backup director. However, many times after a failover, the
>> persistence does not work and clients end up connected to a different
>> realserver. I'm running this on kernel 2.6.18 (RHEL 5, so not exactly
>> bleeding edge.)
>
> thanks for the nice description
Sure, thanks for the reply!
> .
> .
>
>> So after the failover, the clients' connections have been reversed:
>> 159.63.77.30 is now connected to 10.204.54.166
>> 159.63.77.44 is now connected to 10.204.54.167
>>
>> If I run ipvsadm -L on the backup director before the failover, I do
>> see the proper connections so I know the multicast is getting through.
>
> is there persistence on the backup director?
Yes, ipvs is using persistence on both servers. Or do you mean the
sync daemon? The daemon is running on both the director and the backup
director. I tried two setups: using heartbeat's LVSSyncDaemonSwap so
that only the master runs on the director and only the slave runs on
the backup director, and running both master and slave on both
machines. The result was the same with both setups.
>> Any thoughts on what I might be doing wrong?
>
> Possibly nothing. It may be a bug. persistence has a lot of
> problems. I personally suggest using the -SH scheduler for
> your type of setup, but it hasn't been extensively tested
> under the synch demon, so I don't know what will happen
Initially I thought the SH scheduler would need no persistence at all,
but that may not be the case: I'm assuming the list of available
realservers could be filled in any order, so the realserver at index 1
on the director may not be the same as the realserver at index 1 on
the backup director. I could be wrong about that, though.
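
To make that concern concrete, here is a rough sketch in Python. It is a
simplified model of a source-hash lookup, not the kernel's ip_vs_sh code:
the table size, the hash function, and the names build_table and pick are
all assumptions for illustration. It only shows that if two directors fill
their tables from the same realservers in a different order, the same
client address can land on a different realserver.

```python
import ipaddress

TABLE_SIZE = 256  # assumed bucket count for this sketch


def build_table(realservers):
    """Fill the buckets by cycling through the realserver list in order."""
    return [realservers[i % len(realservers)] for i in range(TABLE_SIZE)]


def pick(table, client_ip):
    """Map a client address to a bucket with an illustrative multiplicative hash."""
    key = int(ipaddress.IPv4Address(client_ip))
    return table[(key * 2654435761) % TABLE_SIZE]


# Same two realservers, added in opposite order on each machine (hypothetical).
director = build_table(["10.204.54.166", "10.204.54.167"])
backup = build_table(["10.204.54.167", "10.204.54.166"])

for client in ("159.63.77.30", "159.63.77.44"):
    print(client, "director:", pick(director, client), "backup:", pick(backup, client))
```

If both machines add the realservers in the same order, the two tables in
this model are identical and the mapping survives a failover without any
synced state.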
> there. There have been updates (not many) to the synch demon
> since your kernel (and I don't know if they address your
> problem). I can put the ball back in your court by asking
> you to try a recent kernel. I remember something about
> persistence and the synchd recently, but I looked back in my
> list of open problems and didn't see it, so I don't know
> what it might have been.
I did see some of the updates you mentioned. They seemed mostly aimed
at highly loaded systems: instead of sleeping for a fixed amount of
time, the sync daemon wakes up whenever there is data to process. I
doubt those changes would help here since I only have two clients
connected, but a later kernel is definitely worth trying.
Thanks again,
--Nick