Re: [lvs-users] Persistent connections not persisting after failover

To: Joseph Mack NA3T <jmack@xxxxxxxx>
Subject: Re: [lvs-users] Persistent connections not persisting after failover
Cc: " users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
From: Simon Horman <horms@xxxxxxxxxxxx>
Date: Wed, 20 Aug 2008 11:28:52 +1000

On Tue, Aug 19, 2008 at 03:19:12PM -0700, Joseph Mack NA3T wrote:
> On Tue, 19 Aug 2008, Nicholas Guarracino wrote:
>>> is there persistence on the backup director?
>> Yes, ipvs is using persistence on both servers. Or do you mean the sync 
>> daemon?
> sorry, I was ambiguous. On the backup director, when it's receiving 
> updates from the syncd on the master director, do you see the persistence 
> flag? Come to think of it, I don't know if the persistence on the backup 
> director comes from the synchd updates or from running ipvsadm when the  
> backup becomes the master :-(
>> Initially I was thinking that no persistence would be needed at all for 
>> the SH scheduler, but it looks like that's not the case since I'm 
>> assuming the list of available realservers could be filled in any 
>> order?
> This was my reservation when I mentioned the -SH scheduler. I don't know 
> what would happen there and I expect noone's tried it.

As long as all of the additions and deletions of real-servers to a
given service occur in the same order, then the hash table for the
-SH scheduler should be consistent.

More precisely, the destinations are stored in a linked list.
Insertions are always made at the end of the linked list.
Removals are done in place. The list is never reordered,
other than through the effects of insertions and removals.
It is the order of entries in the linked list that determines
what the hash table used by schedulers like -SH looks like.

So if the list is the same, the hash will be the same.
Otherwise it won't be.
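
As a rough illustration of why list order matters, here is a simplified
model in Python. It is not the kernel code: the table size of 256, the
cyclic fill of the buckets, and the multiplicative hash are assumptions
made for the sketch, and the real-server names and client address are
hypothetical.

```python
import ipaddress

TAB_SIZE = 256  # assumed number of buckets in the source-hash table


def build_sh_table(dests, size=TAB_SIZE):
    """Fill the buckets by walking the destination list cyclically."""
    return [dests[i % len(dests)] for i in range(size)]


def pick(table, client_ip):
    """Map a client address to a bucket (toy hash, for illustration only)."""
    key = (int(ipaddress.IPv4Address(client_ip)) * 2654435761) & 0xFFFFFFFF
    return table[key % len(table)]


# Same real-servers, same order on both directors: identical tables.
master = build_sh_table(["rs1", "rs2", "rs3"])
backup_same = build_sh_table(["rs1", "rs2", "rs3"])

# Same real-servers, different order: the tables differ.
backup_diff = build_sh_table(["rs2", "rs1", "rs3"])

client = "192.168.0.10"  # hypothetical client address
print(master == backup_same)                               # True
print(pick(master, client) == pick(backup_same, client))   # True
print(master == backup_diff)                               # False
```

With differing tables, some clients still happen to land on the same
real-server, but many will not, which is why the order has to match.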

If you are using something like ldirectord to monitor real-servers
and add and remove them (not quiesce them), then it's entirely
likely that the order of the list will become inconsistent between
two linux directors over time.

If on the other hand you just run ipvsadm once on boot to set
up the real servers, or your ldirectord-tool quiesces dead real servers,
then the order of the list shouldn't change and should be consistent
between two linux directors.
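
The difference between removing and quiescing can be sketched with a
plain Python list standing in for the destination list. This is only a
model of the behaviour described above (append at the end, remove in
place); the helper names and real-server names are made up for the
example, and quiescing is assumed to mean setting the weight to 0.

```python
def add(dests, name, weight=1):
    """Insertions always go at the end of the list."""
    dests.append([name, weight])


def remove(dests, name):
    """Removal is done in place; the remaining order is untouched."""
    dests[:] = [d for d in dests if d[0] != name]


def quiesce(dests, name):
    """Assumed quiesce: set weight to 0, entry keeps its position."""
    for d in dests:
        if d[0] == name:
            d[1] = 0


def order(dests):
    return [d[0] for d in dests]


# Director A removes rs2 on a failed check and re-adds it on recovery.
a = []
for n in ("rs1", "rs2", "rs3"):
    add(a, n)
remove(a, "rs2")
add(a, "rs2")

# Director B quiesces rs2 instead of removing it.
b = []
for n in ("rs1", "rs2", "rs3"):
    add(b, n)
quiesce(b, "rs2")

print(order(a))  # ['rs1', 'rs3', 'rs2']
print(order(b))  # ['rs1', 'rs2', 'rs3']
```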

>> I did see some of the updates you mentioned. They seemed mostly related 
>> to highly loaded systems, and instead of sleeping for a fixed amount of 
>> time, waking up whenever there is data to process. I doubt those 
>> changes would help here since I only have two clients connected, but a 
>> later kernel is definitely worth trying.
> I wasn't real hopeful of anything in there being the cure, but if you 
> could reproduce the problem with a recent kernel, it would save us 
> tracking down a problem that was solved long ago, even if then it wasn't 
> showing the symptoms you've got now.

Just to clarify, you are using persistence as in the -p option to
ipvsadm -A?

If so, the key to persistence working is the persistence template.
It is an entry that goes into the connection table with 0 as the from-port
and acts as a parent entry for other connections from the same
host/netmask - I won't attempt to fail to explain that relationship
yet another time.

You should see these templates being synchronised to the backup
director like any other connection. And you should be able to verify
this using either ipvsadm -Lcn or cat /proc/net/ip_vs_conn
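
If it helps, a small Python sketch like the following can pick the
templates out of saved ipvsadm -Lcn output. The sample text and its
addresses are made up for illustration, and the sketch assumes the
usual six-column layout with the expiry shown as minutes:seconds;
adjust it if your output looks different.

```python
# Hypothetical sample of `ipvsadm -Lcn` output saved to a string.
SAMPLE = """\
IPVS connection entries
pro expire state       source             virtual            destination
TCP 14:52  ESTABLISHED 10.0.0.5:41234     192.168.1.100:80   192.168.2.11:80
TCP 04:47  NONE        10.0.0.5:0         192.168.1.100:80   192.168.2.11:80
"""


def find_templates(text):
    """Return entries whose source port is 0, i.e. persistence templates."""
    templates = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the banner and the column header.
        if len(fields) != 6 or fields[0] == "pro":
            continue
        proto, expire, state, source, virtual, dest = fields
        client, _, port = source.rpartition(":")
        if port != "0":
            continue
        mins, secs = expire.split(":")
        templates.append({
            "client": client,
            "virtual": virtual,
            "dest": dest,
            "expire_s": int(mins) * 60 + int(secs),
        })
    return templates


for t in find_templates(SAMPLE):
    print(t["client"], t["dest"], t["expire_s"])
```

Running this against output captured on both directors makes it easy
to compare which templates exist and how long each has left.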

Can you verify that this is the case?
Could you look at the timeouts for the templates?

I suspect that the problem is that when templates are synchronised
their timeout is not synchronised. Instead, at the other end it is
set to 3 minutes. So perhaps they are timing out and disappearing
prematurely?
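
To make the suspected failure concrete, here is a back-of-the-envelope
model in Python. The 3 minute value is the one mentioned above; the
persistence timeout and the failover time are hypothetical numbers, and
the model assumes the template is not synchronised again in between.

```python
SYNCED_TEMPLATE_TIMEOUT = 3 * 60  # seconds, fixed value applied on the backup


def alive_on_master(age_s, persistence_timeout_s):
    """On the master the template lives for the configured -p timeout."""
    return age_s < persistence_timeout_s


def alive_on_backup(age_since_sync_s, timeout_s=SYNCED_TEMPLATE_TIMEOUT):
    """On the backup the synced template is assumed to get the fixed timeout."""
    return age_since_sync_s < timeout_s


persistence_timeout = 30 * 60  # hypothetical: ipvsadm -A ... -p 1800
failover_at = 10 * 60          # hypothetical: failover 10 minutes after the sync

print(alive_on_master(failover_at, persistence_timeout))  # True
print(alive_on_backup(failover_at))                       # False
```

In this model the master would still honour the persistence at the
moment of failover, while the backup has already dropped the template.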

I'm looking at the bottom of ip_vs_process_message() in
net/ipv4/ipvs/ip_vs_sync.c in 2.6.27-rc2

Actually, I think that this works a little differently in 2.6.18,
but the effect seems to be much the same according to a quick
test that I ran.

Clearly this is not ideal, and it would be nice to fix.
But does it explain your problem?
