On Tue, Aug 19, 2008 at 9:28 PM, Simon Horman <horms@xxxxxxxxxxxx> wrote:
> As long as all of the additions and deletions of real-servers to a
> given service occur in the same order, then the hash table for the
> -SH scheduler should be consistent.
>
> More precisely, the destinations are stored in a linked list.
> Insertions are always made at the end of the linked list.
> Removals are done in place. The list is never reordered,
> other than through the effects of insertions and removals.
> It is the order of entries in the linked list that determines
> what the hash table used by -SH looks like.
>
> So if the list is the same, the hash will be the same.
> Otherwise it won't be.
>
> If you are using something like ldirectord to monitor real-servers
> and add and remove them (not quiesce them), then it's entirely
> likely that the order of the list will become inconsistent between
> two linux directors over time.
Yup, that is my exact setup.
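
A rough Python model of the point above, as a sketch only: it is not the
kernel code. The table size, the hash function, and the names build_table
and pick are assumptions made for illustration. It fills the buckets by
walking the destination list in order, so the same set of real servers in
a different order sends the same client to a different real server.

```python
import ipaddress

TABLE_SIZE = 256  # assumed bucket count, for illustration only


def build_table(dests):
    """Fill the buckets by walking the destination list in order, wrapping around."""
    return [dests[i % len(dests)] for i in range(TABLE_SIZE)]


def pick(table, client_ip):
    """Pick a real server for a client using an illustrative hash (not the kernel's)."""
    key = int(ipaddress.IPv4Address(client_ip)) % TABLE_SIZE
    return table[key]


# Director A added the real servers once, at boot.
director_a = ["10.204.54.165", "10.204.54.166"]
# Director B removed and re-added .165, so it moved to the end of the list.
director_b = ["10.204.54.166", "10.204.54.165"]

client = "159.63.77.32"
print("director A picks", pick(build_table(director_a), client))
print("director B picks", pick(build_table(director_b), client))
```

In this model the two directors disagree about where the client goes, even
though both know about exactly the same real servers.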
> If on the other hand you just run ipvsadm once on boot to set
> up the real servers, or your ldirectord-tool quiesces dead real servers,
> then the order of the list shouldn't change and should be consistent
> between two linux directors.
>
> Just to clarify, you are using persistence as in the -p option to
> ipvsadm -A?
Yes (I am using ldirectord)
> If so, the key to persistence working is the persistence template.
> It is an entry that goes into the connection table with 0 as the from-port
> and acts as a parent entry for other connections from the same
> host/netmask - I won't attempt to fail to explain that relationship
> yet another time.
>
> You should see these templates being synchronised to the backup
> director like any other connection. And you should be able to verify
> this using either ipvsadm -Lcn or cat /proc/net/ip_vs_conn
>
> Can you verify that this is the case?
> Could you look at the timeouts for the templates?
Yes, the templates are being synchronized.
On the director:
TCP 00:51 NONE 159.63.77.32:0 10.204.54.170:443 10.204.54.165:443
TCP 00:54 NONE 159.63.77.16:0 10.204.54.170:443 10.204.54.166:443
On the backup director:
TCP 02:51 NONE 159.63.77.32:0 10.204.54.170:443 10.204.54.165:443
TCP 02:54 NONE 159.63.77.16:0 10.204.54.170:443 10.204.54.166:443
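
To put a number on the difference, here is a small Python sketch that
compares the expiry column of the lines above. It assumes the second field
is the remaining lifetime in mm:ss and the fourth field is the client
address; parse_expire is a hypothetical helper, not part of any tool.

```python
def parse_expire(line):
    """Return (client, remaining_seconds) for one connection-table line."""
    fields = line.split()
    minutes, seconds = fields[1].split(":")  # assumed mm:ss format
    return fields[3], int(minutes) * 60 + int(seconds)


master = [
    "TCP 00:51 NONE 159.63.77.32:0 10.204.54.170:443 10.204.54.165:443",
    "TCP 00:54 NONE 159.63.77.16:0 10.204.54.170:443 10.204.54.166:443",
]
backup = [
    "TCP 02:51 NONE 159.63.77.32:0 10.204.54.170:443 10.204.54.165:443",
    "TCP 02:54 NONE 159.63.77.16:0 10.204.54.170:443 10.204.54.166:443",
]

master_left = dict(parse_expire(line) for line in master)
backup_left = dict(parse_expire(line) for line in backup)

# Difference in remaining lifetime per template, backup minus master.
diffs = {client: backup_left[client] - left for client, left in master_left.items()}
for client, delta in diffs.items():
    print(client, "backup is ahead by", delta, "seconds")
```

For both templates the backup director shows 120 seconds more than the
active director.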
> I suspect that the problem is that when templates are
> synchronised their timeout is not synchronised.
> Instead, at the other end it is set to 3 minutes. So perhaps they
> are timing out and disappearing prematurely?
>
> I'm looking at the bottom of ip_vs_process_message() in
> net/ipv4/ipvs/ip_vs_sync.c in 2.6.27-rc2
>
> Actually, I think that this works a little differently in 2.6.18,
> but the effect seems to be much the same according to a quick
> test that I ran.
>
> Clearly this is not ideal, and it would be nice to fix.
> But does it explain your problem?
I'm not sure. My timeout is only 1 minute but it does look like when
the template gets synchronized to the backup director the timeout
becomes 3 minutes. That would be the opposite situation from what you
described though, wouldn't it?
--Nick