OK, I have a tcpdump of some LVS packets. Immediately after receiving
these, the backup goes to 100% SI.
A few notes that may help:
172.26.64.76 is the external machine I am testing from
192.168.148.2 is the VIP
10.17.192.19 is the private-side address of master
10.17.192.20 is the private-side address of backup
I notice that the packet dump contains multiple references to the
*same* connection. Is that normal?
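
For anyone who wants to check a dump for repeated connections, here is a
rough Python sketch that decodes the UDP payload of one sync packet and
counts duplicate entries. It assumes the version 0 sync message layout as
I understand it from ip_vs_sync.c in 2.6 kernels: a 4-byte header
(nr_conns, syncid, size) followed by 24-byte connection entries, with 24
extra bytes of sequence data when either sequence flag is set. The field
layout, SEQ_MASK and OPTIONS_LEN are assumptions to verify against your
own kernel source, and the function names are made up for illustration.

```python
import socket
import struct
from collections import Counter

# Assumed v0 layout; check against ip_vs_sync.c for the kernel in use.
HEADER = struct.Struct("!BBH")          # nr_conns, syncid, size
CONN = struct.Struct("!BBHHH4s4s4sHH")  # reserved, protocol, cport, vport,
                                        # dport, caddr, vaddr, daddr, flags, state
SEQ_MASK = 0x0600     # assumed value of IP_VS_CONN_F_SEQ_MASK
OPTIONS_LEN = 24      # assumed size of the optional sequence data


def parse_sync_payload(payload):
    """Return (syncid, size, connections) for one sync message payload."""
    nr_conns, syncid, size = HEADER.unpack_from(payload, 0)
    offset = HEADER.size
    conns = []
    for _ in range(nr_conns):
        (_reserved, proto, cport, vport, dport,
         caddr, vaddr, daddr, flags, _state) = CONN.unpack_from(payload, offset)
        offset += CONN.size
        if flags & SEQ_MASK:
            # Skip the optional sequence data that follows this entry.
            offset += OPTIONS_LEN
        conns.append((proto,
                      socket.inet_ntoa(caddr), cport,
                      socket.inet_ntoa(vaddr), vport,
                      socket.inet_ntoa(daddr), dport))
    return syncid, size, conns


def duplicates(conns):
    """Map each connection tuple seen more than once to its count."""
    counts = Counter(conns)
    return {conn: n for conn, n in counts.items() if n > 1}
```

Feed it the raw UDP payload bytes of a single sync packet extracted from
the capture; an empty result from duplicates() means no connection was
repeated within that packet.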
This problem doesn't happen with HTTP with this small number of
connections. I suspect that may be because my HTTP tests have far
fewer packets per connection.
I triggered this with four simultaneous connections - but only on my
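
A minimal sketch of how several simultaneous HTTPS connections to the VIP
could be started from a script, in case it helps others try to reproduce
this. It is not the exact test I ran: the port, the request path, the
disabled certificate checks and the names https_get and run_parallel are
all assumptions made for illustration.

```python
import socket
import ssl
import threading

VIP = "192.168.148.2"   # the VIP from the notes above
PORT = 443              # assumption: HTTPS on the default port


def https_get(host, port, path="/", timeout=10.0):
    """Open one TLS connection, send a GET, return the bytes received."""
    ctx = ssl.create_default_context()
    # Assumption: the test VIP has no matching certificate, so skip checks.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")
    total = 0
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(request)
            while True:
                chunk = tls.recv(65536)
                if not chunk:
                    break
                total += len(chunk)
    return total


def run_parallel(count, fetch=https_get, host=VIP, port=PORT):
    """Release `count` fetches at the same moment; return one result each."""
    barrier = threading.Barrier(count)
    results = [None] * count

    def worker(index):
        barrier.wait()  # hold every thread until all are ready
        try:
            results[index] = fetch(host, port)
        except OSError as exc:  # includes ssl.SSLError and timeouts
            results[index] = exc

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Calling run_parallel(4) from the external test machine starts four
connections at once. The barrier is there so the connections begin
together, which should make it more likely that they are reported in the
same sync packet, though that is not guaranteed.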
On 13 September 2010 16:47, JL <lvs@xxxxxxxx> wrote:
> On 13 September 2010 16:25, Simon Horman <horms@xxxxxxxxxxxx> wrote:
>> On Mon, Sep 13, 2010 at 03:52:29PM +0100, JL wrote:
>>> OK, More information:
>>> I hope someone who is up on ipvs kernel side is listening!
>> I am listening, sorry for not responding earlier.
> Thanks. I was getting nervous :)
>>> If a backup machine receives an IPVS state update packet (the ones
>>> sent to 18.104.22.168) with a certain number of connections in it
>>> (somewhere between two and eight, inclusive, will trigger it) then SI
>>> goes to 100% on the backup immediately.
>>> Firewalling 22.214.171.124 insulates you from the problem (although, of
>>> course, is unsuitable for a live deployment).
>> Presumably turning off connection synchronisation
>> has the same effect.
>>> Feeding in only one connection at a time (slowly enough that each
>>> has its own IPVS packet) doesn't trigger the problem.
>> So it occurs if the number of synchronised connections in
>> a single packet is between 2 and 8. So 1 is ok, and so is 9?
> No, one is ok, but somewhere between 2 and 8 this problem begins, and
> anything higher has the problem. I just haven't been able to narrow
> down the number any tighter than that.
> However, some more testing indicates that it is not that straightforward.
> If I trigger it by pressing reload in the browser (which kicks off
> about 9 HTTPS connections) I get the problem. If I put those same
> GETs into a bash script and issue them all at once, then I don't.
> I'm still trying to simplify the problem down to a simple script I can run.
> This is a two-node LVS/RS system. I have found that if none of the
> connections in the state packet are to the backup machine, then it
> doesn't trigger this problem.
>>> This happens with linux 126.96.36.199, but not 188.8.131.52.
>> That is a fairly wide number of kernel versions.
>> But if it is easy to reproduce then it should be fairly easy to track down.
> That was the intent of coming up with a simple test - that I could try
> a number of different kernel versions, and see where the problem
> Investigation is ongoing...
> Jarrod Lowe
Please read the documentation before posting - it's available at:
LinuxVirtualServer.org mailing list - lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Send requests to lvs-users-request@xxxxxxxxxxxxxxxxxxxxxx
or go to http://lists.graemef.net/mailman/listinfo/lvs-users