Re: [lvs-users] Kernel 2.6.35 and 100% S.I. CPU Time

To: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: [lvs-users] Kernel 2.6.35 and 100% S.I. CPU Time
From: JL <lvs@xxxxxxxx>
Date: Mon, 13 Sep 2010 22:13:07 +0100
On 13 September 2010 21:44, Graeme Fowler <graeme@xxxxxxxxxxx> wrote:
> On Mon, 2010-09-13 at 20:45 +0100, JL wrote:
>> > Is this a two-node setup where the directors are also realservers? Just
>> > trying to get the architecture clear in my head.
>> Yes, that's right.
>
> OK.
>
> Remove the LVS rules from the backup director altogether, please. Just
> humour me on this one :)
Can't do that right now, but I'll give it a go tomorrow. However...

> Actually, if you don't want to humour me, have a read back in the list
> archives for similar scenarios. What can happen once connection sync is
> in use is that a packet can "ping-pong" between the directors in the
> following way - the timing here is slightly out of order for reasons
> which should be obvious:
>
> 1. SYN for VIP arrives at master
> 2. Master looks up SYN in connection table; no match
> 3. Master sends SYN on to realserver 2 (which is the backup director)
> 4. Backup director receives SYN, looks up connection table; no match
> 5. Backup director sends SYN on to realserver 1 (master director)
> 6. Master director receives SYN, looks up in table, matches realserver 2
> 7. Master director sends packet to realserver 2
> 8. Backup director receives SYN, looks up connection table; matches
> realserver 1
> 9. Backup sends packet to realserver 1
> 10. Goto 6.
>
> With ever-faster networking this is likely to kill interrupt processing
> before it saturates the network.
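
For anyone following the thread, the numbered steps above can be
sketched as a toy model. This is only an illustration, assuming the worst
case where each director always schedules a new connection to the other
node; the client address and the names simulate, lvs_active and peer are
made up for the example and are not part of IPVS.

```python
def simulate(backup_has_lvs_rules, max_hops=8):
    """Follow one SYN between two directors that are also realservers."""
    # Assumption: each director schedules new connections to the *other* node.
    peer = {"master": "backup", "backup": "master"}
    lvs_active = {"master": True, "backup": backup_has_lvs_rules}
    tables = {"master": {}, "backup": {}}       # per-director connection tables
    conn = ("198.51.100.7", 40000, "VIP", 80)   # hypothetical client 4-tuple
    node, trace = "master", []
    for _ in range(max_hops):
        if not lvs_active[node]:
            # No LVS rules here: the node accepts the packet as a realserver.
            trace.append((node, "deliver locally", node))
            return trace, True
        table = tables[node]
        hit = conn in table
        target = table.setdefault(conn, peer[node])
        trace.append((node, "match" if hit else "no match", target))
        node = target
    return trace, False                         # still bouncing after max_hops


if __name__ == "__main__":
    for has_rules in (True, False):
        trace, delivered = simulate(has_rules)
        print("backup has LVS rules:", has_rules, "delivered:", delivered)
        for step in trace:
            print("  ", step)
```

With LVS rules on both nodes the packet never leaves the loop; with the
rules removed from the backup it is delivered on the second hop.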
...I am familiar with the scenario you are describing - where each
machine decides the other should handle the connection - but it is not
the situation that is occurring at the moment. This problem didn't
happen with the exact same setup under 2.6.27.45, and...

> I've used fwmark handling to ensure that packets coming from the "other"
> machine do not get handled by LVS at all, which resolves the problem.
... I am using fwmark, for exactly the reason you describe. The fwmark
rules capture traffic on one interface; the handoff rule from master
to backup sends the packet to the other interface.
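
A rough sketch of why marking by ingress interface breaks the loop. It is
a model of the idea only, not the actual rule set: the interface names
eth0 and eth1, the mark value 1, and the functions fwmark_for and route
are assumptions made for illustration.

```python
FRONT_IFACE = "eth0"   # hypothetical: interface facing the clients
PEER_IFACE = "eth1"    # hypothetical: interface used for director handoff
LVS_MARK = 1           # hypothetical fwmark matched by the virtual service


def fwmark_for(ingress_iface):
    """Mimic a rule that marks VIP traffic only on the client-facing side."""
    return LVS_MARK if ingress_iface == FRONT_IFACE else 0


def route(first_node, max_hops=8):
    """Trace a packet; only marked packets are handled by LVS."""
    peer = {"master": "backup", "backup": "master"}
    node, iface, path = first_node, FRONT_IFACE, []
    for _ in range(max_hops):
        if fwmark_for(iface) != LVS_MARK:
            # Unmarked: LVS ignores it and the node serves it itself.
            path.append((node, iface, "local delivery"))
            return path
        # Marked: balanced to the other node (worst case), sent via the peer link.
        path.append((node, iface, "balanced to " + peer[node]))
        node, iface = peer[node], PEER_IFACE
    return path


if __name__ == "__main__":
    for hop in route("master"):
        print(hop)
```

Because the handed-off packet arrives on the other interface, it never
gets the mark and is not balanced a second time.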

> This might not be what you're seeing, but it could be. Better to check.
The other thing I did, to make testing simpler, was to remove the
master from the list of realservers on both machines. In this way, I
didn't have to worry about my test systems being balanced to the
master, which doesn't show the issue.

The 100% SI (soft-interrupt) CPU time remains, even when there are no
balance-able packets coming into the system.
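
For reference, the "si" figure can be cross-checked without top by
sampling the "cpu" line of /proc/stat twice; softirq time is the seventh
counter after the label. The sketch below assumes that standard layout;
the function names and the one-second interval are arbitrary choices for
the example.

```python
import time


def softirq_percent(before, after):
    """Softirq share (%) of CPU time between two /proc/stat 'cpu' lines."""
    a = [int(x) for x in before.split()[1:]]
    b = [int(x) for x in after.split()[1:]]
    deltas = [y - x for x, y in zip(a, b)]
    # Sum user..steal only; guest time is already counted inside user.
    total = sum(deltas[:8])
    return 100.0 * deltas[6] / total if total else 0.0


def read_cpu_line(path="/proc/stat"):
    """Return the aggregate 'cpu' line (the first line of the file)."""
    with open(path) as f:
        return f.readline()


if __name__ == "__main__":
    first = read_cpu_line()
    time.sleep(1)                      # arbitrary sampling interval
    second = read_cpu_line()
    print("softirq: %.1f%%" % softirq_percent(first, second))
```

The same function works on the per-CPU lines (cpu0, cpu1, ...), which
helps show whether a single core is pinned in softirq.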

>
> Graeme
Thanks, but I don't think we have cracked it yet. (However, for
thoroughness, I will remove the fwmark rules on the backup, as you
suggested, and report back.)



-- 
Jarrod Lowe
