Re: [lvs-users] Kernel 2.6.35 and 100% S.I. CPU Time

To:	"LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject:	Re: [lvs-users] Kernel 2.6.35 and 100% S.I. CPU Time
From:	JL <lvs@xxxxxxxx>
Date:	Mon, 13 Sep 2010 15:52:29 +0100

OK, more information:

I hope someone who is up on ipvs kernel side is listening!


If a backup machine receives an IPVS state update packet (the ones
sent to 224.0.0.81) containing a certain number of connections
(anywhere between two and eight, inclusive, will trigger it), then SI
goes to 100% on the backup immediately.
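
For anyone who wants to check how many connections each sync packet
carries, here is a minimal Python sketch. It rests on assumptions that
are not confirmed in this thread: that sync messages are UDP datagrams
on port 8848, and that each one starts with a 4-byte header (one byte
connection count, one byte sync ID, a 16-bit size in network byte
order). The function names are made up for illustration.

```python
import socket
import struct

SYNC_GROUP = "224.0.0.81"   # multicast group named in this thread
SYNC_PORT = 8848            # assumption: UDP port used by the sync daemon


def parse_sync_header(datagram: bytes) -> dict:
    """Decode the 4-byte header assumed to lead each sync message."""
    if len(datagram) < 4:
        raise ValueError("datagram shorter than the assumed 4-byte header")
    # Assumed layout: u8 nr_conns, u8 syncid, u16 size (network byte order).
    nr_conns, syncid, size = struct.unpack("!BBH", datagram[:4])
    return {"nr_conns": nr_conns, "syncid": syncid, "size": size}


def watch_sync_traffic(interface_ip: str = "0.0.0.0") -> None:
    """Join the sync group and print the connection count of each message."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", SYNC_PORT))
    # struct ip_mreq: group address followed by local interface address.
    mreq = socket.inet_aton(SYNC_GROUP) + socket.inet_aton(interface_ip)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    while True:
        data, sender = sock.recvfrom(65535)
        hdr = parse_sync_header(data)
        print(f"{sender[0]}: {hdr['nr_conns']} conns, {hdr['size']} bytes")
```

Calling watch_sync_traffic() on a spare machine on the same segment
should show whether the packets that trigger the problem are the ones
reporting between two and eight connections. I have not tried it
alongside a running backup daemon.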

Firewalling 224.0.0.81 insulates you from the problem (although, of
course, that is unsuitable for a live deployment).

Feeding in only one connection at a time (slowly enough that they each
get their own IPVS packet) doesn't trigger the problem.

This happens with Linux 2.6.35.4, but not 2.6.27.45.


On 13 September 2010 11:40, JL <lvs@xxxxxxxx> wrote:
> On 13 September 2010 03:43, 楷子狐 <higkoo@xxxxxxx> wrote:
>> I have seen this problem before:
>>
>>  http://hi.baidu.com/higkoo/blog/item/f8943c60d16843d28cb10d17.html
>>  ------------------
> Looks like the same thing.
>
> I suspect that the LVS service receives updates from the master, and
> then sticks them in some netfilter table, but with some error that
> makes the table huge. Maybe multiple entries appear?
>
> 楷子狐, are you using MARK firewall rules, or a different method to
> select packets for LVS?
>
> If I change /proc/sys/net/ipv4/vs/sync_threshold to "3 100000", it
> does *not* fix the problem. Which kind of throws any theory I have had
> out the window.
>
> "ipvsadm -l -c" gives a lot of kernel messages "Detected stall on CPU
> x". Eventually, however, we get the list (which is currently only about
> a dozen entries).
>
> It was fine with Linux 2.6.27.45.
>
> # /proc/sys/net/ipv4/vs# grep -H "" *
> am_droprate:10
> amemthresh:1024
> cache_bypass:0
> drop_entry:0
> drop_packet:0
> expire_nodest_conn:0
> expire_quiescent_template:0
> nat_icmp_send:0
> secure_tcp:0
> sync_threshold:3        50
>
> Does anyone have an idea what might be happening here?
>
>>  ------------------ Original ------------------
>>  From:  "JL"<lvs@xxxxxxxx>;
>>  Date:  Sun, Sep 12, 2010 06:29 PM
>>  To:  "LinuxVirtualServer.org users mailing list."<lvs-users@xxxxxxxxxxxxxxxxxxxxxx>;
>>
>>  Subject:  [lvs-users] Kernel 2.6.35 and 100% S.I. CPU Time
>>
>>
>> Hi,
>>
>> I have recently upgraded from kernel 2.6.27.45 to 2.6.35.4.
>>
>> Now, any machine which is a backup (that is, receiving connection
>> updates from another machine) goes to nearly 100% CPU time in Soft
>> Interrupt.
>>
>> Profiling the kernel shows the largest portion of time is spent in 
>> nf_iterate.
>>
>> We are using FWMARK rules to specify traffic for LVS.
>>
>> Is this problem something people are aware of? Does anyone know of a
>> fix or workaround?
>>
>> Thanks,
>> --
>> Jarrod Lowe
>>
>> _______________________________________________
>> Please read the documentation before posting - it's available at:
>> http://www.linuxvirtualserver.org/
>>
>> LinuxVirtualServer.org mailing list - lvs-users@xxxxxxxxxxxxxxxxxxxxxx
>> Send requests to lvs-users-request@xxxxxxxxxxxxxxxxxxxxxx
>> or go to http://lists.graemef.net/mailman/listinfo/lvs-users
>>
>
>
>
> --
> Jarrod Lowe
>



-- 
Jarrod Lowe

