Re: [lvs-users] Re: Kernel 2.6.35 and 100% S.I. CPU Time

To: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: [lvs-users] Re: Kernel 2.6.35 and 100% S.I. CPU Time
From: JL <lvs@xxxxxxxx>
Date: Tue, 14 Sep 2010 10:04:49 +0100
I suspect we may actually be looking at different problems.

It sounds to me like you are actually using all your CPU to
legitimately process the packets. That is, you have more traffic than
one CPU can handle.

The problem I am having happens even with minimal network traffic.
On my live systems it shows 100% SI on 16 cores, even though the
interrupts are already properly distributed across the cores.
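
One rough way to tell the two cases apart is to diff two snapshots of /proc/stat and look at the softirq share per CPU. The Python sketch below is only illustrative: the function names `softirq_share` and `sample` are made up here, and it assumes the usual /proc/stat layout in which the seventh counter on each cpuN row is softirq time.

```python
import time


def softirq_share(before, after):
    """Percent of CPU time spent in softirq per CPU between two /proc/stat texts."""
    def parse(text):
        rows = {}
        for line in text.splitlines():
            fields = line.split()
            # Per-CPU rows look like: cpu0 user nice system idle iowait irq softirq steal ...
            # The aggregate "cpu" row is skipped on purpose.
            if len(fields) >= 8 and fields[0].startswith("cpu") and fields[0] != "cpu":
                rows[fields[0]] = [int(value) for value in fields[1:]]
        return rows

    old_rows, new_rows = parse(before), parse(after)
    shares = {}
    for cpu, new in new_rows.items():
        old = old_rows.get(cpu)
        if old is None:
            continue
        # Only the first eight counters are summed, so guest time
        # (already included in user time) is not counted twice.
        deltas = [n - o for n, o in zip(new[:8], old[:8])]
        total = sum(deltas)
        # Index 6 is the softirq counter.
        shares[cpu] = 100.0 * deltas[6] / total if total else 0.0
    return shares


def sample(path="/proc/stat", interval=1.0):
    """Take two snapshots `interval` seconds apart and return the per-CPU shares."""
    with open(path) as handle:
        before = handle.read()
    time.sleep(interval)
    with open(path) as handle:
        after = handle.read()
    return softirq_share(before, after)
```

If one core sits near 100% while the rest are idle, that looks like a single CPU saturated by real traffic. If every core is high with almost no traffic, it looks more like what I am seeing.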


On 14 September 2010 02:01, 楷子狐 <higkoo@xxxxxxx> wrote:
> I'm sorry, my English is poor.
>
> I want to talk to you all, but I'm so sad.
>
> For the problem of LVS using 100% CPU in si%, the solution I know is to
> use multiple network cards and bond them together.
>
> The Linux kernel does not support balancing si% across multiple CPU
> cores, which is why the problem comes up.
>
> See this:
>
>     ${linux-2.6.34}/Documentation/networking/cxgb.txt
>
> These issues have been identified during testing. The following
> information is provided as a workaround to the problem. In some cases,
> this problem is inherent to Linux or to a particular Linux Distribution
> and/or hardware platform.
>
> 1. Large number of TCP retransmits on a multiprocessor (SMP) system.
>
>     On a system with multiple CPUs, the interrupt (IRQ) for the network
>     controller may be bound to more than one CPU. This will cause TCP
>     retransmits if the packet data were to be split across different CPUs
>     and re-assembled in a different order than expected.
>
>     To eliminate the TCP retransmits, set smp_affinity on the particular
>     interrupt to a single CPU. You can locate the interrupt (IRQ) used on
>     the N110/N210 by using ifconfig:
>
>         ifconfig <dev_name> | grep Interrupt
>
>     Set the smp_affinity to a single CPU:
>
>         echo 1 > /proc/irq/<interrupt_number>/smp_affinity
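
A side note on the value written above: smp_affinity takes a hexadecimal CPU bitmask in which bit N stands for CPU N, so `echo 1` pins the IRQ to CPU 0. Here is a small Python sketch of the conversion in both directions. The helper names are hypothetical, and the writer only handles CPUs 0 to 31 (wider masks are written as comma-separated 32-bit groups, which this does not produce).

```python
def affinity_mask(cpus):
    """Build the hex string to write to /proc/irq/<n>/smp_affinity for CPUs 0..31."""
    mask = 0
    for cpu in cpus:
        if not 0 <= cpu < 32:
            raise ValueError("this sketch only handles CPUs 0..31")
        mask |= 1 << cpu  # bit N selects CPU N
    return format(mask, "x")


def cpus_in_mask(mask_text):
    """List the CPU numbers set in a mask as read back from smp_affinity."""
    # Masks read from the kernel may be split into comma-separated groups,
    # e.g. "00000000,0000ffff"; dropping the commas gives one hex number.
    value = int(mask_text.strip().replace(",", ""), 16)
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]
```

For example, `affinity_mask([0])` gives the `1` used in the quoted text, and `cpus_in_mask("ffff")` lists CPUs 0 through 15, which is what an IRQ spread over 16 cores would show.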
>
>     It is highly suggested that you do not run the irqbalance daemon on
>     your system, as this will change any smp_affinity setting you have
>     applied. The irqbalance daemon runs on a 10 second interval and binds
>     interrupts to the least loaded CPU determined by the daemon. To
>     disable this daemon:
>
>         chkconfig --level 2345 irqbalance off
>
>     By default, some Linux distributions enable the kernel feature,
>     irqbalance, which performs the same function as the daemon. To
>     disable this feature, add the following line to your bootloader:
>
>         noirqbalance
>
>     Example using the Grub bootloader:
>
>         title Red Hat Enterprise Linux AS (2.4.21-27.ELsmp)
>         root (hd0,0)
>         kernel /vmlinuz-2.4.21-27.ELsmp ro root=/dev/hda3 noirqbalance
>         initrd /initrd-2.4.21-27.ELsmp.img
>
>
> I wrote about this at
> http://hi.baidu.com/higkoo/blog/item/c9b561e943744932b90e2d4c.html
>
> I think bonding multiple network cards, or changing your OS, is the only
> solution.
>
> Thanks, all.
>
> ------------------
> Fortunately there is a time machine. Thank you!
>
> ------------------ Original ------------------
> From: "JL"<lvs@xxxxxxxx>;
> Date: Mon, Sep 13, 2010 06:40 PM
> To: "LinuxVirtualServer.org users mailing 
> list."<lvs-users@xxxxxxxxxxxxxxxxxxxxxx>; "楷子狐"<higkoo@xxxxxxx>;
>
> Subject: Re: [lvs-users] Kernel 2.6.35 and 100% S.I. CPU Time
>
>
> On 13 September 2010 03:43, 楷子狐 <higkoo@xxxxxxx> wrote:
>> I have seen this problem before:
>>
>> http://hi.baidu.com/higkoo/blog/item/f8943c60d16843d28cb10d17.html
>> ------------------
> Looks like the same thing.
>
> I suspect that the LVS service receives updates from the master, and
> then sticks them in some netfilter table, but with some error that
> makes the table huge. Maybe multiple entries appear?
>
> 楷子狐, Are you using MARK firewall rules, or a different method to
> select packets for LVS?
>
> If I change /proc/sys/net/ipv4/vs/sync_threshold to "3 100000", it
> does *not* fix the problem. Which kind of throws any theory I have had
> out the window.
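
For context on what that change was meant to do, here is a rough Python sketch of the sync_threshold rule as I understand it from the kernel's ipvs-sysctl.txt. The two numbers are a threshold and a period, and a connection is synced to the backup whenever its incoming packet count modulo the period equals the threshold. The helper names are made up, and the real kernel code also looks at protocol state, so treat this as an approximation only.

```python
def parse_sync_threshold(text):
    """Split the sysctl value, e.g. "3        50", into (threshold, period)."""
    threshold, period = (int(value) for value in text.split())
    return threshold, period


def packets_that_sync(threshold, period, total_packets):
    """Packet counts (1-based) at which a connection would be synced.

    Assumption: a sync message goes out when count % period == threshold.
    """
    return [
        count
        for count in range(1, total_packets + 1)
        if count % period == threshold
    ]
```

With the default "3 50", a 200-packet connection would sync at packets 3, 53, 103 and 153. With "3 100000" it would sync only at packet 3, so the backup should see far fewer updates, which is why it is surprising that the change made no difference.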
>
> "ipvsadm -l -c" produces a lot of kernel messages saying "Detected stall
> on CPU x". Eventually, however, we get the list (which is currently only
> about a dozen entries).
>
> It was fine with Linux 2.6.27.45.
>
> # /proc/sys/net/ipv4/vs# grep -H "" *
> am_droprate:10
> amemthresh:1024
> cache_bypass:0
> drop_entry:0
> drop_packet:0
> expire_nodest_conn:0
> expire_quiescent_template:0
> nat_icmp_send:0
> secure_tcp:0
> sync_threshold:3        50
>
> Does anyone have an idea what might be happening here?
>
>> ------------------ Original ------------------
>> From: "JL"<lvs@xxxxxxxx>;
>> Date: Sun, Sep 12, 2010 06:29 PM
>> To: "LinuxVirtualServer.org users mailing 
>> list."<lvs-users@xxxxxxxxxxxxxxxxxxxxxx>;
>>
>> Subject: [lvs-users] Kernel 2.6.35 and 100% S.I. CPU Time
>>
>>
>> Hi,
>>
>> I have recently upgraded from kernel 2.6.27.45 to 2.6.35.4.
>>
>> Now, any machine which is a backup (that is, receiving connection
>> updates from another machine) goes to nearly 100% CPU time in Soft
>> Interrupt.
>>
>> Profiling the kernel shows the largest portion of time is spent in 
>> nf_iterate.
>>
>> We are using FWMARK rules to specify traffic for LVS.
>>
>> Is this problem something people are aware of? Does anyone know of a
>> fix or workaround?
>>
>> Thanks,
>> --
>> Jarrod Lowe
>>
>> _______________________________________________
>> Please read the documentation before posting - it's available at:
>> http://www.linuxvirtualserver.org/
>>
>> LinuxVirtualServer.org mailing list - lvs-users@xxxxxxxxxxxxxxxxxxxxxx
>> Send requests to lvs-users-request@xxxxxxxxxxxxxxxxxxxxxx
>> or go to http://lists.graemef.net/mailman/listinfo/lvs-users
>>
>
>
>
> --
> Jarrod Lowe
>



-- 
Jarrod Lowe
