Re: [lvs-users] Sync daemon and many concurrent connections

To: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: [lvs-users] Sync daemon and many concurrent connections
From: Sashi Kant <sashi.kant@xxxxxxxxxxxxx>
Date: Thu, 8 Oct 2009 09:23:40 -0700

Hello Siim,

We saw a similar packets-per-second rate in our environment and had
issues with dropped packets during floods of incoming requests. We also
have a very similar traffic pattern (lots of small packets), and our
client connections are not long-running.

We started seeing issues with Broadcom cards at ~40k PPS; swapping them
for Intel cards helped us reach ~80k PPS.

Ultimately we had to replace the Intel (e1000) cards with newer Intel
cards that have MSI-X capability. These use the igb driver, which we
had to compile ourselves for Debian etch and lenny systems.
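
If it helps, the out-of-tree build is roughly the following (the
tarball version here is just an example; grab whatever Intel currently
ships):

  # build and load Intel's out-of-tree igb driver against the running kernel
  tar xzf igb-1.3.19.3.tar.gz
  cd igb-1.3.19.3/src
  make install
  modprobe igb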

During our tests we could reach up to 600k PPS using these cards
without interrupt handling hogging a single core.

Hope this helps

-Sashi

On Oct 8, 2009, at 8:31 AM, Siim Põder wrote:

> Hi
>
> We are planning to use LVS for a setup with a lot of (millions of)
> concurrent (mostly idle) connections and were setting up the sync
> daemon to avoid a reconnect flood when the master fails.
>
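> For reference, the daemons are started with something along these
> lines (the interface and syncid are whatever fits your setup):
>
>   # on the master director
>   ipvsadm --start-daemon master --mcast-interface eth0 --syncid 1
>   # on the backup director
>   ipvsadm --start-daemon backup --mcast-interface eth0 --syncid 1
>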
> Originally I was planning to ask for help, but it turned out to be one
> of those cases where you go over the problem description and refine the
> details until the problem ceases to exist. So instead I'll post the
> results and the tuning we needed to get it working.
>
> Short summary: the sync daemon works very well with a high connection
> rate if you increase the rmem_default and wmem_default sysctls.
>
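> The kind of settings we ended up with look like this (the numbers are
> illustrative; tune them to your sync traffic volume):
>
>   # larger default/max socket buffers for the sync daemon's UDP socket
>   sysctl -w net.core.wmem_default=4194304
>   sysctl -w net.core.wmem_max=4194304
>   sysctl -w net.core.rmem_default=16777216
>   sysctl -w net.core.rmem_max=16777216
>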
> Initially there was a problem with the sync_master daemon sending
> updates. As it only sent updates once per second, the send buffer of
> the socket got full and we got ip_vs_sync_send_async errors in the
> kernel log. We decreased the sleep time to 100ms, which gave slightly
> better results, but net.core.wmem_max and net.core.wmem_default also
> needed increasing (which probably means we could have left the kernel
> unchanged).
>
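> For the curious, the sleep-time change is a one-liner in the kernel's
> sync code; in our tree (an assumption, path and context vary by kernel
> version) it lives in net/ipv4/ipvs/ip_vs_sync.c:
>
>   /* in sync_master_loop() */
>   msleep_interruptible(100);   /* was msleep_interruptible(1000) */
>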
> After that we had problems on the sync_backup daemon side, whose
> receive buffer now got full from time to time and resulted in lost
> sync packets (visible through UDP receive errors). So we also
> increased the rmem sysctls quite a bit, which solved that problem as
> well.
>
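> The lost packets show up in the UDP counters, e.g.:
>
>   netstat -su    # watch "packet receive errors" under the Udp: section
>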
> Another consideration for mostly idle connections seems to be choosing
> appropriate sync_threshold and TCP timeout (ipvsadm -L --timeout)
> values. Our current plan is to increase the TCP timeout to 30 minutes
> (1800) and reduce sync_threshold to (3 10), so that the connections
> stay current on the backup even with relatively infrequent keepalives
> being sent.
>
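> Concretely, something like this (the tcpfin and udp timeouts below are
> illustrative, not something we tuned):
>
>   # tcp / tcpfin / udp timeouts, in seconds
>   ipvsadm --set 1800 120 300
>   # sync a connection after 3 packets, then refresh every 10 packets
>   sysctl -w net.ipv4.vs.sync_threshold="3 10"
>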
> Hardware for testing was a few 2x quad-core Opterons with 16GB of
> memory, dual e1000 and onboard dual bnx network cards, sync_threshold
> = 0 1 (sync on every packet, for testing), using LVS-NAT. Set up and
> run by a very diligent coworker :)
>
> Some results:
> 8.5 million connections, all synced
> ~100K packets/s of keepalives on the external interface
> 900 packets/s of sync daemon traffic
> just over 100Mbps of traffic (short packets)
>
> On the primary LVS: ~1% of one core for the sync_master daemon, one
> core 10-40% in softirq (ipvs?), ~1.7GB of memory used in total.
> On the secondary LVS: ~10% of one core for the sync_backup daemon, one
> core 20% in softirq (ipvs?), ~1.7GB of memory used in total.
>
> Failover with keepalived worked as expected once all connections were
> established.
>
> The likely limiting factor seems to be the one core at 40% in softirq.
> This was also the core that serviced the bnx network card, so it's
> possible that switching entirely to e1000 would alleviate the problem
> (the core responsible for e1000 was only ~10% in softirq). Also, time
> spent in softirq was not really consistent and sometimes dropped quite
> low (maybe an altogether different problem).
>
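> Checking which core services which NIC, and moving an IRQ if needed,
> is straightforward (the IRQ number NN and the CPU mask below are
> placeholders):
>
>   cat /proc/interrupts          # find the NIC's IRQ and the busy CPU column
>   mpstat -P ALL 1               # per-CPU softirq load in the %soft column
>   echo 2 > /proc/irq/NN/smp_affinity   # pin IRQ NN to CPU1 (mask 0x2)
>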
> Interrupt load was low (8K/s in total) with both the e1000 and bnx
> cards in use, although we still superstitiously suspect Broadcom is
> not quite as scalable as Intel.
>
> Siim
>


_______________________________________________
Please read the documentation before posting - it's available at:
http://www.linuxvirtualserver.org/

LinuxVirtualServer.org mailing list - lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Send requests to lvs-users-request@xxxxxxxxxxxxxxxxxxxxxx
or go to http://lists.graemef.net/mailman/listinfo/lvs-users
