Re: DR Bottlenecks?

To: Jeffrey A Schoolcraft <dream@xxxxxxxxxxxxxx>
Subject: Re: DR Bottlenecks?
Cc: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
From: Julian Anastasov <ja@xxxxxx>
Date: Wed, 7 Feb 2001 14:32:03 +0200 (EET)
        Hello,

On Wed, 7 Feb 2001, Jeffrey A Schoolcraft wrote:

> I'm curious if there are any known DR LVS bottlenecks?  My company had the
> opportunity to put LVS to the test the day following the superbowl when we
> delivered 12TB of data in 1 day, and peaked at about 750Mbps.
>
> In doing this we had a couple of problems with LVS (I think they were with
> LVS).  I was using the latest lvs for 2.2.18, and ldirectord to keep the
> machines in and out of LVS.  The LVS servers were running redhat with an
> EEPro100.  I had two clusters, web and video.  The web cluster was a couple of
> 1U's with an acenic gig card, running 2.4.0, thttpd, with a somewhat
> performance tuned system (parts of the C10K).  At peak our LVS got slammed
> with 40K active connections (so said ipvsadm).  When we reached this number,

        Can you tell us how many packets/sec are received on the LVS
director? I assume you are talking about 750Mbps leaving the real servers.
What does 'free' show on the director? Do you have 128MB+ of RAM? I'm
trying to understand whether your director receives many packets or
creates many connections (possibly with low incoming traffic).
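
        A rough way to see both on the director (just a sketch; the
interface name eth0 is an assumption):

# sample the interface counters twice, 10 seconds apart; the difference
# in the RX packets column divided by 10 gives packets/sec
grep 'eth0:' /proc/net/dev; sleep 10; grep 'eth0:' /proc/net/dev
# free memory on the director
free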

> or sometime before, LVS became inaccessible.  I could however pull content
> directly from a server, just not through the LVS.  LVS was running on a single
> proc p3, and load never went much above 3% the entire time, I could execute
> tasks on the LVS but http requests weren't getting passed along.

        Hm, did you stop all ARP advertisements on the real servers (the
well-known "ARP problem")?
        Also note that the "active connections" ipvsadm reports are not
the total. How many inactive connections do you see? The total number is
here:

cat /proc/sys/net/ipv4/ip_always_defrag

        If active + inactive = 40K, this can mean 2,000-3,000 packets/sec
of incoming web traffic.
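
        Something like this on the director shows the per-real-server
counts and a total (the awk column numbers are from memory of the
'ipvsadm -L -n' output, so double-check them against your version):

ipvsadm -L -n
ipvsadm -L -n | awk '/->/ { a += $5; i += $6 } END { print a, i, a+i }'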

> A similar thing occurred with our video LVS.  While our real servers aren't
> quite capable of handling the C10K, we did about 1500 apiece and maxed out at
> about 150Mbps per machine. I think this is primarily the modem users' fault.  I
> think we would have pushed more bandwidth to a smaller number of high
> bandwidth users (of course).
>
> I know this volume of traffic choked LVS.  What I'm wondering is, if there is

        What is the volume? I assume the output traffic (from the real
servers) is much greater than the input traffic received by the director
and forwarded to the real servers.

> anything I could do to prevent this.  Until we got hit with too many
> connections (mostly modems I imagine) LVS performed superbly.  I wonder if we

        Maybe more free memory is needed in the director, or the
connection table's hash size is too small?
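
        On 2.2 the hash table size is a compile-time setting; if I
remember correctly it is the "table size (the Nth power of 2)" option in
the kernel config, something like this before rebuilding (option name
and default from memory, so check your patched tree):

# .config on the 2.2 director: 16 bits = 65536 hash buckets instead of
# the default 12 bits (4096)
CONFIG_IP_MASQ_VS_TAB_BITS=16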

> could have better performance with a gig card, or some other algorithm (I
> started with wlc, but quickly changed to wrr because all the rr calculations
> should be done initially and never need to be done again unless we change
> weights; I thought this would save us).
>
> Another problem I had was with ldirectord and the test (negotiate, connect).
> It seemed like I needed some type of test to put the servers in initially,
> then too many connections happened so I wanted no test (off), but the servers
> would still drop out from ldirectord.  That's a snowball type problem for my
> amount of traffic, one server gets bumped because it's got too many
> connections, and then the other servers get overloaded, they'll get dropped
> too, then I'll have an LVS directing to localhost.

        Yes, bad load balancing.
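
        One thing that might help instead of turning the checks off is to
make ldirectord more tolerant: longer check timeouts, and quiescent
(weight 0) removal rather than dropping the real servers. A rough sketch
of an ldirectord.cf fragment (directive names and addresses are from
memory / made up here, so verify them against your ldirectord version):

# /etc/ha.d/ldirectord.cf (fragment)
checktimeout = 30       # give loaded real servers more time to answer
checkinterval = 15
quiescent = yes         # set weight to 0 instead of removing the server

virtual = 10.0.0.100:80         # example VIP, not from the original mail
        real = 10.0.0.1:80 gate
        real = 10.0.0.2:80 gate
        service = http
        request = "check.html"
        receive = "OK"
        scheduler = wrr
        protocol = tcp
        checktype = negotiate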

> So, if anyone has pushed DR LVS to the limits and has ideas to share on how to
> maximize its potential for given hardware, please let me know.
>
> Jeffrey Schoolcraft


Regards

--
Julian Anastasov <ja@xxxxxx>


