To: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: RE: LVS performance bug
From: Graeme Fowler <graeme@xxxxxxxxxxx>
Date: Wed, 14 Mar 2007 15:29:48 +0000
On Wed, 2007-03-14 at 07:41 -0500, Rudd, Michael wrote:
> Ran it again and here's what I see. We currently have 8 Gigs of memory
> installed. It doesn't appear from the "free" command that we ran
> completely out of memory. Here's what "free" and "slabinfo" say before
> the test. This is sitting idle. 
<snip>
> Then we ran the test again and cranked up the traffic. It took a few
> minutes and then it happened again. 
> IPVS: ip_vs_conn_new: no memory available.
> IPVS: ip_vs_conn_new: no memory available.
> IPVS: ip_vs_conn_new: no memory available.
> IPVS: ip_vs_conn_new: no memory available.
> IPVS: ip_vs_conn_new: no memory available.
> IPVS: ip_vs_conn_new: no memory available.
> IPVS: ip_vs_conn_new: no memory available.
> IPVS: ip_vs_conn_new: no memory available.
> IPVS: ip_vs_conn_new: no memory available.
<snip>
> So the only thing I see climbing in memory use is the buffers/cache
> figure. But in slabinfo the ip_vs_conn active object count grows
> fast. I watched it grow during the test from 39K objects to over 2
> million. Maybe something isn't being reset or returned to the pool.
> We are running the OPS patch (one-packet scheduling) because we are
> using LVS for the UDP service DNS. I'm sure it treats connections
> differently from the regular hashed-connection handling.

Aside from OPS, is this a relatively stock kernel (or a
distro-supplied one), i.e. not custom-compiled by you? I'm going to
have a pitch at something slightly outside my normal range here... I'm
wondering whether your conns/sec multiplied by the connection lifetime
in seconds is greater than the default connection table size. The
error you see comes from ip_vs_conn.c:

  cp = kmem_cache_alloc(ip_vs_conn_cachep, GFP_ATOMIC);
  if (cp == NULL) {
          IP_VS_ERR_RL("ip_vs_conn_new: no memory available.\n");
          return NULL;
  }

In turn, ip_vs_conn_cachep is created inside ip_vs_conn_init:

  /*
   * Allocate the connection hash table and initialize its list heads
   */
  ip_vs_conn_tab = vmalloc(IP_VS_CONN_TAB_SIZE * sizeof(struct list_head));
  if (!ip_vs_conn_tab)
          return -ENOMEM;

  /* Allocate ip_vs_conn slab cache */
  ip_vs_conn_cachep = kmem_cache_create("ip_vs_conn",
                                        sizeof(struct ip_vs_conn), 0,
                                        SLAB_HWCACHE_ALIGN, NULL, NULL);
  if (!ip_vs_conn_cachep) {
          vfree(ip_vs_conn_tab);
          return -ENOMEM;
  }

The ip_vs_conn_tab is therefore sized according to
IP_VS_CONN_TAB_SIZE, which is 2^IP_VS_CONN_TAB_BITS; the bits value is
fixed at compile time and defaults to 12, giving a table of 4096
(2^12) buckets. If you hit your server with a *very* high connection
rate (as in busy DNS) then you're going to exhaust your connection
table in no time, especially if the DNS servers take a little longer
to respond when loaded. In that case you get a variant of the
"thundering herd" problem: when responses start to take longer, you
accumulate more outstanding requests.
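
To put rough numbers on that (purely illustrative assumptions, not
taken from your report; 300s is the stock IPVS UDP timeout, and the
query rate is invented):

  /* Back-of-the-envelope sketch: steady-state entries = rate * lifetime.
   * All figures below are assumptions for illustration only. */
  #include <stdio.h>

  int main(void)
  {
          const long rate     = 10000;    /* assumed DNS queries/sec */
          const long lifetime = 300;      /* default IPVS UDP timeout (sec) */
          const long buckets  = 1L << 12; /* default IP_VS_CONN_TAB_BITS = 12 */

          long entries = rate * lifetime;
          printf("steady-state entries: %ld\n", entries);           /* 3000000 */
          printf("hash buckets:         %ld\n", buckets);           /* 4096 */
          printf("average chain length: %ld\n", entries / buckets); /* ~732 */
          return 0;
  }

That's the same order of magnitude as the 2 million ip_vs_conn
objects you watched accumulate.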

I'd try recompiling with IP_VS_TAB_BITS set to something higher.
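
For example (assuming your kernel tree has the stock Kconfig option;
the menu wording may vary by version), in the kernel .config:

  # "IPVS connection table size (the Nth power of 2)", found under
  # Networking -> IP Virtual Server Configuration
  CONFIG_IP_VS_TAB_BITS=20

which would give you a 2^20 = 1048576-bucket table instead of 4096.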

I'd also try not using OPS, to see whether or not that in itself is the
problem *or* if the more straightforward schedulers exhaust the
connection table before OPS does.
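
Either way, it may be worth watching the slab cache live during the
test so you can see exactly when it starts to run away, e.g. with
standard tools (nothing LVS-specific):

  watch -n1 'grep ip_vs_conn /proc/slabinfo'

and comparing that against the ActiveConn/InActConn columns from
"ipvsadm -L -n".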

Graeme

