Re: [lvs-users] "IPVS: ip_vs_send_async error" flood

To: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: [lvs-users] "IPVS: ip_vs_send_async error" flood
From: Seventh Sven <svensven@xxxxxxxxx>
Date: Wed, 12 May 2010 13:47:16 +0200
On 2010-05-12 10:53, Siim Põder wrote:
> Tomasz Chmielewski wrote:
>> I have "IPVS: ip_vs_send_async error" literally flooding in dmesg.

Jumping in with the same problem: I see the same error on an
ipvs sync master in a high-volume LVS-DR setup.

> Can you check with tcpdump if there are connection sync packets
> being sent on the wire and how often/how long they are (host
> 224.0.0.81 and udp port 8848)?

There are around 100 sync packets sent on every update (once per second),
and they are all 1420 bytes (except the last one, which varies). Each
burst is sent within a sub-millisecond window.
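
To put that burst in perspective, here is a rough Python sketch of the
arithmetic. It assumes the 1420 bytes is the on-wire size, ignores
Ethernet/IP/UDP framing overhead, and uses the 1 Gbps link speed given
further down in this message. The function names are made up for
illustration.

```python
# Back-of-the-envelope burst arithmetic. Inputs come from the figures
# quoted in this message; framing overhead is ignored (assumption).
PACKETS_PER_BURST = 100               # "around 100 sync packets" per update
PACKET_BYTES = 1420                   # size seen in tcpdump
LINK_BITS_PER_SECOND = 1_000_000_000  # 1 Gbps link


def burst_bytes(packets: int, packet_bytes: int) -> int:
    """Total bytes handed to the socket in one sync burst."""
    return packets * packet_bytes


def drain_time_ms(total_bytes: int, link_bps: int) -> float:
    """Time in milliseconds needed to put total_bytes on the wire at line rate."""
    return total_bytes * 8 / link_bps * 1000.0


total = burst_bytes(PACKETS_PER_BURST, PACKET_BYTES)
print(f"burst size: {total} bytes")
print(f"wire time at line rate: {drain_time_ms(total, LINK_BITS_PER_SECOND):.3f} ms")
```

Under those assumptions a burst is about 142000 bytes and needs a bit
over 1 ms of wire time, so if it is queued in less than a millisecond
part of it has to sit in a buffer somewhere. That does not prove which
buffer overflows, but it suggests where to look.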

> Run netstat -npua (or similar) to see if there are listening sockets
> on 224.0.0.81:8848 or any sockets that connect to 224.0.0.81:8848?

There is:

   10.30.174.3:52531       224.0.0.81:8848
   (Recv-Q and Send-Q are always listed as 0)

In addition:

* There are 5-7 "ip_vs_send_async error" messages in kern.log every
   second.

* The SndbufErrors netstat counter (UDP_MIB_SNDBUFERRORS kernel SNMP
   counter) increases at a similar rate to the ip_vs_send_async error
   messages, i.e.

   netstat -s|grep SndbufErrors
   SndbufErrors: 128766

   ...increases by 5-7 every second.

* The kernel UDP_MIB_SNDBUFERRORS counter is incremented in two places
   in the kernel code:

   net/ipv4/udp.c:udp_push_pending_frames():
   if (err == -ENOBUFS && !inet->recverr) {
     UDP_INC_STATS_USER(sock_net(sk),
         UDP_MIB_SNDBUFERRORS, is_udplite);
     err = 0;
   }

   net/ipv4/udp.c:udp_sendmsg()
   if (err == -ENOBUFS || test_bit(SOCK_NOSPACE, &sk->sk_socket->flags)) {
     UDP_INC_STATS_USER(sock_net(sk),
         UDP_MIB_SNDBUFERRORS, is_udplite);
   }

* "ENOBUFS = no kernel mem, SOCK_NOSPACE = no sndbuf space." These
   could be related to various kernel limits, including:

     net.core.rmem_max
     net.core.wmem_max
     net.core.netdev_max_backlog
     net.core.somaxconn
     net.ipv4.udp_mem

* The sysctl values have been increased tenfold, but without any change
   in the symptom; the kernel log still shows the same rate of errors.

* The network interface queue length was increased tenfold (ip link
   set txqueuelen 10000 dev eth0) without any effect.

* The LVS master has 582940 (=195 MiB) ip_vs_conn objects allocated
   (according to slabtop(1)) and around 100 Mbps of traffic in each
   direction on a 1 Gbps network link. Packets are received and
   forwarded on the same interface.

* cat /proc/net/udp shows the multicast connection entry properly,
   though it is listed with drops=0.

* The netstat -s metric "outgoing packets dropped" (SNMP name
   "OutDiscards", in-kernel name IPSTATS_MIB_OUTDISCARDS) is also
   increasing at a rate similar to SndbufErrors. This is a generic error
   counter, so it's tricky to determine what causes it in the code.
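
For comparing the two counters side by side, something like the
following Python sketch could sample them once per second. It reads
/proc/net/snmp, which exposes both Udp SndbufErrors and Ip OutDiscards
as pairs of header and value lines. The helper names (parse_snmp,
read_counters, deltas, watch) are made up for illustration.

```python
import time


def parse_snmp(text: str) -> dict:
    """Parse /proc/net/snmp-style text into {"Udp": {"SndbufErrors": 1, ...}, ...}."""
    tables = {}
    rows = [line.split() for line in text.splitlines() if line.strip()]
    # Rows come in pairs: a row of field names, then a row of values,
    # both starting with the same "Proto:" tag.
    for names, values in zip(rows[0::2], rows[1::2]):
        if names[0] != values[0]:
            continue  # unexpected layout; skip rather than guess
        tables[names[0].rstrip(":")] = {
            name: int(value) for name, value in zip(names[1:], values[1:])
        }
    return tables


def read_counters(path: str = "/proc/net/snmp") -> dict:
    """Return the two counters discussed in this message."""
    with open(path) as handle:
        snmp = parse_snmp(handle.read())
    return {
        "SndbufErrors": snmp["Udp"]["SndbufErrors"],
        "OutDiscards": snmp["Ip"]["OutDiscards"],
    }


def deltas(before: dict, after: dict) -> dict:
    """Per-counter increase between two samples."""
    return {name: after[name] - before[name] for name in before}


def watch(samples: int = 10, interval: float = 1.0) -> None:
    """Print the per-interval increase of both counters."""
    previous = read_counters()
    for _ in range(samples):
        time.sleep(interval)
        current = read_counters()
        print(deltas(previous, current))
        previous = current
```

Calling watch() on the sync master prints one line per second. If the
two deltas track each other closely, that would suggest (but not prove)
that both counters are bumped by the same failed sends.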

I'll try to tune various parameters some more to figure out what's
the bottleneck.  Hints are welcome.
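
To keep track of what was changed between tuning attempts, a small
Python sketch like this could snapshot the sysctls listed above before
and after each change. It assumes the usual mapping of sysctl names to
files under /proc/sys (dots become slashes); the helper names are made
up for illustration.

```python
# Sysctl names taken from the list earlier in this message.
SYSCTLS = [
    "net.core.rmem_max",
    "net.core.wmem_max",
    "net.core.netdev_max_backlog",
    "net.core.somaxconn",
    "net.ipv4.udp_mem",
]


def sysctl_path(name: str, root: str = "/proc/sys") -> str:
    """Map a dotted sysctl name to its file under /proc/sys."""
    return root + "/" + name.replace(".", "/")


def read_sysctl(name: str, root: str = "/proc/sys"):
    """Return the raw value as a string, or None if it cannot be read."""
    try:
        with open(sysctl_path(name, root)) as handle:
            # Some values (e.g. net.ipv4.udp_mem) hold several numbers,
            # so the whitespace is normalized and the string kept as-is.
            return " ".join(handle.read().split())
    except OSError:
        return None


def snapshot(names=SYSCTLS) -> dict:
    """Read all listed sysctls into a dict."""
    return {name: read_sysctl(name) for name in names}


def changed(before: dict, after: dict) -> dict:
    """Return {name: (old, new)} for every value that differs."""
    return {
        name: (before[name], after[name])
        for name in before
        if before[name] != after.get(name)
    }
```

Taking one snapshot() before tuning and one after, then printing
changed(before, after), gives a record of which limits were actually
raised when the error rate is compared again.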

s.
