On 2010-05-12 10:53, Siim Põder wrote:
> Tomasz Chmielewski wrote:
>> I have "IPVS: ip_vs_send_async error" literally flooding in dmesg.
Jumping in with the same problem: I see the same error on an
IPVS sync master in a high-volume LVS-DR setup.
> Can you check with tcpdump if there are connection sync packets
> being sent on the wire and how often/how long they are (host
> 224.0.0.81 and udp port 8848)?
Around 100 sync packets are sent on every update (once per second).
All are 1420 bytes except the last one, which varies. Each burst is
sent within a sub-millisecond window.
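
As a rough sanity check on these numbers, the burst can be put into
perspective with some arithmetic. This is only a sketch under stated
assumptions: exactly 100 packets of 1420 bytes, a window of 1000
microseconds (an upper bound for "sub-millisecond"), and the 1 Gbps
link mentioned in this report. The function names are illustrative.

```python
# Back-of-the-envelope estimate for one sync burst.
# Assumptions (not measured precisely): 100 packets, 1420 bytes each,
# all handed to the stack within 1000 microseconds, 1 Gbps link.
PACKETS_PER_BURST = 100
PACKET_BYTES = 1420
BURST_WINDOW_US = 1000
LINK_BPS = 1_000_000_000


def burst_bytes(packets, packet_bytes):
    """Total payload bytes in one burst."""
    return packets * packet_bytes


def burst_rate_bps(packets, packet_bytes, window_us):
    """Instantaneous bit rate if the whole burst is sent within window_us."""
    bits = burst_bytes(packets, packet_bytes) * 8
    return bits * 1_000_000 // window_us


total = burst_bytes(PACKETS_PER_BURST, PACKET_BYTES)
rate = burst_rate_bps(PACKETS_PER_BURST, PACKET_BYTES, BURST_WINDOW_US)
print(f"burst size: {total} bytes")
print(f"burst rate: {rate / 1e9:.3f} Gbps (link: {LINK_BPS / 1e9:.0f} Gbps)")
# The kernel charges more than the payload size per packet against the
# socket send buffer, so the byte total here is a lower bound.
```

Under these assumptions the burst is about 142 kB and its instantaneous
rate is above what a 1 Gbps link can carry, even though the average
sync rate is only about 1.1 Mbps. That would mean the packets have to
queue somewhere before reaching the wire, but this is an inference from
the arithmetic, not something verified here.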
> Run netstat -npua (or similar) to see if there are listening sockets
> on 224.0.0.81:8848 or any sockets that connect to 224.0.0.81:8848?
There is:
10.30.174.3:52531 224.0.0.81:8848
(Recv-Q and Send-Q are always listed as 0)
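
To find the same socket in /proc/net/udp, where addresses are printed
in hexadecimal, a small decoder is handy. This is a minimal sketch that
assumes an IPv4 entry on a little-endian host (e.g. x86), where the
address appears as a byte-swapped 32-bit hex value; the function name
is made up for illustration.

```python
import socket
import struct


def decode_proc_net_addr(field):
    """Decode an 'ADDR:PORT' field from /proc/net/udp (IPv4 only).

    Assumes a little-endian host: the address is stored in network
    byte order but printed as a native 32-bit integer, so it shows up
    byte-swapped. The port is printed as plain hex.
    """
    addr_hex, port_hex = field.split(":")
    packed = struct.pack("<I", int(addr_hex, 16))
    return socket.inet_ntoa(packed), int(port_hex, 16)


# Remote end of the sync socket as it would appear on such a host.
print(decode_proc_net_addr("510000E0:2290"))
```

With this, the rem_address column can be searched for 510000E0:2290 to
pick out the sync socket and read its drops column.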
In addition:
* There are 5-7 "ip_vs_send_async error" messages in kern.log every
second.
* The SndbufErrors netstat counter (UDP_MIB_SNDBUFERRORS kernel SNMP
counter) increases at the same rate as the ip_vs_send_async error
messages, i.e.
    netstat -s|grep SndbufErrors
        SndbufErrors: 128766
...increases by 5-7 every second.
* The kernel UDP_MIB_SNDBUFERRORS counter is incremented in two places
in the kernel code:
net/ipv4/udp.c:udp_push_pending_frames():
    if (err == -ENOBUFS && !inet->recverr) {
        UDP_INC_STATS_USER(sock_net(sk),
                           UDP_MIB_SNDBUFERRORS, is_udplite);
        err = 0;
    }
net/ipv4/udp.c:udp_sendmsg():
    if (err == -ENOBUFS || test_bit(SOCK_NOSPACE, &sk->sk_socket->flags)) {
        UDP_INC_STATS_USER(sock_net(sk),
                           UDP_MIB_SNDBUFERRORS, is_udplite);
    }
* "ENOBUFS = no kernel mem, SOCK_NOSPACE = no sndbuf space." These
could be related to various kernel limits, including:
    net.core.rmem_max
    net.core.wmem_max
    net.core.netdev_max_backlog
    net.core.somaxconn
    net.ipv4.udp_mem
* These sysctl values have been raised tenfold, but without any change
in the symptom: the kernel log still shows the same rate of errors.
* The network interface queue length was increased tenfold (ip link
set txqueuelen 10000 dev eth0) without any effect.
* The LVS master has 582940 (=195 MiB) ip_vs_conn objects allocated
(per slabtop(1)) and carries around 100 Mbps of traffic in each
direction over a 1 Gbps network link. Packets are received and
forwarded on the same interface.
* cat /proc/net/udp shows the multicast connection entry properly,
though it is listed with drops=0.
* The netstat -s metric "outgoing packets dropped" (SNMP name
"OutDiscards", in-kernel name IPSTATS_MIB_OUTDISCARDS) is also
increasing at a rate similar to SndbufErrors. This is a generic error
counter, so it's tricky to determine what causes it in the code.
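
To compare the two counters directly rather than eyeballing netstat -s,
they can be sampled from /proc/net/snmp, which is where netstat gets
them. A minimal sketch, assuming the usual layout of that file (a
header line and a value line per protocol, both with the same prefix);
the helper names and the one-second default interval are my choices.

```python
import time

# Counters of interest: (protocol prefix, counter name) in /proc/net/snmp.
WANTED = [("Udp", "SndbufErrors"), ("Ip", "OutDiscards")]


def parse_snmp(text):
    """Parse /proc/net/snmp-style text into {proto: {counter: value}}."""
    counters = {}
    lines = text.strip().splitlines()
    # Header and value lines come in pairs sharing the same "Proto:" prefix.
    for header, values in zip(lines[0::2], lines[1::2]):
        proto, names = header.split(":", 1)
        _, nums = values.split(":", 1)
        counters[proto] = dict(
            zip(names.split(), (int(n) for n in nums.split()))
        )
    return counters


def deltas(before, after, wanted=WANTED):
    """Per-counter increase between two parsed snapshots."""
    return {(p, n): after[p][n] - before[p][n] for p, n in wanted}


def sample(path="/proc/net/snmp", interval=1.0):
    """Take two snapshots 'interval' seconds apart and return the deltas."""
    with open(path) as f:
        before = parse_snmp(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_snmp(f.read())
    return deltas(before, after)
```

Calling sample() in a loop and logging the result next to the number of
ip_vs_send_async lines in kern.log for the same second should show
whether the three really move together or only at a similar rate.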
I'll try to tune various parameters some more to figure out what's
the bottleneck. Hints are welcome.
s.
_______________________________________________
Please read the documentation before posting - it's available at:
http://www.linuxvirtualserver.org/
LinuxVirtualServer.org mailing list - lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Send requests to lvs-users-request@xxxxxxxxxxxxxxxxxxxxxx
or go to http://lists.graemef.net/mailman/listinfo/lvs-users