[PATCH net-next 00/12] ipvs: Add icmp scheduling

To: <ja@xxxxxx>, <horms@xxxxxxxxxxxx>, <lvs-devel@xxxxxxxxxxxxxxx>
Subject: [PATCH net-next 00/12] ipvs: Add icmp scheduling
Cc: <kernel-team@xxxxxx>, <alexgartrell@xxxxxxxxx>, Alex Gartrell <agartrell@xxxxxx>
From: Alex Gartrell <agartrell@xxxxxx>
Date: Wed, 12 Aug 2015 13:46:58 -0700

The configuration of ipvs at Facebook is relatively straightforward.  All
ipvs instances advertise a set of VIPs over BGP, and the network prefers
the nearest one or uses ECMP in the event of a tie.  For the uninitiated,
ECMP load balances deterministically and statelessly by hashing the packet
(usually a 5-tuple of protocol, saddr, daddr, sport, and dport) and using
the result as an index into the set of equal-cost next hops (basic hash
table type logic).
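
As a rough illustration of the hashing described above, here is a minimal
sketch in C.  Everything in it is an assumption made for illustration:
struct flow_key, flow_hash() and ecmp_pick() are hypothetical names, and
FNV-1a stands in for whatever hash a real router uses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flow key holding the 5-tuple named in the text. */
struct flow_key {
	uint8_t  protocol;
	uint32_t saddr;
	uint32_t daddr;
	uint16_t sport;
	uint16_t dport;
};

/* Mix the low nbytes bytes of v into h using 32-bit FNV-1a. */
static uint32_t fnv1a_mix(uint32_t h, uint32_t v, int nbytes)
{
	for (int i = 0; i < nbytes; i++) {
		h ^= (v >> (8 * i)) & 0xffu;
		h *= 16777619u;	/* FNV prime */
	}
	return h;
}

/* Hash every field of the 5-tuple; no per-flow state is kept. */
static uint32_t flow_hash(const struct flow_key *k)
{
	uint32_t h = 2166136261u;	/* FNV offset basis */

	h = fnv1a_mix(h, k->protocol, 1);
	h = fnv1a_mix(h, k->saddr, 4);
	h = fnv1a_mix(h, k->daddr, 4);
	h = fnv1a_mix(h, k->sport, 2);
	h = fnv1a_mix(h, k->dport, 2);
	return h;
}

/* Use the hash as an index into n_paths equal-cost next hops. */
static unsigned int ecmp_pick(const struct flow_key *k, unsigned int n_paths)
{
	assert(n_paths > 0);
	return flow_hash(k) % n_paths;
}
```

Every packet of one TCP flow carries the same 5-tuple, so ecmp_pick()
returns the same next hop each time.  An ICMP error about that flow has a
different protocol and source address and no ports, so its key differs and
nothing ties its next hop to the one chosen for the flow.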

The problem is that ICMP packets (which contain really important
information like whether or not an MTU has been exceeded) will get a
different hash value and may end up at a different ipvs instance.  That
instance has no information about where to route these packets, so they
are dropped, creating ICMP black holes and breaking Path MTU discovery.
Suddenly, my mom's pictures can't load and I'm fielding midday calls that I
want nothing to do with.

To address this, this patch set introduces the ability to schedule ICMP
packets, gated by the sysctl net.ipv4.vs.schedule_icmp.  If set to 0, the
old behavior is maintained -- otherwise ICMP packets are scheduled.
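
The gate and the "inverse" header handling named in the patch titles below
can be sketched roughly as follows.  This is not the code from the series:
struct tuple5, tuple5_invert(), icmp_verdict() and the SKETCH_* values are
hypothetical names.  It assumes that an ICMP error embeds the header of the
packet that triggered it, so swapping source and destination recovers the
original client-to-VIP tuple that a scheduler would normally see.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 5-tuple, as found in the header embedded in an ICMP error. */
struct tuple5 {
	uint8_t  protocol;
	uint32_t saddr;
	uint32_t daddr;
	uint16_t sport;
	uint16_t dport;
};

/* Swap source and destination so the embedded (VIP-to-client) header
 * maps back onto the client-to-VIP flow. */
static struct tuple5 tuple5_invert(struct tuple5 t)
{
	struct tuple5 r = t;

	r.saddr = t.daddr;
	r.daddr = t.saddr;
	r.sport = t.dport;
	r.dport = t.sport;
	return r;
}

enum sketch_verdict {
	SKETCH_USE_CONN,	/* a connection entry already exists */
	SKETCH_SCHEDULE,	/* no entry: schedule using the inverted tuple */
	SKETCH_OLD_BEHAVIOR,	/* no entry and the sysctl is 0 */
};

/* schedule_icmp mirrors the value of net.ipv4.vs.schedule_icmp. */
static enum sketch_verdict icmp_verdict(bool conn_found, int schedule_icmp)
{
	if (conn_found)
		return SKETCH_USE_CONN;
	return schedule_icmp ? SKETCH_SCHEDULE : SKETCH_OLD_BEHAVIOR;
}
```

With the sysctl at 0, an ICMP error for an unknown connection takes the
SKETCH_OLD_BEHAVIOR path; with any nonzero value, the inverted tuple would
be handed to the scheduler as though it were the original flow.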

Alex Gartrell (12):
  ipvs: pull out ip_vs_try_to_schedule function
  ipvs: replace ip_vs_fill_ip4hdr with ip_vs_fill_iph_skb_off
  ipvs: Add hdr_flags to iphdr
  ipvs: drop inverse argument to conn_{in,out}_get
  ipvs: Make ip_vs_schedule aware of inverse iph'es
  ipvs: add schedule_icmp sysctl
  ipvs: Use outer header in ip_vs_bypass_xmit_v6
  ipvs: attempt to schedule icmp packets
  ipvs: ensure that ICMP cannot be sent in reply to ICMP
  ipvs: support scheduling inverse and icmp TCP packets
  ipvs: support scheduling inverse and icmp UDP packets
  ipvs: support scheduling inverse and icmp SCTP packets

 include/net/ip_vs.h                     | 101 ++++++++++++-----
 net/netfilter/ipvs/ip_vs_conn.c         |  12 +-
 net/netfilter/ipvs/ip_vs_core.c         | 190 +++++++++++++++++++-------------
 net/netfilter/ipvs/ip_vs_ctl.c          |   8 +-
 net/netfilter/ipvs/ip_vs_proto_ah_esp.c |  17 ++-
 net/netfilter/ipvs/ip_vs_proto_sctp.c   |  35 ++++--
 net/netfilter/ipvs/ip_vs_proto_tcp.c    |  37 +++++--
 net/netfilter/ipvs/ip_vs_proto_udp.c    |  26 ++++-
 net/netfilter/ipvs/ip_vs_xmit.c         |   9 +-
 9 files changed, 287 insertions(+), 148 deletions(-)

Alex Gartrell <agartrell@xxxxxx>
