Re: [PATCH net v2] ipvs: fix MTU check for GSO packets in tunnel mode

To: Yingnan Zhang <342144303@xxxxxx>
Subject: Re: [PATCH net v2] ipvs: fix MTU check for GSO packets in tunnel mode
Cc: horms@xxxxxxxxxxxx, pablo@xxxxxxxxxxxxx, fw@xxxxxxxxx, phil@xxxxxx, davem@xxxxxxxxxxxxx, edumazet@xxxxxxxxxx, kuba@xxxxxxxxxx, pabeni@xxxxxxxxxx, netdev@xxxxxxxxxxxxxxx, lvs-devel@xxxxxxxxxxxxxxx, netfilter-devel@xxxxxxxxxxxxxxx, coreteam@xxxxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx
From: Julian Anastasov <ja@xxxxxx>
Date: Thu, 2 Apr 2026 15:59:10 +0300 (EEST)
        Hello,

On Thu, 2 Apr 2026, Yingnan Zhang wrote:

> Currently, IPVS skips MTU checks for GSO packets by excluding them with
> the !skb_is_gso(skb) condition in both IPv4 and IPv6 code paths. This
> creates problems when IPVS tunnel mode encapsulates GSO packets with
> IPIP or IPv6 tunnel headers.
> 
> The issue manifests in two ways:
> 
> 1. MTU violation after encapsulation:
>    When a GSO packet passes through IPVS tunnel mode, the original MTU
>    check is bypassed. After adding the tunnel header, the packet size
>    may exceed the outgoing interface MTU, leading to unexpected
>    fragmentation at the IP layer.
> 
> 2. Fragmentation with problematic IP IDs:
>    When net.ipv4.vs.pmtu_disc=1 and a GSO packet with multiple segments
>    is fragmented after encapsulation, each segment gets a sequentially
>    incremented IP ID (0, 1, 2, ...). This happens because:
> 
>    a) The GSO packet bypasses MTU check and gets encapsulated
>    b) At __ip_finish_output, the oversized GSO packet is split into
>       separate SKBs (one per segment), with IP IDs incrementing
>    c) Each SKB is then fragmented again based on the actual MTU
> 
>    This sequential IP ID allocation differs from the expected behavior
>    and can cause issues with fragment reassembly and packet tracking.
> 
> Fix this by removing the GSO packet exception from the MTU check in both
> IPv4 and IPv6 paths, and properly validating GSO packets using
> skb_gso_validate_network_len(). The condition is refactored to avoid
> code duplication.
> 
> Fixes: 4cdd34084d53 ("netfilter: nf_conntrack_ipv6: improve fragmentation handling")
> Signed-off-by: Yingnan Zhang <342144303@xxxxxx>
> ---
> Changes in v2:
> - Added IPv6 fix in __mtu_check_toobig_v6() per Julian's review
> - Refactored to avoid code duplication per Julian's suggestion
> - Applied same validation pattern to both IPv4 and IPv6 paths
> 
> v1: https://lore.kernel.org/netdev/20260401152228.31190-1-342144303@xxxxxx/
> 
>  net/netfilter/ipvs/ip_vs_xmit.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c
> index 3601eb86d..ac2ad7518 100644
> --- a/net/netfilter/ipvs/ip_vs_xmit.c
> +++ b/net/netfilter/ipvs/ip_vs_xmit.c
> @@ -112,7 +112,8 @@ __mtu_check_toobig_v6(const struct sk_buff *skb, u32 mtu)
>               if (IP6CB(skb)->frag_max_size > mtu)
>                       return true; /* largest fragment violate MTU */
>       }

        You should remove the above line because compilation fails...

> -     else if (skb->len > mtu && !skb_is_gso(skb)) {
> +     } else if (skb->len > mtu &&
> +                !(skb_is_gso(skb) && skb_gso_validate_network_len(skb, mtu))) {
>               return true; /* Packet size violate MTU size */
>       }
>       return false;
> @@ -232,8 +233,9 @@ static inline bool ensure_mtu_is_adequate(struct netns_ipvs *ipvs, int skb_af,
>                       return true;
>  
>               if (unlikely(ip_hdr(skb)->frag_off & htons(IP_DF) &&
> -                          skb->len > mtu && !skb_is_gso(skb) &&
> -                          !ip_vs_iph_icmp(ipvsh))) {
> +                  skb->len > mtu && !ip_vs_iph_icmp(ipvsh) &&
> +                  !(skb_is_gso(skb) &&
> +                    skb_gso_validate_network_len(skb, mtu)))) {

        Please keep the indentation, just like in my example

>                       icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
>                                 htonl(mtu));
>                       IP_VS_DBG(1, "frag needed for %pI4\n",
> -- 
> 2.51.0

Regards

--
Julian Anastasov <ja@xxxxxx>
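
For readers following the thread, the condition change discussed above can be sketched in userspace C. This is only an illustrative model, not kernel code: struct pkt_model, gso_segs_fit(), too_big_old(), too_big_new() and example_checks() are hypothetical names, and gso_segs_fit() merely assumes that skb_gso_validate_network_len() returns true when every resulting segment fits within the MTU at the network layer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the sk_buff fields the check looks at. */
struct pkt_model {
	unsigned int len;     /* total length, like skb->len */
	bool is_gso;          /* like skb_is_gso(skb) */
	unsigned int seg_len; /* network-layer length of the largest segment */
};

/* Assumed semantics of skb_gso_validate_network_len(): all segments fit. */
static bool gso_segs_fit(const struct pkt_model *p, unsigned int mtu)
{
	return p->seg_len <= mtu;
}

/* Condition before the patch: any GSO packet is exempt from the check. */
static bool too_big_old(const struct pkt_model *p, unsigned int mtu)
{
	return p->len > mtu && !p->is_gso;
}

/* Condition from the patch: a GSO packet is exempt only if its segments fit. */
static bool too_big_new(const struct pkt_model *p, unsigned int mtu)
{
	return p->len > mtu && !(p->is_gso && gso_segs_fit(p, mtu));
}

/* Small self-check showing where the two conditions differ. */
void example_checks(void)
{
	const unsigned int mtu = 1480; /* illustrative MTU value */
	struct pkt_model plain_big = { 1600, false, 1600 };
	struct pkt_model gso_ok = { 64000, true, 1460 };
	struct pkt_model gso_bad = { 64000, true, 1500 };

	assert(too_big_old(&plain_big, mtu) && too_big_new(&plain_big, mtu));
	assert(!too_big_old(&gso_ok, mtu) && !too_big_new(&gso_ok, mtu));
	/* Only the new condition flags a GSO packet with oversized segments. */
	assert(!too_big_old(&gso_bad, mtu) && too_big_new(&gso_bad, mtu));
}
```

In this model the two conditions agree for non-GSO packets and for GSO packets whose segments fit; they differ only for a GSO packet whose segments would exceed the MTU, which is the case the patch description is concerned with.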


