[lvs-users] lvs tun and ipip fragments

To: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Subject: [lvs-users] lvs tun and ipip fragments
From: Kelsey Cummings <kgc@xxxxxxxxxxxxxx>
Date: Wed, 9 May 2012 17:36:24 -0700
Issues with LVS-Tun, PMTUD and MSS fixup seem to come up periodically.
We want to use LVS-Tun but do not want to rely on functional PMTUD or
selective MSS fixup on the real servers.  The main issue is that if
either of these fails, a client's sessions may hang in such a way that
nothing short of cooperative testing with tcpdump captures would reveal
the cause of the problem.

It seems that a more obvious solution is to allow the kernel to
fragment the IPIP packet as needed by clearing the DF bit on the packet
and skipping the MTU exceeded check.  This is technically a violation of
RFC 2003, but under some circumstances it is advantageous to just let it
fragment.  Any additional overhead of handling the fragments is
relatively insignificant, and we end up able to handle ~100 Mbit/s of
inbound traffic per real server before there is likely to be a collision
in fragment reassembly, and even then only if packets arrive at the real
server out of order.
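
To make the fragmentation cost concrete, here is a minimal userspace
sketch of the arithmetic, not kernel code.  It assumes 20-byte IPv4
headers without options and a 20-byte TCP header; the names
ipip_fragment_count and ipip_safe_mss are made up for illustration and
do not exist in the kernel.

```c
#include <assert.h>

enum { IPV4_HDR_LEN = 20, TCP_HDR_LEN = 20 };  /* assumed: no options */

/*
 * Number of outer IPv4 fragments needed to carry one inner packet of
 * inner_len bytes through an IPIP tunnel over a link with link_mtu.
 * (Hypothetical helper for illustration only.)
 */
static unsigned ipip_fragment_count(unsigned inner_len, unsigned link_mtu)
{
    unsigned per_frag;

    /* A fragment must be able to hold a header plus 8 payload bytes. */
    assert(link_mtu >= IPV4_HDR_LEN + 8);

    /* Encapsulation adds one outer IPv4 header. */
    if (inner_len + IPV4_HDR_LEN <= link_mtu)
        return 1;

    /* Every fragment except the last carries a multiple of 8 bytes. */
    per_frag = (link_mtu - IPV4_HDR_LEN) & ~7u;

    /* The outer payload is the whole inner packet; round up. */
    return (inner_len + per_frag - 1) / per_frag;
}

/*
 * Largest TCP MSS that fits through the tunnel unfragmented:
 * link MTU minus outer IP, inner IP and TCP headers.
 * (Hypothetical helper for illustration only.)
 */
static unsigned ipip_safe_mss(unsigned link_mtu)
{
    return link_mtu - 2 * IPV4_HDR_LEN - TCP_HDR_LEN;
}
```

With a 1500-byte link MTU, ipip_fragment_count(1500, 1500) gives 2: a
full-size client packet becomes one full outer fragment plus a small
trailing one.  ipip_safe_mss(1500) gives 1440, which is the value MSS
fixup would otherwise have to enforce.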

The patch to hack this into the existing code is only two lines long and
appears to work correctly in limited testing.  A sysctl variable to
control the behavior would be easy enough to add.

Thoughts?

--- /root/rpmbuild/SOURCES/linux-2.6.32-220.13.1.el6/net/netfilter/ipvs/ip_vs_xmit.c     2009-12-02 19:51:21.000000000 -0800
+++ net/netfilter/ipvs/ip_vs_xmit.c     2012-05-09 17:24:05.180140929 -0700
@@ -559,6 +559,9 @@
        if (skb_dst(skb))
                skb_dst(skb)->ops->update_pmtu(skb_dst(skb), mtu);
 
+       //clear the DF bit so the kernel will frag the packet
+       old_iph->frag_off = 0;
+
        df |= (old_iph->frag_off & htons(IP_DF));
 
        if ((old_iph->frag_off & htons(IP_DF))
@@ -608,7 +611,7 @@
        iph                     =       ip_hdr(skb);
        iph->version            =       4;
        iph->ihl                =       sizeof(struct iphdr)>>2;
-       iph->frag_off           =       df;
+       iph->frag_off           =       0;
        iph->protocol           =       IPPROTO_IPIP;
        iph->tos                =       tos;
        iph->daddr              =       rt->rt_dst;

-- 
Kelsey Cummings - kgc@xxxxxxxxxxxxxx      sonic.net, inc.
System Architect                          2260 Apollo Way
707.522.1000                              Santa Rosa, CA 95407
