LVS NAT and source address routing/antefacto patches

To:	LVS Users <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject:	LVS NAT and source address routing/antefacto patches
From:	Mark Weaver <mark@xxxxxxxxxx>
Date:	Thu, 15 Jul 2004 17:37:37 +0100

One of our customers wants to get an additional, lower bandwidth IP connection (to be used in conjunction with a low TTL, server monitoring DNS server) as a cheapish way of ensuring that the site is reasonably available in the event of bandwidth provider breakage.


The setup is currently using LVS NAT in a standard configuration, e.g.:

ipvsadm -A -t website_ip:80 -s rr
ipvsadm -a -t website_ip:80 -r rs1:80 -m
ipvsadm -a -t website_ip:80 -r rs2:80 -m
...

where website_ip = the external ip address of the service, and the ip addresses of the real servers are assigned from private ip space.

My idea was to simply add the extra ip addresses in as separate load balanced services, and then use something like:


ip rule add from backup_ip table backup_route
ip route add default via backup_gw table backup_route

This works fine for non-LVS services (and I can therefore provide a straightforward NAT service without redundancy), but with LVS services the traffic is pushed straight down the default route. I'm guessing that this is because the packets are routed before the NAT happens. A few questions:


- Am I right therefore in thinking that this would work with LVS/DR?

- Can anyone think of another method of using LVS-NAT to get these packets to take the right route?

Digging around a little I thought that the old antefacto patches might sort this out, and in fact, they do. However, they are unfortunately unstable (in testing, they seemed fine, but with real traffic the box just drops off the network, presumably with a kernel oops that I can't see as it is in some hosting centre miles away). Reading those a bit further, there is a particular section that would seem to be just what I want:



 /*
  *     It is hooked at the NF_IP_FORWARD chain, used only for VS/NAT.
@@ -642,6 +686,7 @@ static unsigned int ip_vs_out(unsigned i
        struct ip_vs_conn *cp;
        int size;
        int ihl;
+       int retval;

        EnterFunction(11);

@@ -809,8 +854,20 @@ static unsigned int ip_vs_out(unsigned i

        skb->nfcache |= NFC_IPVS_PROPERTY;

+        /* For policy routing, packets originating from this
+         * machine itself may be routed differently to packets
+         * passing through.  We want this packet to be routed as
+         * if it came from this machine itself.  So re-compute
+         * the routing information.
+         */
+        if (route_me_harder(skb) == 0)
+            retval = NF_ACCEPT;
+        else
+            /* No route available; what can we do? */
+            retval = NF_DROP;
+
        LeaveFunction(11);
-       return NF_ACCEPT;
+       return retval;
 }

I believe that this is just rerouting the packet after the NAT rewrite has taken place. Can any kernel experts see any problems with this approach? Should I apply the same change to ip_vs_out_icmp?
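
To make the guess about ordering concrete, here is a toy user-space model in C. It is only a sketch of my understanding, not kernel code: the names (src_rule, select_table, route_then_nat, nat_then_reroute) and the two table ids are made up for illustration, and the rule match is a plain exact match on the source address. It assumes the route is first chosen while the packet still carries the real server's private source address, and that the patch simply repeats the lookup after the source has been rewritten to the service address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical routing table ids, for this illustration only. */
enum table_id { TABLE_MAIN = 0, TABLE_BACKUP = 1 };

/* Models one "ip rule add from <src> table <table>" entry
 * (exact-match source address, host byte order). */
struct src_rule {
    uint32_t src;
    enum table_id table;
};

/* Models a reply packet: its current source address and the
 * table picked by the most recent routing lookup. */
struct reply {
    uint32_t saddr;
    enum table_id route;
};

/* Return the table for a given source address; fall back to
 * TABLE_MAIN (the default route) when no rule matches. */
enum table_id select_table(const struct src_rule *rules, size_t n,
                           uint32_t saddr)
{
    assert(rules != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (rules[i].src == saddr)
            return rules[i].table;
    return TABLE_MAIN;
}

/* Assumed unpatched ordering: the lookup sees the real server's
 * private address, then NAT rewrites the source to the service ip. */
struct reply route_then_nat(const struct src_rule *rules, size_t n,
                            uint32_t rs_addr, uint32_t service_ip)
{
    struct reply r = { rs_addr, select_table(rules, n, rs_addr) };
    r.saddr = service_ip;   /* NAT rewrite; the route is not revisited */
    return r;
}

/* Assumed patched ordering: same as above, then look the route up
 * again using the rewritten source address. */
struct reply nat_then_reroute(const struct src_rule *rules, size_t n,
                              uint32_t rs_addr, uint32_t service_ip)
{
    struct reply r = route_then_nat(rules, n, rs_addr, service_ip);
    r.route = select_table(rules, n, r.saddr);
    return r;
}
```

With a single rule for backup_ip, route_then_nat() leaves a reply from a private real server address on TABLE_MAIN even though its source ends up as backup_ip, while nat_then_reroute() selects TABLE_BACKUP. That matches what I see with and without the patch, but the model says nothing about whether the real code path is safe.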


Thanks,

Mark


The route function is:

+/* This code stolen from ip_nat_standalone.c, as is the
+ * following comment:
+ *
+ * FIXME: change in oif may mean change in hh_len.  Check and realloc
+ * --RR
+ * (
+ * note from Joe: function name retained for compatibility with Rusty's code
+ * - in recent kernels has been moved to a different file and called ip_route_me_harder()
+ * )
+ */
+static int
+route_me_harder(struct sk_buff *skb)
+{
+       struct iphdr *iph = skb->nh.iph;
+       struct rtable *rt;
+       struct rt_key key = { dst:iph->daddr,
+                             src:iph->saddr,
+                             oif:skb->sk ? skb->sk->bound_dev_if : 0,
+                             tos:RT_TOS(iph->tos)|RTO_CONN,
+#ifdef CONFIG_IP_ROUTE_FWMARK
+                             fwmark:skb->nfmark
+#endif
+                           };
+
+        /* Note that ip_route_output_key() makes routing
+         * decisions assuming that the packet has originated
+         * from this machine itself.  This is the correct
+         * behaviour for our case.
+         */
+       if (ip_route_output_key(&rt, &key) != 0) {
+               printk("route_me_harder(): No more route.\n");
+               return -EINVAL;
+       }
+
+       /* Drop old route. */
+       dst_release(skb->dst);
+
+       skb->dst = &rt->u.dst;
+       return 0;
+}
+
