One of our customers wants to get an additional, lower bandwidth IP
connection (to be used in conjunction with a low TTL, server monitoring
DNS server) as a cheapish way of ensuring that the site is reasonably
available in the event of bandwidth provider breakage.
The setup is currently using LVS NAT in a standard configuration, e.g.:
ipvsadm -A -t website_ip:80 -s rr
ipvsadm -a -t website_ip:80 -r rs1:80 -m
ipvsadm -a -t website_ip:80 -r rs2:80 -m
...
where website_ip = the external ip address of the service, and the ip
addresses of the real servers are assigned from private ip space.
My idea was to simply add the extra ip addresses in as separate load
balanced services, and then use something like:
ip rule add from backup_ip table backup_route
ip route add default via backup_gw table backup_route
This works fine for non-LVS services (and I can therefore provide a
straightforward NAT service without redundancy), but with LVS services
the traffic is pushed straight down the default route. I'm guessing
that this is because the packets are routed before the NAT happens. A
few questions:
- Am I right therefore in thinking that this would work with LVS/DR?
- Can anyone think of another method of using LVS-NAT to get these
packets to take the right route?
Digging around a little I thought that the old antefacto patches might
sort this out, and in fact, they do. However, they are unfortunately
unstable (in testing, they seemed fine, but with real traffic the box
just drops off the network, presumably with a kernel oops that I can't
see as it is in some hosting centre miles away). Reading those a bit
further, there is a particular section that would seem to be just what I
want:
/*
* It is hooked at the NF_IP_FORWARD chain, used only for VS/NAT.
@@ -642,6 +686,7 @@ static unsigned int ip_vs_out(unsigned i
struct ip_vs_conn *cp;
int size;
int ihl;
+ int retval;
EnterFunction(11);
@@ -809,8 +854,20 @@ static unsigned int ip_vs_out(unsigned i
skb->nfcache |= NFC_IPVS_PROPERTY;
+ /* For policy routing, packets originating from this
+ * machine itself may be routed differently to packets
+ * passing through. We want this packet to be routed as
+ * if it came from this machine itself. So re-compute
+ * the routing information.
+ */
+ if (route_me_harder(skb) == 0)
+ retval = NF_ACCEPT;
+ else
+ /* No route available; what can we do? */
+ retval = NF_DROP;
+
LeaveFunction(11);
- return NF_ACCEPT;
+ return retval;
}
I believe that this is just rerouting the packet after the NAT rewrite
has taken place. Can any kernel experts see any problems with this
approach? Should I apply the same change to ip_vs_out_icmp?
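To make the ordering argument concrete, here is a toy user-space model of it. This is only a sketch and not kernel code: the addresses, the single "from backup_ip" rule, and the names route_lookup(), forward_then_nat() and forward_nat_reroute() are all made up for illustration. It assumes the route is chosen from the source address only, which is what the ip rule above does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of source-based policy routing. Everything here is
 * hypothetical: addresses, gateways and function names. */
#define IP(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | ((uint32_t)(c) << 8) | (uint32_t)(d))

#define RS1_IP     IP(10, 0, 0, 11)     /* real server, private address */
#define BACKUP_IP  IP(192, 0, 2, 10)    /* VIP on the backup link       */
#define DEFAULT_GW IP(198, 51, 100, 1)  /* gateway of the main link     */
#define BACKUP_GW  IP(192, 0, 2, 1)     /* gateway of the backup link   */

struct pkt {
    uint32_t saddr;  /* current source address           */
    uint32_t gw;     /* next hop chosen by the last lookup */
};

/* Models "ip rule add from backup_ip table backup_route":
 * only packets whose source is BACKUP_IP use the backup gateway. */
uint32_t route_lookup(uint32_t saddr)
{
    return saddr == BACKUP_IP ? BACKUP_GW : DEFAULT_GW;
}

/* Reply from a real server: routed first, source rewritten afterwards.
 * The lookup sees the private address, so the rule never matches. */
void forward_then_nat(struct pkt *p, uint32_t vip)
{
    assert(p != NULL);
    p->gw = route_lookup(p->saddr);
    p->saddr = vip;
}

/* Same path, followed by a second lookup on the rewritten packet,
 * which is the effect the route_me_harder() call appears to have. */
void forward_nat_reroute(struct pkt *p, uint32_t vip)
{
    forward_then_nat(p, vip);
    p->gw = route_lookup(p->saddr);
}
```

Starting from a packet with saddr = RS1_IP, forward_then_nat() leaves gw at DEFAULT_GW even though the source is now BACKUP_IP, while forward_nat_reroute() ends with gw = BACKUP_GW. The model ignores everything else the real lookup keys on (tos, oif, fwmark).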
Thanks,
Mark
The route function is:
+/* This code stolen from ip_nat_standalone.c, as is the
+ * following comment:
+ *
+ * FIXME: change in oif may mean change in hh_len. Check and realloc
+ * --RR
+ * (
+ * note from Joe: function name retained for compatibility with Rusty's code
+ * - in recent kernels has been moved to a different file and called
+ *   ip_route_me_harder()
+ * )
+ */
+static int
+route_me_harder(struct sk_buff *skb)
+{
+ struct iphdr *iph = skb->nh.iph;
+ struct rtable *rt;
+ struct rt_key key = { dst:iph->daddr,
+ src:iph->saddr,
+ oif:skb->sk ? skb->sk->bound_dev_if : 0,
+ tos:RT_TOS(iph->tos)|RTO_CONN,
+#ifdef CONFIG_IP_ROUTE_FWMARK
+ fwmark:skb->nfmark
+#endif
+ };
+
+ /* Note that ip_route_output_key() makes routing
+ * decisions assuming that the packet has originated
+ * from this machine itself. This is the correct
+ * behaviour for our case.
+ */
+ if (ip_route_output_key(&rt, &key) != 0) {
+ printk("route_me_harder(): No more route.\n");
+ return -EINVAL;
+ }
+
+ /* Drop old route. */
+ dst_release(skb->dst);
+
+ skb->dst = &rt->u.dst;
+ return 0;
+}
+