how to support transparent cache cluster in ipvs?

To:	lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Subject:	how to support transparent cache cluster in ipvs?
From:	home_king <home_king@xxxxxxx>
Date:	Fri, 24 Nov 2006 11:47:03 +0800

A transparent cache, taking a transparent web cache cluster as an example, looks like the following deployment:

Assume the client machines use the ipvs director as their default gateway, so HTTP requests from the clients to Internet web sites are forwarded through the ipvs director. Obviously, the director must be able to capture the packets that belong to a virtual service defined in the ipvs rule table, even though they travel on the FORWARD chain, which is beyond the control of ipvs: the only outside-to-inside entry point is ip_vs_in(), which hooks only the INPUT chain, right? Once ip_vs_in() handles these packets, ipvs can forward them (VS/DR) to the backend real servers. The real servers use an iptables REDIRECT rule to let the local squid pick them up, and in the end the client machines get the correct HTTP responses.


The LVS HOWTO provides two ways to meet this need, but I found that neither works.

1. Define a route in the local route table (see the "17.3. How you use TP" chapter)

I tested this on Red Hat AS 4.0 update 2, which uses a 2.6.x kernel, but nothing magic happened :-( Maybe the 2.6.x kernel does not support this trick. Even if it worked, how could you cover the packets to every web site on the Internet in your local route table? :-)


2. REDIRECT iptables rule
Let's first see how the REDIRECT rule works in a 2.6.x kernel.

Given such a rule configured on ipvs director:

# iptables -t nat -A PREROUTING -p tcp -s <internal network> --dport 80 -j REDIRECT --to-ports 3128

When a packet [cip:cport -> website_ip:80] comes in, netfilter redirects it from the PREROUTING chain to the INPUT chain, where it becomes [cip:cport -> localhost_ip:3128], and it is then handled by squid (with transparent proxy settings). Squid gets the web page from its cache or from the Internet and sends it back to the client. When the response packet [localhost_ip:3128 -> cip:cport] reaches the end of the POSTROUTING chain, netfilter restores the original address, so the packet becomes [website_ip:80 -> cip:cport] and goes out on the wire toward the client machine.
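The address rewriting described above can be modelled in a few lines of userspace C. This is only a sketch of the bookkeeping, not kernel code: `struct tuple`, `struct redirect_state`, `redirect_in()` and `redirect_out()` are hypothetical names invented for illustration, and the single saved record stands in for the connection tracking entry that netfilter keeps.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified view of a TCP packet's addressing. */
struct tuple {
    uint32_t saddr, daddr;
    uint16_t sport, dport;
};

/* Hypothetical one-entry stand-in for a connection tracking record:
 * it remembers where the request was originally going. */
struct redirect_state {
    uint32_t orig_daddr;
    uint16_t orig_dport;
};

/* Request path (PREROUTING): save the original destination, then
 * rewrite the destination to the local proxy address and port. */
static void redirect_in(struct tuple *pkt, struct redirect_state *st,
                        uint32_t local_ip, uint16_t proxy_port)
{
    assert(pkt && st);
    st->orig_daddr = pkt->daddr;
    st->orig_dport = pkt->dport;
    pkt->daddr = local_ip;
    pkt->dport = proxy_port;
}

/* Reply path (POSTROUTING): put the saved original destination back
 * as the reply's source, so the client sees website_ip:80 again. */
static void redirect_out(struct tuple *reply, const struct redirect_state *st)
{
    assert(reply && st);
    reply->saddr = st->orig_daddr;
    reply->sport = st->orig_dport;
}
```

The point of the model is that the reply only looks right to the client if the second step actually runs; if it is skipped, the client sees a reply from localhost_ip:3128 that matches no connection it opened.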

We can see that netfilter MASQUERADEs the packet on both the PREROUTING and POSTROUTING chains. However, this is useless for an ipvs director.

First, every packet that ip_vs_in() handles has the single localhost IP as its destination IP, which makes scheduling algorithms like lblc, lblcr, and dh (dedicated to cache clusters) useless!
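To see why a destination-based scheduler stops being useful here, consider a minimal sketch of destination hashing. This is not the actual ip_vs_dh code; `pick_server()` is a hypothetical helper, and the multiplicative hash and modulo step are assumptions chosen only to illustrate the idea that the real server is a pure function of the destination address.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for a destination-hashing scheduler: the chosen
 * real server depends only on the packet's destination address. */
static unsigned int pick_server(uint32_t daddr, unsigned int n_servers)
{
    assert(n_servers > 0);
    /* Multiplicative hash (wraps modulo 2^32); keep the high 16 bits,
     * then map them onto the available servers. */
    uint32_t h = daddr * 2654435761u;
    return (h >> 16) % n_servers;
}
```

With untouched packets, different web sites have different destination addresses and can land on different caches. After REDIRECT, every request carries the same localhost destination, so a function like this returns the same real server for all of them and the cache cluster degenerates into a single node.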

Second, ipvs registers ip_vs_post_routing(), which hooks the POSTROUTING chain with a priority of NF_IP_PRI_NAT_SRC-1, which means the netfilter core calls it before NAT hooks such as REDIRECT. This function returns NF_STOP for packets that have ipvs_property set, so REDIRECT has no chance to MASQUERADE those packets. In the end, the client receives the response, but it does not match the request on the original ip:port!
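The ordering problem can be illustrated with a small userspace model of a hook chain. It is a sketch under stated assumptions, not the netfilter core: `enum verdict`, `struct pkt`, `run_chain()` and the two hook functions are hypothetical names, and the priorities 99 and 100 are used only to mirror the "one below the source NAT priority" relationship described above.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for NF_ACCEPT and NF_STOP. */
enum verdict { V_ACCEPT, V_STOP };

/* Hypothetical packet state: was it handled by ipvs, and has the
 * NAT hook restored its original source address yet? */
struct pkt {
    int ipvs_property;
    int src_restored;
};

struct hook {
    int priority;                      /* lower value runs earlier */
    enum verdict (*fn)(struct pkt *);
};

/* Models ip_vs_post_routing(): stop the chain for ipvs packets. */
static enum verdict ipvs_post_routing(struct pkt *p)
{
    return p->ipvs_property ? V_STOP : V_ACCEPT;
}

/* Models the source NAT hook that would restore the reply's origin. */
static enum verdict nat_src(struct pkt *p)
{
    p->src_restored = 1;
    return V_ACCEPT;
}

/* Run hooks in ascending priority order (array must be sorted).
 * A V_STOP verdict ends traversal, so later hooks never see the packet. */
static void run_chain(const struct hook *hooks, size_t n, struct pkt *p)
{
    assert(hooks && p);
    for (size_t i = 0; i < n; i++) {
        assert(i == 0 || hooks[i - 1].priority <= hooks[i].priority);
        if (hooks[i].fn(p) == V_STOP)
            return;
    }
}
```

In this model a packet with ipvs_property set leaves the chain with src_restored still 0, which corresponds to the mismatched reply the client sees.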

I worked out a solution to support TP natively in ipvs, and made a patch:

--- linux-2.6.16/net/ipv4/ipvs/ip_vs_core.c 2006-03-20 13:53:29.000000000 +0800
+++ linux-2.6.16.new/net/ipv4/ipvs/ip_vs_core.c 2006-11-21 18:28:35.000000000 +0800
@@ -23,6 +23,8 @@
 * Changes:
 *    Paul `Rusty' Russell        properly handle non-linear skbs
 *    Harald Welte            don't use nfcache
+ *    Jinhua Luo            redirect packets with fwmark on NF_IP_FORWARD chain
+ *                to ip_vs_in(), mainly for transparent cache cluster
 *
 */

@@ -1060,6 +1062,16 @@
    return ip_vs_in_icmp(pskb, &r, hooknum);
}

+static unsigned int
+ip_vs_forward_with_fwmark(unsigned int hooknum, struct sk_buff **pskb,
+           const struct net_device *in, const struct net_device *out,
+           int (*okfn)(struct sk_buff *))
+{
+    if ((*pskb)->ipvs_property || ! (*pskb)->nfmark)
+            return NF_ACCEPT;
+
+    return ip_vs_in(hooknum, pskb, in, out, okfn);
+}

/* After packet filtering, forward packet through VS/DR, VS/TUN,
   or VS/NAT(change destination), so that filtering rules can be
@@ -1072,6 +1084,14 @@
    .priority       = 100,
};

+static struct nf_hook_ops ip_vs_forward_with_fwmark_ops = {
+    .hook        = ip_vs_forward_with_fwmark,
+    .owner        = THIS_MODULE,
+    .pf        = PF_INET,
+    .hooknum        = NF_IP_FORWARD,
+    .priority       = 101,
+};
+
/* After packet filtering, change source only for VS/NAT */
static struct nf_hook_ops ip_vs_out_ops = {
    .hook        = ip_vs_out,
@@ -1150,9 +1170,17 @@
        goto cleanup_postroutingops;
    }

+    ret = nf_register_hook(&ip_vs_forward_with_fwmark_ops);
+    if (ret < 0) {
+        IP_VS_ERR("can't register forward_with_fwmark hook.\n");
+        goto cleanup_forwardicmpops;
+    }
+
    IP_VS_INFO("ipvs loaded.\n");
    return ret;

+  cleanup_forwardicmpops:
+    nf_unregister_hook(&ip_vs_forward_icmp_ops);
  cleanup_postroutingops:
    nf_unregister_hook(&ip_vs_post_routing_ops);
  cleanup_outops:
@@ -1172,6 +1200,7 @@

static void __exit ip_vs_cleanup(void)
{
+    nf_unregister_hook(&ip_vs_forward_with_fwmark_ops);
    nf_unregister_hook(&ip_vs_forward_icmp_ops);
    nf_unregister_hook(&ip_vs_post_routing_ops);
    nf_unregister_hook(&ip_vs_out_ops);

Here, I redirect packets with a fwmark to ip_vs_in() on the FORWARD chain. ip_vs_in() then handles the packets that iptables has marked as belonging to the transparent cache virtual service, and ignores all other packets (they continue to flow along the FORWARD chain).
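The filter at the top of ip_vs_forward_with_fwmark() can be checked in isolation with a userspace model. This is a sketch only: `struct pkt_meta`, `enum action` and `forward_with_fwmark()` are hypothetical stand-ins for the sk_buff fields and netfilter verdicts used in the patch, and the model stops at the point where the real hook would call ip_vs_in().

```c
#include <assert.h>

/* What the FORWARD hook decides to do with a packet. */
enum action { PASS_THROUGH, HAND_TO_IPVS };

/* Hypothetical subset of the sk_buff fields the patch looks at. */
struct pkt_meta {
    int ipvs_property;      /* already processed by ipvs */
    unsigned long nfmark;   /* firewall mark set by iptables, 0 if none */
};

/* Same condition as in the patch: packets already owned by ipvs, and
 * packets without a fwmark, keep flowing along the FORWARD chain;
 * everything else is handed to the ipvs input path. */
static enum action forward_with_fwmark(const struct pkt_meta *m)
{
    assert(m);
    if (m->ipvs_property || !m->nfmark)
        return PASS_THROUGH;
    return HAND_TO_IPVS;
}
```

Only a marked packet that ipvs has not touched yet reaches the scheduler, which is why the mangle rule that sets the mark is what selects traffic for the transparent cache service.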


Now you can use ipvs to deploy TP with ease:
@ ipvs director
# sysctl -w net.ipv4.ip_forward=1

# iptables -t mangle -A FORWARD -p tcp -s <internal network> --dport 80 -j MARK --set-mark 1

# ipvsadm -A -f 1 -s lblcr
# ipvsadm -a -f 1 -r RS1
# ipvsadm -a -f 1 -r RS2

@ RS

# iptables -t nat -A PREROUTING -p tcp -s <internal network> --dport 80 -j REDIRECT --to-ports 3128

# cat >> /etc/squid/squid.conf << EOF
 httpd_accel_host virtual
 httpd_accel_port 80
 httpd_accel_with_proxy on
 httpd_accel_uses_host_header on
EOF
# /etc/init.d/squid start
