
To: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: Transparent cache cluster, patch Jinhua Luo
From: "Diego Larios" <diegolarios@xxxxxxxxx>
Date: Wed, 3 Jan 2007 09:30:41 -0600
Also, I have been analyzing this, and I see that no packets arrive on the
FORWARD chain, which is the chain that is supposed to mark the packets, even
though I have ip_forward set to 1.
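
A minimal sketch of one way to watch for this (the 192.168.1.0/24 range below
is just a placeholder for the internal network, not anything from the original
posts):

confirm forwarding really is on:
# sysctl net.ipv4.ip_forward

check the packet/byte counters on the mangle FORWARD chain; if the counter
next to the MARK rule stays at zero, the forwarded traffic never reaches it:
# iptables -t mangle -L FORWARD -v -n

temporarily log what should be getting marked:
# iptables -t mangle -I FORWARD -p tcp -s 192.168.1.0/24 --dport 80 -j LOG --log-prefix "fwd80: "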

He wrote this:

A transparent cache, taking a transparent web cluster as the example, looks
like the following topology/deployment: the client machines use the ipvs
director as their default gateway, so HTTP requests from the clients toward
internet web sites are forwarded through the ipvs director. Obviously, the
director must be able to capture the packets that match a virtual service
defined in the ipvs rule table, even though they travel along the FORWARD
chain, which is beyond ipvs's control, because the only entry point for
outside-to-inside traffic is ip_vs_in(), which hooks only the INPUT chain.
As long as ip_vs_in() handles these packets, ipvs can forward them (VS/DR)
to the backend real servers. The real servers use a REDIRECT iptables rule
to hand the packets to the local squid, and in the end the client machines
get the correct HTTP responses.
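
For concreteness, "making the ipvs director the default gateway" on a client
is just a one-liner; 192.168.1.254 is a placeholder for the director's
internal address, not anything from the setup described above:

# ip route replace default via 192.168.1.254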

The LVS HOWTO provides two ways to meet this need, but I found that neither works.

1. Define a route in the local routing table (the "17.3. How you use TP" chapter)

I tested this approach on RedHat AS4.0 update 2, which runs a 2.6.x kernel, but
nothing magic happened :-( Maybe the 2.6.x kernel does not support this trick.
And even if it did work, how could you capture the packets to every web site on
the internet with entries in your local routing table? :-)
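
For reference, my reading of that local-route trick (a sketch only, not
necessarily the HOWTO's exact recipe) is that you add a local-type route for
each destination you want to intercept, something like:

# ip route add local <website network> dev lo table local

and that per-destination nature is exactly why it cannot cover the whole
internet.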

2. REDIRECT iptable rule
Let's first look at how a REDIRECT rule works in a 2.6.x kernel.

Given a rule like this configured on the ipvs director:

# iptables -t nat -A PREROUTING -p tcp -s <internal network> --dport 80 -j REDIRECT --to-ports 3128

When a packet [cip:cport -> website_ip:80] comes in, netfilter redirects it
from the PREROUTING chain into the INPUT chain and it becomes
[cip:cport -> localhost_ip:3128]; it is then handled by squid (configured as a
transparent proxy). Squid fetches the page from its cache or from the internet
and sends it back to the client. When the response packet
[localhost_ip:3128 -> cip:cport] reaches the end of the POSTROUTING chain,
netfilter restores the original addresses, so the packet becomes
[website_ip:80 -> cip:cport] and goes out on the wire toward the client
machine. In other words, netfilter rewrites the packet on both the PREROUTING
and POSTROUTING chains. However, this is useless on the ipvs director. First,
every packet that ip_vs_in() handles carries the same local IP as its
destination, which makes scheduling algorithms such as lblc, lblcr and dh (the
ones designed for cache clusters) useless! Second, ipvs registers
ip_vs_post_routing(), which hooks the POSTROUTING chain with a priority of
NF_IP_PRI_NAT_SRC-1, so the netfilter core calls it before NAT hooks such as
REDIRECT. For packets that have ipvs_property set it returns NF_STOP, so
REDIRECT never gets the chance to undo its translation on those packets. In
the end the client receives a response, but it does not match the request's
original ip:port!
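
You can see the symptom from a client machine with tcpdump; a rough sketch,
with eth0 as an assumed interface name and <cip> the client's address:

# tcpdump -ni eth0 'tcp and host <cip>'

The requests go out toward website_ip:80, but in the broken case the replies
that come back do not carry website_ip:80 as their source ip:port, so the
client's TCP stack simply discards them.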



I worked out a solution to support TP natively in ipvs, and made a patch:


--- linux-2.6.16/net/ipv4/ipvs/ip_vs_core.c    2006-03-20 13:53:29.000000000 +0800
+++ linux-2.6.16.new/net/ipv4/ipvs/ip_vs_core.c    2006-11-21 18:28:35.000000000 +0800

@@ -23,6 +23,8 @@
* Changes:
*    Paul `Rusty' Russell        properly handle non-linear skbs
*    Harald Welte            don't use nfcache

+ *    Jinhua Luo            redirect packets with fwmark on NF_IP_FORWARD chain
+ *                          to ip_vs_in(), mainly for transparent cache cluster

*
*/

@@ -1060,6 +1062,16 @@
   return ip_vs_in_icmp(pskb, &r, hooknum);
}

+static unsigned int
+ip_vs_forward_with_fwmark(unsigned int hooknum, struct sk_buff **pskb,
+           const struct net_device *in, const struct net_device *out,
+           int (*okfn)(struct sk_buff *))
+{
+    if ((*pskb)->ipvs_property || ! (*pskb)->nfmark)
+            return NF_ACCEPT;
+
+    return ip_vs_in(hooknum, pskb, in, out, okfn);
+}

/* After packet filtering, forward packet through VS/DR, VS/TUN,
  or VS/NAT(change destination), so that filtering rules can be
@@ -1072,6 +1084,14 @@
   .priority       = 100,
};

+static struct nf_hook_ops ip_vs_forward_with_fwmark_ops = {
+    .hook        = ip_vs_forward_with_fwmark,
+    .owner        = THIS_MODULE,
+    .pf        = PF_INET,
+    .hooknum        = NF_IP_FORWARD,
+    .priority       = 101,
+};
+
/* After packet filtering, change source only for VS/NAT */
static struct nf_hook_ops ip_vs_out_ops = {
   .hook        = ip_vs_out,
@@ -1150,9 +1170,17 @@
       goto cleanup_postroutingops;
   }

+    ret = nf_register_hook(&ip_vs_forward_with_fwmark_ops);
+    if (ret < 0) {
+        IP_VS_ERR("can't register forward_with_fwmark hook.\n");
+        goto cleanup_forwardicmpops;
+    }
+
   IP_VS_INFO("ipvs loaded.\n");
   return ret;

+  cleanup_forwardicmpops:
+    nf_unregister_hook(&ip_vs_forward_icmp_ops);
 cleanup_postroutingops:
   nf_unregister_hook(&ip_vs_post_routing_ops);
 cleanup_outops:
@@ -1172,6 +1200,7 @@

static void __exit ip_vs_cleanup(void)
{
+    nf_unregister_hook(&ip_vs_forward_with_fwmark_ops);
   nf_unregister_hook(&ip_vs_forward_icmp_ops);
   nf_unregister_hook(&ip_vs_post_routing_ops);
   nf_unregister_hook(&ip_vs_out_ops);


Here, on the FORWARD chain, I redirect packets that carry a fwmark to
ip_vs_in(). ip_vs_in() then handles the packets that iptables has marked to
indicate the transparent cache virtual service, and ignores all other packets
(they simply continue along the FORWARD chain).

Now you can use ipvs to deploy TP with ease:
@ ipvs director
# sysctl -w net.ipv4.ip_forward=1

# iptables -t mangle -A FORWARD -p tcp -s <internal network> --dport 80 -j MARK --set-mark 1

# ipvsadm -A -f 1 -s lblcr
# ipvsadm -a -f 1 -r RS1
# ipvsadm -a -f 1 -r RS2

@ RS

# iptables -t nat -A PREROUTING -p tcp -s <internal network> --dport 80 -j REDIRECT --to-ports 3128

# cat >> /etc/squid/squid.conf << EOF
httpd_accel_host virtual
httpd_accel_port 80
httpd_accel_with_proxy on
httpd_accel_uses_host_header on
EOF

# /etc/init.d/squid start
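
Side note: the httpd_accel_* lines above are squid 2.5 syntax. If the real
servers run squid 2.6 or later, my understanding is that the transparent setup
collapses to a single directive instead:

http_port 3128 transparent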



On 1/3/07, Joseph Mack NA3T <jmack@xxxxxxxx> wrote:

On Tue, 2 Jan 2007, Diego Larios wrote:

> Hello, I looked at the patch for transparent cache that Jinhua Luo made;
> I applied it but nothing happens, it doesn't work. Do I have to do
> something else to make it work? Thanks.

hmm, he gave a bit of a write up on using it when he posted
(I thought). I didn't take any notice of it, since it wasn't
obvious, at least at the time, that it was going into the
kernel. It now seems likely that it will go into the kernel,
and I should track down the info and write it up. I guess
I've been asleep at the wheel, sorry. Can you dig up his
original posting(s) and see what he said?

Thanks Joe
--
Joseph Mack NA3T EME(B,D), FM05lw North Carolina
jmack (at) wm7d (dot) net - azimuthal equidistant map
generator at http://www.wm7d.net/azproj.shtml
Homepage http://www.austintek.com/ It's GNU/Linux!
_______________________________________________
LinuxVirtualServer.org mailing list - lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Send requests to lvs-users-request@xxxxxxxxxxxxxxxxxxxxxx
or go to http://www.in-addr.de/mailman/listinfo/lvs-users

