On Tue, Nov 28, 2006 at 03:19:29PM +0900, Horms wrote:
> Hi,
>
> this patch seems pretty nice to me, and it seems that it should work
> quite well. Have you tested it? If so, could you provide a signed-off-by
> line, as described in section 5 of http://linux.yyz.us/patch-format.html
> so that I can submit it to netdev for inclusion in the kernel?
>
> I have reformatted the patch a bit; it is below. Feel free
> to rework the comment if you like.
Hi,
here is an update to this patch, as the version I posted yesterday
was missing the ip_vs_forward_with_fwmark_ops fragment.
I have also added some comments, which are in keeping with comments
for similar functions and structures.
--
Horms
H: http://www.vergenet.net/~horms/
W: http://www.valinux.co.jp/en/
[IPVS] transparent proxying
Patch from home_king <home_king@xxxxxxx> to allow a web cluster to use
transparent proxying. It works by simply grabbing packets that have the
fwmark set and have not already been processed by ipvs (ip_vs_out) and
throwing them into ip_vs_in.
See: http://archive.linuxvirtualserver.org/html/lvs-users/2006-11/msg00261.html
Normally LVS packets are processed by ip_vs_in on the INPUT chain,
and packets that are processed in this way never show up on the FORWARD
chain, so they won't hit this rule.
This patch seems like a good precursor to moving LVS permanently to
the FORWARD chain, as I'm struggling to think how it could break things.
Reformatted to use tabs for indentation (instead of 4 spaces)
Reformatted to be < 80 columns wide
Cc: Jinhua Luo <home_king@xxxxxxx>
Signed-off-by: Simon Horman <horms@xxxxxxxxxxxx>
Index: linux-2.6/net/ipv4/ipvs/ip_vs_core.c
===================================================================
--- linux-2.6.orig/net/ipv4/ipvs/ip_vs_core.c 2006-11-28 15:30:00.000000000 +0900
+++ linux-2.6/net/ipv4/ipvs/ip_vs_core.c 2006-11-29 10:27:49.000000000 +0900
@@ -23,7 +23,9 @@
* Changes:
* Paul `Rusty' Russell properly handle non-linear skbs
* Harald Welte don't use nfcache
- *
+ * Jinhua Luo redirect packets with fwmark on
+ * NF_IP_FORWARD chain to ip_vs_in(),
+ * mainly for transparent cache cluster
*/
#include <linux/module.h>
@@ -1070,6 +1072,26 @@
return ip_vs_in_icmp(pskb, &r, hooknum);
}
+/*
+ * This is hooked into the NF_IP_FORWARD. It catches
+ * packets that have not already been handled by ipvs (out)
+ * and have a fwmark set. This is to allow transparent proxying
+ * of fwmark virtual services.
+ *
+ * It will not process packets that are handled by ipvs (in)
+ * as they never traverse the NF_IP_FORWARD.
+ */
+static unsigned int
+ip_vs_forward_with_fwmark(unsigned int hooknum, struct sk_buff **pskb,
+ const struct net_device *in,
+ const struct net_device *out,
+ int (*okfn)(struct sk_buff *))
+{
+ if ((*pskb)->ipvs_property || ! (*pskb)->nfmark)
+ return NF_ACCEPT;
+
+ return ip_vs_in(hooknum, pskb, in, out, okfn);
+}
/* After packet filtering, forward packet through VS/DR, VS/TUN,
or VS/NAT(change destination), so that filtering rules can be
@@ -1082,6 +1104,16 @@
.priority = 100,
};
+/* Allow transparent proxying by fishing packets
+ * out of the forward chain. */
+static struct nf_hook_ops ip_vs_forward_with_fwmark_ops = {
+ .hook = ip_vs_forward_with_fwmark,
+ .owner = THIS_MODULE,
+ .pf = PF_INET,
+ .hooknum = NF_IP_FORWARD,
+ .priority = 101,
+};
+
/* After packet filtering, change source only for VS/NAT */
static struct nf_hook_ops ip_vs_out_ops = {
.hook = ip_vs_out,
@@ -1160,9 +1192,17 @@
goto cleanup_postroutingops;
}
+ ret = nf_register_hook(&ip_vs_forward_with_fwmark_ops);
+ if (ret < 0) {
+ IP_VS_ERR("can't register forward_with_fwmark hook.\n");
+ goto cleanup_forwardicmpops;
+ }
+
IP_VS_INFO("ipvs loaded.\n");
return ret;
+ cleanup_forwardicmpops:
+ nf_unregister_hook(&ip_vs_forward_icmp_ops);
cleanup_postroutingops:
nf_unregister_hook(&ip_vs_post_routing_ops);
cleanup_outops:
@@ -1182,6 +1222,7 @@
static void __exit ip_vs_cleanup(void)
{
+ nf_unregister_hook(&ip_vs_forward_with_fwmark_ops);
nf_unregister_hook(&ip_vs_forward_icmp_ops);
nf_unregister_hook(&ip_vs_post_routing_ops);
nf_unregister_hook(&ip_vs_out_ops);
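
For anyone reading along, the decision made by ip_vs_forward_with_fwmark()
can be modelled in userspace roughly as below. This is only a sketch and
not part of the patch: struct fake_skb, enum verdict and
forward_with_fwmark() are hypothetical stand-ins for struct sk_buff, the
netfilter verdicts and the real hook, and the call into ip_vs_in() is
represented by a return value rather than actually made.

```c
/* Userspace model of the forward hook's filter. Not kernel code. */

/* Hypothetical stand-ins for "return NF_ACCEPT" and "call ip_vs_in()". */
enum verdict {
	VERDICT_ACCEPT,		/* leave the packet alone */
	VERDICT_TO_IP_VS_IN	/* hand the packet to ip_vs_in() */
};

/* Hypothetical reduced packet: only the two fields the hook looks at. */
struct fake_skb {
	int ipvs_property;	/* non-zero once ipvs has handled the packet */
	unsigned long nfmark;	/* firewall mark, 0 when no mark is set */
};

/*
 * Mirrors the test in the patch: packets already handled by ipvs, or
 * carrying no fwmark, are accepted untouched. Everything else would
 * be passed on to ip_vs_in().
 */
static enum verdict forward_with_fwmark(const struct fake_skb *skb)
{
	if (skb->ipvs_property || !skb->nfmark)
		return VERDICT_ACCEPT;

	return VERDICT_TO_IP_VS_IN;
}
```

In this model, only a packet with a fwmark set and ipvs_property clear is
redirected; the other three combinations fall through unchanged.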