[lvs-devel] crashing kernel with lvs as transparent squid proxy

Subject: [lvs-devel] crashing kernel with lvs as transparent squid proxy
From: horms at verge.net.au (Horms)
Date: Thu, 13 Sep 2007 10:58:07 +0900
On Tue, Sep 11, 2007 at 03:31:48PM +0200, Peter Warasin wrote:
> hi people
> 
> I made some modifications on the lvs specific kernel code, which now
> leads into kernel oops. Could someone give me some pointers about how to
> find the bug? I am not very familiar with the kernel code, so maybe i
> missed some simple tricks which routined people know and me not.
> 
> Basically i altered the lvs code in order to make it catch packets
> within the PREROUTING chain instead of the INPUT chain. My setup works,
> but sometimes i have a kernel oops.
> 
> I think somewhere it lacks some sort of spinlock, but i not really know
> where to begin in order to find where it must be inserted.

Hi Peter,

As Joe mentioned in a subsequent email, being able to move LVS from one
chain to another is something that we are interested in.  In particular
I am of the belief that the FORWARD chain would be a much more logical
home than LOCAL_IN, as in some ways it would allow LVS to act more like a
router than a proxy (not that it is a proxy, but it kind of behaves like
one in some ways because of its home on LOCAL_IN).

As I recall, I did try moving the code to the FORWARD chain a long time
ago. I believe that the change was very similar to the LOCAL_IN to
PRE_ROUTING snippet that you have below. I'm not sure that I ever posted
the change, as I never tested it thoroughly. So perhaps it too broke
occasionally. In any case, this was a long time ago, and the kernel
has changed significantly since then, so any testing done at that time
wouldn't really hold water now (incidentally, 2.6.9 is also pretty old,
though I'm not sure what patches RHEL includes to modernise it).

As for debugging your problem, providing the oops message - if any -
might help. Hopefully there is a stack trace in there and that should
start to point to where the problem is.

Some portions of the locking semantics of LVS are non-trivial and
I have a deep suspicion that there are some races in there anyway.
By which I mean, don't be surprised if things get a bit hairy as you
are tracing through what is going on.

If your kernel is compiled with IP_VS_DEBUG then you can enable
and adjust the verbosity of the debugging messages that LVS produces
by tweaking /proc/sys/net/ipv4/vs/debug_level as documented in
Documentation/networking/ipvs-sysctl.txt in the kernel source tree.
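
To make the debug_level tweak concrete, here is a minimal userspace
sketch in C. The helper names write_int_sysctl() and read_int_sysctl()
are made up for illustration and are not part of LVS or the kernel. It
assumes a kernel built with IP_VS_DEBUG, so that the proc file exists,
and that you run it as root. The meaning of each level is whatever
ipvs-sysctl.txt says for your kernel version; nothing here checks it.

```c
#include <stdio.h>

/*
 * Hypothetical helper: write an integer to a sysctl-style file such as
 * /proc/sys/net/ipv4/vs/debug_level. Returns 0 on success, -1 on error
 * (for example if the file is absent because IP_VS_DEBUG is not set,
 * or if the caller lacks permission).
 */
int write_int_sysctl(const char *path, int value)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (f == NULL)
        return -1;
    ok = fprintf(f, "%d\n", value) > 0;
    /* Buffered data is flushed on close, so a failed close is a failed write. */
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/*
 * Hypothetical helper: read the current integer value back.
 * Returns 0 on success and stores the result in *value, -1 on error.
 */
int read_int_sysctl(const char *path, int *value)
{
    FILE *f = fopen(path, "r");
    int n;

    if (f == NULL)
        return -1;
    n = fscanf(f, "%d", value);
    fclose(f);
    return n == 1 ? 0 : -1;
}
```

Calling write_int_sysctl("/proc/sys/net/ipv4/vs/debug_level", 7) and
then reading the value back is roughly what echo and cat would do from
a shell. Remember to set it back to 0 afterwards, as the higher levels
can be quite chatty in the kernel log.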

Also, if you are doing development work, I do recommend considering
using a more up to date kernel. Perhaps the latest rc kernel, currently
2.6.23-rc6. I'm not suggesting that you necessarily drop this into
production. But for development work, it is much easier to work with
the kernel guys if you are on the same page as them.

Lastly, I think that you forgot to attach the patch :-)

[snip]

> 1.)
> --- linux-2.6.9/net/ipv4/ipvs/ip_vs_core.c.orig 2007-07-30
> 20:40:31.000000000 +0200
> +++ linux-2.6.9/net/ipv4/ipvs/ip_vs_core.c      2007-07-30
> 20:40:37.000000000 +0200
> @@ -1095,7 +1095,7 @@
>         .hook           = ip_vs_in,
>         .owner          = THIS_MODULE,
>         .pf             = PF_INET,
> -       .hooknum        = NF_IP_LOCAL_IN,
> +       .hooknum        = NF_IP_PRE_ROUTING,
>         .priority       = 100,
>  };

-- 
Horms
  H: http://www.vergenet.net/~horms/
  W: http://www.valinux.co.jp/en/


