
Re: Patch: connection-tracking with LVS-NAT

To: Ben North <ben@xxxxxxxxxxxxx>
Subject: Re: Patch: connection-tracking with LVS-NAT
Cc: <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
From: Julian Anastasov <ja@xxxxxx>
Date: Mon, 28 Jan 2002 14:04:44 +0200 (EET)
        Hello,

On Mon, 28 Jan 2002, Ben North wrote:

> Julian Anastasov wrote:
> > Ben North wrote:
> > > [The attached] patches [to allow connection-tracking to be
> > > used for LVS-NAT connections] might be better applied to a
> > > Linux-kernel-style 1.1 "development" branch.
> >
> >     Yes, we should consider many things. As for your work on these
> > patches I find it very interesting.
>
> Thanks.  Performing stateful firewalling of LVS connections was
> something we wanted to do, and if it's useful to other people as
> well, so much the better.  (And of course we must comply with
> the GPL licensing terms of the original LVS.)
>
> > But there is one problem: we are stuck with the current design
> > of netfilter, we need some requirements for the routing, we
> > can't say we like the way the players in the hooks are
> > ordered, we even don't like the hooks.  And we don't hope
> > something will change in the kernels just to make LVS happy.
>
> True!  I must admit I haven't looked too closely at the 0.9 LVS
> code, but the 0.8.2 code seemed to integrate fairly cleanly with
> the Netfilter hook system.

        The hooks are the same in 0.9, but many things have changed
since 0.8.2. Even now we are fixing some things before 1.0.

> > About the netfilter: yes, there is stateful conntracking,
> > the routing is used almost correctly, with some problems when
> > using multipath routes (which you and netfilter are trying to
> > address with route_me_harder).
>
> There might be more complicated routing situations where the
> relatively simple route_me_harder() approach doesn't work.  I

        No, it works everywhere but is slowww. Note that netfilter
reroutes only for localout. The only current solution for
netfilter and NAT through multiple gateways is maybe the
solution posted by Henrik Nordstrom. But it uses nfmark,
which is not good for LVS.

        Also, from the list of exported routing functions
ip_route_output can solve the problem with multiple gateways,
as you are trying to do (rerouting). For Netfilter I created
one solution:

http://www.linuxvirtualserver.org/~julian/rtlsrc-2.4.17-1.diff

        LVS will need more in its new design. OTOH, I have better
ideas for faster and correct routing usage from connection
tracking, but nobody listens. I'll try to explain them in a new
LVS design document in the next few days. I'm working on it now.
The rtlsrc functionality is key to this usage.

> didn't look into that too deeply, I'm afraid, because the
> approach I took worked for our situation.  Somebody who knows
> more could probably come up with a better solution.

        Yes, I keep thinking about it but don't know how to
use netfilter's conntracking from LVS for everything, not
only for NAT. We can use everything except conntrack+NAT: QoS,
fwmarking and the firewall all work; only conntrack+NAT is a problem.

> > LVS has [...] its own slow timer support to offload the kernel
> > timer lists.
>
> As I mentioned in the README, I suspect there may be a race with
> this code but sadly didn't have time to get to the very bottom
> of it.  As far as I could tell, although there is locking on

        Yes, there are some races. We are now trying to fix them,
but I'm not sure in which version the fixes will be integrated. IMO,
they must be fixed before 1.0, but we had better be sure the fixes
work correctly.

> manipulation of the timer lists themselves, there is no locking
> for the following two actions: (1) the timer code expiring a
> connection-tracking entry, and (2) a packet arriving and
> updating a connection-tracking entry's time-to-expiry.  If the
> second happens in the middle of performing the first, you get
> inconsistent results.  That's my guess anyway; maybe somebody
> who knows more about the code can look at this?

        Yes, the rules should be something like this:

- an entry can be deleted from timer_bh (sltimer list),
dropentry (timer_bh), ip_vs_in (softirq, dest unavail), or user
space (conn_flush)

- the entry must be deleted by only one thread, and only this
thread can call del_sltimer. All packet handlers should use a new
function, upd_sltimer (add_sltimer only after a successful detach),
not mod_sltimer. If the deleting thread detects a busy entry, i.e.
someone still refers to it (refcnt, n_control), it must not unhash
the entry but call add_sltimer instead

- etc.; there are 15-20 rules which I no longer remember. You
will see the change soon
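
        To illustrate the first two rules, here is a minimal
user-space sketch in C11. It is only a model under stated
assumptions, not the LVS code: struct conn, conn_init,
detach_sltimer and conn_expire are hypothetical names, the
behaviour of upd_sltimer and add_sltimer is guessed from the rules
above, and an atomic flag stands in for the real timer-list
locking. The idea is that detaching the entry from the timer list
succeeds for exactly one caller, so the expiry path and a packet
handler can never both own the entry.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of a connection entry; not the real LVS struct. */
struct conn {
    atomic_bool on_timer;   /* true while linked on the slow-timer list */
    atomic_int  refcnt;     /* users currently holding the entry */
    atomic_int  n_control;  /* connections controlled by this entry */
    bool        hashed;     /* true while reachable from the hash table */
    long        expires;    /* modelled time-to-expiry */
};

static void conn_init(struct conn *c, long expires)
{
    atomic_init(&c->on_timer, true);
    atomic_init(&c->refcnt, 0);
    atomic_init(&c->n_control, 0);
    c->hashed = true;
    c->expires = expires;
}

/* Detach from the timer list; succeeds for exactly one caller. */
static bool detach_sltimer(struct conn *c)
{
    bool expected = true;
    return atomic_compare_exchange_strong(&c->on_timer, &expected, false);
}

/* Re-arm the entry; only the caller that won the detach may do this. */
static void add_sltimer(struct conn *c, long expires)
{
    c->expires = expires;
    atomic_store(&c->on_timer, true);
}

/* Packet path: refresh the expiry only after a successful detach. */
static bool upd_sltimer(struct conn *c, long expires)
{
    if (!detach_sltimer(c))
        return false;           /* the deleting thread owns the entry */
    add_sltimer(c, expires);
    return true;
}

/* Deleting path: unhash only if nobody refers to the entry. */
static bool conn_expire(struct conn *c, long retry)
{
    if (!detach_sltimer(c))
        return false;           /* somebody else owns the entry */
    if (atomic_load(&c->refcnt) > 0 || atomic_load(&c->n_control) > 0) {
        add_sltimer(c, retry);  /* busy: keep it hashed, try again later */
        return false;
    }
    c->hashed = false;          /* sole owner: safe to unhash and free */
    return true;
}
```

        In this model a packet handler that loses the detach simply
does not touch the expiry, and a busy entry is put back on the
timer list instead of being unhashed. The real code also has to
handle the other deleting contexts and the remaining rules.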

> Regards,
>
> Ben.

Regards

--
Julian Anastasov <ja@xxxxxx>


