Julian Anastasov wrote:
> Ben North wrote:
> > [The attached] patches [to allow connection-tracking to be
> > used for LVS-NAT connections] might be better applied to a
> > Linux-kernel-style 1.1 "development" branch.
>
> Yes, we should consider many things. As for your work on these
> patches I find it very interesting.
Thanks. Performing stateful firewalling of LVS connections was
something we wanted to do, and if it's useful to other people as
well, so much the better. (And of course we must comply with
the GPL licensing terms of the original LVS.)
> But there is one problem: we are stuck with the current design
> of netfilter, we need some requirements for the routing, we
> can't say we like the way the players in the hooks are
> ordered, we even don't like the hooks. And we don't hope
> something will change in the kernels just to make LVS happy.
True! I must admit I haven't looked too closely at the 0.9 LVS
code, but the 0.8.2 code seemed to integrate fairly cleanly with
the Netfilter hook system.
> About the netfilter: yes, there is stateful conntracking,
> the routing is used almost correctly, with some problems when
> using multipath routes (which you and netfilter are trying to
> address with route_me_harder).
There might be more complicated routing situations where the
relatively simple route_me_harder() approach doesn't work. I
didn't look into that too deeply, I'm afraid, because the
approach I took worked for our situation. Somebody who knows
more could probably come up with a better solution.
> LVS has [...] its own slow timer support to offload the kernel
> timer lists.
As I mentioned in the README, I suspect there may be a race with
this code but sadly didn't have time to get to the very bottom
of it. As far as I could tell, although there is locking on
manipulation of the timer lists themselves, there is no locking
for the following two actions: (1) the timer code expiring a
connection-tracking entry, and (2) a packet arriving and
updating a connection-tracking entry's time-to-expiry. If the
second happens in the middle of performing the first, you get
inconsistent results. That's my guess anyway; maybe somebody
who knows more about the code can look at this?
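To make the suspected race concrete, here is a minimal userspace
sketch of the kind of locking I would expect to close it. This is
not the LVS code: the struct, the function names and the use of a
pthread mutex are all hypothetical stand-ins (the kernel would use
its own locking primitives), and I am assuming the problem really is
as I guessed above. The idea is only that the expiry check and the
time-to-expiry update take the same per-entry lock, so neither can
run in the middle of the other.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a connection entry; not the real LVS struct. */
struct conn_entry {
    pthread_mutex_t lock;    /* guards both fields below */
    unsigned long expires;   /* absolute expiry time, in ticks */
    bool dead;               /* set once the timer path has expired the entry */
};

void conn_init(struct conn_entry *c, unsigned long now, unsigned long timeout)
{
    pthread_mutex_init(&c->lock, NULL);
    c->expires = now + timeout;
    c->dead = false;
}

/*
 * Packet path: push the expiry time forward.
 * Returns false if the timer path has already expired the entry,
 * in which case the caller must not keep using it.
 */
bool conn_refresh(struct conn_entry *c, unsigned long now, unsigned long timeout)
{
    bool alive;

    pthread_mutex_lock(&c->lock);
    alive = !c->dead;
    if (alive)
        c->expires = now + timeout;
    pthread_mutex_unlock(&c->lock);
    return alive;
}

/*
 * Timer path: expire the entry only if its deadline has really passed.
 * The deadline is re-checked under the lock, so a refresh that got in
 * first wins. Returns true if this call expired the entry.
 */
bool conn_try_expire(struct conn_entry *c, unsigned long now)
{
    bool expired = false;

    pthread_mutex_lock(&c->lock);
    if (!c->dead && now >= c->expires) {
        c->dead = true;
        expired = true;
    }
    pthread_mutex_unlock(&c->lock);
    return expired;
}
```

With this arrangement a packet that arrives just before the deadline
either refreshes the entry (and the timer path then leaves it alone)
or finds it already marked dead; it can never update an entry that is
halfway through being expired.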
Regards,
Ben.
--
_______________________________________
Ben North Software Engineer
a n t e f a c t o t: +353 1 8586008
www.antefacto.com f: +353 1 8586014
181 Parnell Street - Dublin 1 - Ireland