Re: [PATCH][RFC]: followup ...

To: Julian Anastasov <ja@xxxxxx>
Subject: Re: [PATCH][RFC]: followup ...
Cc: "lvs-users@xxxxxxxxxxxxxxxxxxxxxx" <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
From: Roberto Nibali <ratz@xxxxxx>
Date: Mon, 19 Feb 2001 14:02:29 +0100
Hi Julian,

>         I agree, some firewalling can be done before the balancer
> but when the normally looking traffic comes only the balancer knows
> for open/closed ports, related ICMP, etc. The main things

Unless you put a proxying firewall ;)

> you can do before the balancer are to avoid source address spoofing,
> some bad packets, may be some ICMP types? But the balancer can be
> attacked even with normal traffic. The request rate can be limited

Ok, Julian, let's make a real example. I'll set up an LVS cluster
with a web server and a normally configured firewall, and you try
to flood it so that the service can no longer be delivered
normally :). Is there any ISP that wants to give me temporary
access to its backbone?

>         Yes, we need NETLINK_LVS kernel socket or similar. I don't
> think that for netfilter will be easy but for LVS can be easier. If

The architecture he proposed to me was rather simple, actually he
had the same idea. He's doing it as a module that hooks into 
conntrack. There you have quite the same template structures for 
incoming connections just more of them ;)

> we use full state (yes, Netfilter has "Real statefull connection
> tracking") replication we can flood the internal links. There

There we'd have to split the LVS-code into two sourcetrees.
Because doing connection tracking and replication is too much
to implement in kernel space for 2.2.x.

> are ideas the state replication to be implemented only for long
> living connections. And yes, we can use this universal transport
> for many things, not only for connection state replication.

[OT] I proposed to him that we build a general framework so that we
don't have to reinvent the wheel. I thought that it should be
possible to register via a device all the template tables you want
to have synch'd, and the module itself would be responsible for
creating the appropriate NETLINK packets and for starting/resetting
the timers in kernel space. [/OT]
 
>         We will need one admin to stay and to change the range :)
> Of course, the range you propose can be tuned once until the
> parameters are changed under attack. OK, the user space tool can
> change the values under attack :)

Grr, yes, I know you're right, I just don't want to accept the 
fact. :)
 
>         Yes, the user must select a backlog size value according to the
> connection rate, we don't want dropped requests even while not under

Oh, this sounds very reasonable. How and where do you think this can
be implemented?

> attack. Of course, the SYN cookies help, for the OSes that support
> them. Not very much if our link is full with invalid requests because
> we can flood our output pipe too. But I don't know how often DDoS
> SYN attacks happen these days.

It's in O(N^3) proportion to the popularity :) I'd love to see
the snort logfiles of nasa.gov or nsa.com or some *.mil. Over here
we have this stupid "big brother" stuff broadcast through some
ACEdirector3 loadbalancers. Two hours after launch the RS were not
reachable anymore.
 
>         Agreed. drop_packet and RS limits are different things.
> The question is how efficient will be the RS limits but if they
> are option the users can select, I don't see a problem. That can

Good. That's what I did, see my example when announcing it. ;)

> be an option just like the people use wlc for example - no
> guarantee for the real server load :) But while under attack the
> wlc is not affected (except if the flood is over one connection),
> the RS limits are. And this is the problem I see.

Yep, this is the problem. I have to do some more testing and try
real-life examples, both with existing customer projects (I'm happy
to have an LVS cluster that works and not some unconfigurable
ACEdirector-X patchwork) and in my lab (just finished this
weekend).
 
>         Yes, these RS limits are a simple control we can add.
> And of course it will be used from many users. My doubts are related
> to the moment where all real server will disappear and will not
> accept more new connections. How fast we will increase these

I will investigate this. Could you just give me some proposals
on how to build different test setups, please? Given enough time,
I will prepare some kernels with different options enabled and
run some penetration tests.

> limits or will start scheduling connections to these real servers.
> It again appears to be a user space problem :)

Yes, this is definitely a user space problem, if you want to
do it dynamically. I proposed the static approach. If I
do it dynamically, we have to introduce some more setsockopts,
don't we?
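
To make the static approach a bit more concrete, here is a minimal
sketch in C of what a fixed per-real-server connection limit could
look like inside a wlc-like selection loop. This is not the code from
my patch and not the actual LVS data structures: struct rs_entry,
rs_available() and rs_pick() are hypothetical names used only for
illustration.

```c
#include <stddef.h>

/* Hypothetical per-real-server record; NOT the actual LVS structure. */
struct rs_entry {
    const char  *name;
    unsigned int active_conns; /* connections currently assigned */
    unsigned int max_conns;    /* static upper limit, 0 = unlimited */
    unsigned int weight;       /* 0 = quiesced, takes no new connections */
};

/* Return 1 if the server may take a new connection, 0 otherwise. */
static int rs_available(const struct rs_entry *rs)
{
    if (rs->weight == 0)
        return 0;
    return rs->max_conns == 0 || rs->active_conns < rs->max_conns;
}

/*
 * Pick the available server with the lowest active_conns/weight ratio
 * (wlc-like). Returns NULL when every server is at its limit.
 */
static struct rs_entry *rs_pick(struct rs_entry *tbl, size_t n)
{
    struct rs_entry *best = NULL;

    for (size_t i = 0; i < n; i++) {
        struct rs_entry *rs = &tbl[i];

        if (!rs_available(rs))
            continue;
        /* Compare a/wa < b/wb as a*wb < b*wa to avoid division. */
        if (best == NULL ||
            (unsigned long long)rs->active_conns * best->weight <
            (unsigned long long)best->active_conns * rs->weight)
            best = rs;
    }
    return best;
}
```

The NULL return is exactly the case you are worried about: under a
flood every real server can hit its limit at the same moment, and
nothing in a static scheme raises the limits again.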
 
>         Yes but drop_packet can be activated when we see a very
> big connection rate that will occupy all the memory for connections
> in the director. If we don't run other user space software we
> can simply ignore the defense strategies and to leave the packets
> to be dropped after memory allocation error.

I have no experience with this approach. Do I understand you
correctly when I say: The defense level is set by the amount
of kmalloc'able pages in the kernel per skb?
 
>         Yes, may be we can implement a better mechanism that will
> allow the different options to be supported without hurting all
> users. Who knows, may be we can create more sockops? But the

Isn't that the case right now? The functionality that ipvsadm
provides is very sparse.

> > So the distributions can handle it. It can't be our task to
> > adjust the binary tool to every distro it's our task to keep
> > it clean and independant of any distro.
> 
>         This is true but it means they have to put all features in?

Not exactly. If there is a framework proposed by some distributor
that can be of use for everyone and that doesn't affect the rest
of the flow of LVS, it should be possible to include it.

> Currently, for LVS we have the following methods in hand:
> 
> - create new scheduler

I could think of a method for "defense strategies". Do you know about
the OOM-killer framework for kernel-2.4.x? There we have a general
hook, like the one for creating a new scheduler, and everybody who
thinks he has a great idea to improve the functionality of the
structure can add his code (as, for example, Thomas Proell did with
the hashing scheduler). A lot of people have already proposed patches
for the OOM-killer, so I could imagine a hook into LVS where you can
register your own defense strategy, so that we can test them under
different penetration tests.
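
As a rough illustration of the kind of hook I mean, here is a minimal
user space sketch in C of a registration interface for defense
strategies. Every name in it (struct defense_strategy,
register_defense_strategy(), defense_should_drop(), low_mem_drop())
is hypothetical; nothing like this exists in LVS today, and a kernel
version would additionally need locking around the list.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical hook interface; none of these names exist in LVS. */
struct defense_strategy {
    const char *name;
    /* Return nonzero if a new connection should be dropped. */
    int (*should_drop)(unsigned int conn_rate, unsigned int free_mem_pct);
    struct defense_strategy *next;
};

static struct defense_strategy *strategies; /* head of registered list */

/* Register a strategy; returns 0 on success, -1 if the name is taken. */
static int register_defense_strategy(struct defense_strategy *s)
{
    for (struct defense_strategy *p = strategies; p; p = p->next)
        if (strcmp(p->name, s->name) == 0)
            return -1;
    s->next = strategies;
    strategies = s;
    return 0;
}

/* Unregister a strategy; returns 0 on success, -1 if it was not found. */
static int unregister_defense_strategy(struct defense_strategy *s)
{
    for (struct defense_strategy **pp = &strategies; *pp; pp = &(*pp)->next) {
        if (*pp == s) {
            *pp = s->next;
            s->next = NULL;
            return 0;
        }
    }
    return -1;
}

/* Drop the connection if any registered strategy asks for it. */
static int defense_should_drop(unsigned int conn_rate,
                               unsigned int free_mem_pct)
{
    for (struct defense_strategy *p = strategies; p; p = p->next)
        if (p->should_drop(conn_rate, free_mem_pct))
            return 1;
    return 0;
}

/* Example strategy: drop when less than 10% of memory is free. */
static int low_mem_drop(unsigned int conn_rate, unsigned int free_mem_pct)
{
    (void)conn_rate; /* this strategy looks at memory only */
    return free_mem_pct < 10;
}
```

With something like this, each strategy lives in its own module and
can be swapped between penetration test runs without touching the
rest of the code.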
 
>         Total 1 methods to add new separated features (may be I'm missing
> something). The things can be very complex if one new feature wants
> to touch some parts of the functions in the fast path or in the user
> space structures. What can be the solution? Putting hooks inside LVS?

Yes, but I don't think Wensong likes that idea :)

> IMO, we already must think for such needs.

Yes, the project has grown larger and gained more of a reputation
than some of us initially thought. The code is very clear and
stable, it's time to enhance it. The only very big problem that I
see is that it looks like we're going to have two separate code
paths: one patch for 2.2.x kernels and one for 2.4.x.
 
>         No doubts, there will be some nice features that can't be
> done in user space. And exactly these features are not used from
> other users. The example is the cp->fwmark support proposed from
> Henrik Nordstrom: we have a feature that is difficult to say it
> is for user space but that touches two parts: internal functions
> and adds another hook that can delay the processing for some

The problem with his patch is:
 static struct nf_hook_ops ip_vs_in_ops = {
         { NULL, NULL },
-        ip_vs_in, PF_INET, NF_IP_LOCAL_IN, 100
+        ip_vs_in, PF_INET, NF_IP_LOCAL_IN, -10
+};

> users. I'm not sure what will happen if we start to think in
> "hooks" just like netfilter. If that looks good in user space
> I'm not sure we can tell the same for the kernel space. Any
> ideas here, may be for new topic?

See above about hooks for defense strategies. But you're right
IMHO, there is not a lot you can put into kernel space since 
most of the stuff has to be done in userspace.
 
> > I also would like to hear from other people what experiences
> > they've made with DDoS and the way the LVS was working under
> > an attack. So far I've not seen more than an academic proof
> > (doing some stress tests not reflecting real world example)
> > to the designed defense strategies. I think Anoush was working
> > on something too but I haven't heard from him in ages ;)
> 
>         Hm, it seems nobody has such problems :)))

:) no comment.
 
>         No, counter which is reset on state change. But this is
> another issue and I didn't started to think more about such things.
> May be will not :)

Isn't that the case for 2.4.x and conntrack already?
 
>         Yes, that defense can be connection state related, LVS is
> connection scheduler, though, not a packet scheduler.

Not yet ;)
 
>         Yes, job for the agents to represent the real server load
> in weights.

The biggest problem I see here is that maybe the user space daemons
don't get enough scheduling time to be accurate enough.
 
>         Yes, wlc is not my preferred scheduler when it comes to
> connections dealing with database :)

Tell me, which scheduler should I take? None of the existing ones
currently gives me good enough results with persistency. We have
to accept the fact that 3-tier application programmers, mostly
using Java, don't know about load balancing or clustering, and
that is just about the end of any attempt to load balance the
application smoothly.
 
>         I don't think we need intelligent scheduler if we
> are talking about current set of information used from the LVS
> schedulers. Only the users know what kind of connections are
> scheduled and they can instruct an user space tool how to set the
> WRR weights according to the load.

See, the time it takes to set the weights and the time until the
load actually rebalances are in a ratio of about 1:100. If you try
to adjust the weights dynamically, you will see (for an average
e-business application framework with webserver and database) that
you can never balance it in time. The good thing is that even
commercial load balancers can't do it.
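
For reference, the weight calculation itself is the easy part. A
minimal sketch in C, assuming a hypothetical agent that reports the
real server load as a fraction between 0.0 and 1.0 (load_to_weight()
and its parameters are made up for illustration):

```c
/*
 * Hypothetical mapping from a measured load (0.0 = idle, 1.0 = saturated)
 * to a WRR weight in the range 1..max_weight.
 */
static unsigned int load_to_weight(double load, unsigned int max_weight)
{
    unsigned int w;

    if (load <= 0.0)
        return max_weight;
    if (load >= 1.0)
        return 1; /* keep a minimal weight instead of quiescing */

    /* Linear mapping, rounded to the nearest integer. */
    w = (unsigned int)((1.0 - load) * max_weight + 0.5);
    return w > 0 ? w : 1;
}
```

The hard part is that by the time the new weight shows up in the
actual load distribution, the measurement it was based on is
already stale.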
 
> > > > A packetfilter, the router (most of use do have a CISCO, don't the?)
> > >
> > >         Yes, the question is how Cisco will know what packet rate
> > > overloads the real servers :)
> >
> > :) The router is in my example just configured to drop non-net related
> > packets and these are already enough (seeing the huge logfile that
> > comes every day.
> 
>         Yes, there are packets with sources from the private networks
> too :)

They are masqueraded and their netentity belongs to an interface 
which of course will not drop the packets :)
 
>         I hope other people will express their ideas about this
> topic. May be I'm too pedantic in some cases :) And now I'm talking
> without "showing the code" :) I hope the things will change soon :)

No, no, I also hope some other people join the discussion since
we both could be completely wrong (well, in your case I doubt ...)
 
Best regards,
Roberto Nibali, ratz

-- 
mailto: `echo NrOatSz@xxxxxxxxx | sed 's/[NOSPAM]//g'`

