Re: random SYN-drop function

To: Wensong Zhang <wensong@xxxxxxxxxxxx>
Subject: Re: random SYN-drop function
Cc: lvs-users@xxxxxxxxxxxxxxxxxxxxxx, cluster-list@xxxxxxxxxx
From: Ratz <ratz@xxxxxx>
Date: Thu, 16 Mar 2000 13:32:21 +0100
Hi,

First off: thank you very much for all the replies. I've been offline
for several weeks, but I'm still on this mailing list.

Wensong Zhang wrote:
> 
> On Wed, 15 Mar 2000, Ratz wrote:
> 
> > I cannot get the point of your new ip_vs_random_drop_syn function.
> > At which point could such a function be important? Or what exactly
> > has to happen for you to want to drop a connection?
> > I mean, standard (without ISN prediction) SYN flooding is IMHO not
> > possible against a 2.2.14 kernel unless you set
> > /proc/sys/net/ipv4/tcp_max_syn_backlog to too high a value.
> >
> > Please, could you enlighten me once more?
> >
> 
> Yeah, syncookies in kernel 2.2.x can help TCP connections survive a
> SYN flooding attack; I mean, they work on the TCP layer. However, IPVS
> works on the IP layer, and each entry (marking connection state) needs
> 128 bytes of effective memory. Random SYN-drop is to randomly drop
> some SYN entries before running out of memory. It may help the IPVS
> box survive even a big distributed SYN-flooding attack, but real
> servers still need to set up syncookies to protect themselves from
> SYN-flooding attacks.

I didn't exactly get this one. AFAIK syncookies don't prevent SYN
flooding, they just allow connections to SYN-flooded nodes to keep
working. But you're right that realservers should enable syncookies
(currently only Linux, FreeBSD & Solaris) to keep handling incoming
requests. What I didn't get is the relation between the two kernel
tables. From what I understood, there is the normal TCP state table in
the kernel (TCP layer), where 'normal' connection states are inserted,
and there is your table (which must sit between the IP and TCP layers,
since its entries also carry a TCP state [SYN_RECV, ESTABLISHED]),
which holds the state of every connection to the VIP. So how do packets
have to be constructed to simulate a SYN flood? Simply sending packets
with SYN=1 to the VIP does not work.
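
Just to check whether I picture your table correctly: I would guess one
entry in the IP_VS table looks roughly like this; all field names here
are my own invention, not taken from the real ip_vs source:

/* Rough guess at what one ~128-byte entry in the IPVS connection
 * table holds -- invented names, not the real ip_vs structures.      */
#include <sys/types.h>

struct my_ipvs_entry {
        u_int16_t protocol;             /* IPPROTO_TCP or IPPROTO_UDP      */
        u_int32_t caddr, vaddr, daddr;  /* client, virtual, realserver IP  */
        u_int16_t cport, vport, dport;  /* and the corresponding ports     */
        u_int16_t state;                /* SYN_RECV, ESTABLISHED, ...      */
        u_int32_t timeout;              /* remaining lifetime of the entry */
        struct my_ipvs_entry *next;     /* chaining inside a hash bucket   */
};

If that is about right, then every forged SYN costs the director one of
these 128-byte entries, on top of whatever it does to the realservers'
backlog queues.
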
I would really appreciate it if we could put together, with your help,
a flowchart of the whole TCP connection. Let me start [thanks to Joe
for the picture in the LVS-HOWTO :)]. Setup: LVS-DR, sched=rr, weight
of each S#=1, an HTTP GET request!

                        _______
                       |       |
                       |   C   | CIP
                       |_______|
                           |
                           |
                        ___|___
                       |       |
                       |   R   |
                       |_______|
                           |
                           |
                           |       __________
                           |  DIP |          |
                           |------|    LB    |
                           |  VIP |__________|
                           |
                           |
                           |
         -------------------------------------
         |                 |                 |
         |                 |                 |
     RIP1, VIP         RIP2, VIP         RIP3, VIP
    ____________      ____________      ____________
   |            |    |            |    |            |
   |     S1     |    |     S2     |    |     S3     |
   |____________|    |____________|    |____________|


C=Client, R=Router, S#=Realserver #, LB=Loadbalancer, ac=active
connections, ic=inactive connections.


        C               (R)             LB              S1      TCP_STATE(LB)   ac   ic
1+2)   CIP -----------SYN------------> VIP ----SYN----> RIP1     SYN_RECV       1    0
3)     CIP <-------------------SYN/ACK----------------- RIP1
4+5)   CIP -----------ACK------------> VIP ----ACK----> RIP1     ESTABLISH      1    0

OK, let's start sending real data:

6)     CIP -----------ACK------------> VIP ----ACK----> RIP1     ESTABLISH      1    0
...
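
If I read the setup above correctly, the director's bookkeeping for
steps 1-5 would be roughly the following; this is pure pseudo-C from my
side with invented names, not the actual ip_vs functions:

/* Guess at the director's state bookkeeping during connection setup.
 * All names are invented by me; only the idea matters.               */
enum my_state { MY_SYN_RECV, MY_ESTABLISHED, MY_CLOSE };

struct my_conn {
        enum my_state state;
        long timeout;           /* seconds until the entry expires     */
        /* plus addresses/ports as in the entry guessed at earlier     */
};

static int ac, ic;              /* the counters ipvsadm shows          */

/* called for every client packet that the director forwards to RIP1  */
static void my_client_packet(struct my_conn *cp, int syn, int ack)
{
        if (syn && !ack) {                      /* step 1+2: first SYN */
                cp->state = MY_SYN_RECV;
                ac++;
        } else if (ack && cp->state == MY_SYN_RECV) {
                cp->state = MY_ESTABLISHED;     /* step 4+5: done      */
        }
        /* step 3 (SYN/ACK from RIP1 back to CIP) never passes the
         * director in LVS-DR, so there is nothing to do for it here.  */
}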

So now we are finished and want to close the connection. First problem:
IMHO the loadbalancer is not able to distinguish between an active
close on the server side and an active close on the client side. This
leads to two final close scenarios (without SACK):

active close on server side
===========================

1)     CIP <---------------------FIN------------------- RIP1     ESTABLISH      1    0
2+3)   CIP ---------ACK--------------> VIP ----ACK----> RIP1     ESTABLISH      1    0
4+5)   CIP ---------FIN--------------> VIP ----FIN----> RIP1 CLOSE_WAIT/CLOSED? 0    1 ?
6)     CIP <---------------------ACK------------------- RIP1 CLOSE_WAIT/CLOSED? 0    1 ?

How does the LB know when it has to switch from CLOSE_WAIT to CLOSED?
Or does it just switch straight to CLOSED?

active close on client side
===========================

1+2)   CIP ---------FIN--------------> VIP ----FIN----> RIP1     CLOSE_WAIT?    0    1 ?
3)     CIP <---------------------ACK------------------- RIP1     CLOSE_WAIT?    0    1 ?
4)     CIP <---------------------FIN------------------- RIP1     CLOSE_WAIT?    0    1 ?
5+6)   CIP ---------ACK--------------> VIP ----ACK----> RIP1 CLOSE_WAIT/CLOSED? 0    1 ?
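
To make my question a bit more concrete: my speculation is that the
director only reacts to what it actually forwards (the client's FIN or
ACK) and then lets a timer expire the entry, roughly like this (reusing
the made-up names from the sketch above):

/* Speculation only: in LVS-DR the director never sees the
 * server->client half, so I guess a timer has to finish the job.     */
static void my_client_fin(struct my_conn *cp)
{
        if (cp->state == MY_ESTABLISHED) {
                cp->state = MY_CLOSE;   /* something like CLOSE_WAIT    */
                ac--;                   /* connection turns "inactive"  */
                ic++;
                /* guessed: once this timeout runs out, the entry is
                 * silently removed and ic is decremented again         */
                cp->timeout = 60;
        }
}

If it works like that, the CLOSE_WAIT/CLOSED distinction in my tables
would not really matter to the director; what matters is only when the
entry is finally thrown away.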

I hope someone can help me with my confusion and that we can put this
chart into the HOWTO, so everybody can understand how the loadbalancer
really works. What's missing? The whole IP_VS_MASQ_TABLE in the IP
layer (according to Wensong), SYN-cookies, SYN-drop. I'd really like to
draw the whole functional chart, but since I'm not sure about it yet, I
don't want to mix everything up by adding more now.
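
While I'm at it, my current mental model of the random SYN-drop is
something like the following; this is only my guess at the idea, not
Wensong's actual code:

/* Guess at the idea behind random SYN-drop: once the table of 128-byte
 * entries comes close to eating all available memory, refuse a random
 * share of *new* SYNs instead of letting the director fall over.
 * Names and numbers are invented by me.                               */
#include <stdlib.h>

static unsigned long my_entries;        /* entries currently in the table */
static unsigned long my_max_entries;    /* limit derived from free memory */

/* return 1 if this new SYN should be dropped instead of getting its
 * own 128-byte entry, 0 if it may be accepted                          */
static int my_drop_this_syn(void)
{
        if (my_entries < my_max_entries * 3 / 4)
                return 0;               /* still enough room             */

        /* above ~75% usage: refuse a random share of the new SYNs      */
        return (rand() % 100) < 50;
}

That would also explain why the realservers still need syncookies: such
a drop only protects the director's memory, not the realservers'
backlog queues.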

 
> >
> > BTW.: What are the plans for transfering the ip_vs_masq_table from one
> > kernel to another one in case of a failover of the loadbalancer? Is
> > there already some idea or whatever?
> >
> 
> I just thought of an idea for transferring the state table; it might
> be good. We run a SendingState and a ReceivingState kernel_thread
> (daemons inside the kernel like kflushd and kswapd) on the primary
> IPVS box and on the backup respectively. Every time the primary
> handles packets, it puts the change of state into a sending queue.
> The SendingState kernel_thread wakes up every HZ or HZ/2, sends the
> changes in the queue to the ReceivingState kernel_thread through UDP
> packets, and finally clears the queue. The ReceivingState thread
> receives the packets and changes its own state table.
> 
> Since it all happens inside the kernel, it should be efficient,
> because the switching overhead between kernel and user space (both
> for the UDP communication and for reading & writing those entries)
> can be avoided.
> 
> Any comments?

Sounds fair :)
No, sorry, as I'm not yet a kernel hacker, I unfortunately can't
contribute any good ideas on this subject. But I'm extremely interested
in every kind of information I can get!
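
The only concrete thing I can offer is a picture, from the outside, of
what one state-change record inside such a UDP packet could contain;
this is completely made up on my side, just to check that I understood
the proposal:

/* Completely made-up sketch of one state-change record as it could
 * travel from the SendingState to the ReceivingState kernel_thread.  */
#include <sys/types.h>

struct my_sync_record {
        u_int8_t  protocol;             /* IPPROTO_TCP / IPPROTO_UDP      */
        u_int8_t  state;                /* new state of the connection    */
        u_int16_t flags;                /* whatever flags the entry needs */
        u_int32_t caddr, vaddr, daddr;  /* client, virtual, realserver IP */
        u_int16_t cport, vport, dport;  /* and their ports                */
        u_int32_t timeout;              /* remaining timeout of the entry */
};

/* The sender would pack as many of these records as fit into one UDP
 * datagram every HZ (or HZ/2), send it to the backup and clear its
 * queue; the receiver just walks the records and updates its table.   */

If the backup only ever applies such records, a lost UDP packet would
just mean a slightly stale entry until the next change arrives, which
sounds acceptable to me.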
 
> Thanks,
> 
> Wensong


TIA to everybody for any replies regarding this subject and happy
tcpdumping,

Roberto Nibali, ratz
