
Re: random SYN-drop function

To: Wensong Zhang <wensong@xxxxxxxxxxxx>
Subject: Re: random SYN-drop function
Cc: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
From: Julian Anastasov <uli@xxxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 24 Mar 2000 07:27:39 +0200 (EET)
        Hello,

On Thu, 23 Mar 2000, Wensong Zhang wrote:

> > B>>> Resurrecting entries: the new dream for the VS/NAT mode
> >
> >     For this, ip_fw_masquerade() must be patched
> > too (I haven't implemented it before we decide what to do).
> > LVS must resurrect the entry just like the MASQ, i.e.
> > ip_masq_new_vs() must be called with all consequences:
> > templates, etc. If we drop entries for which we are not sure
> > about the state (may be we drop entries in ES state in the real
> > server) we must have a way to resurrect them. I.e. the
> > OUTPUT(NO->ES via ACK). We rely on the real servers state and
> > if we drop entries in SYN_RECV state we know that the possible
> > state is ES. When we see ACK from the real server we can assume
> > that it is normal data ACK (not ACK for FIN). I.e. if we don't
> > drop entries in FW/TW state we know that the ACK from the real
> > server is about the established state. So, we can create the
> > entry in ES state (just like the MASQ). The OUTPUT table for NO
> > state is correct: we resurrect the entries in SS, TW, ES or CL
> > state according to the flags. We must implement it for LVS/NAT.
> >
>
>
> We cannot resurrect entries for LVS/NAT, because we cannot get more
> information from packets from the real servers, we don't know which
> virtual service the services belong to (the IP address and port number of
> the virtual service).

        Do we support the same raddr,rport for many virtual services?
If so, we really can't restore the virtual service. But for VS/NAT
it is useless to add one real service to many virtual services.
For VS/DR and VS/TUN it is useful for a real service to belong to
many virtual services. Yep, there is also the local node feature,
which must be considered when we drop entries. Maybe it is
better not to drop IP_MASQ_F_VS_LOCALNODE entries.

        The services and the real servers do not change very
often. We can add an additional hash table hashed by raddr/rport.
struct ip_vs_dest can hold a back pointer to its struct ip_vs_service.
This way we can check saddr/sport against the table of real
services and then reach the vservice. If the source validation
check is relaxed for VS/DR and VS/TUN we can't restore raddr, but we
know vaddr, vport and rport. Of course, these packets don't reach
ip_fw_masquerade. Maybe we can find raddr by looking in
the neighbour cache, but that can trigger more problems.

        But for VS/NAT we receive the packet from the real service,
i.e. saddr=raddr, sport=rport, daddr=caddr, dport=cport. We
can look up the real server by saddr/sport. If we reorganize
the tables we can achieve that.
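
        A minimal sketch of such a reverse lookup, only to make the
idea concrete. The names here (vs_service, vs_dest, rs_table, rs_hash,
rs_add, rs_find_service) are hypothetical, simplified stand-ins and
not the real kernel structures. It assumes each raddr/rport belongs to
exactly one virtual service (the VS/NAT case) and ignores locking and
byte order.

```c
#include <stddef.h>
#include <stdint.h>

#define RS_HASH_BITS 4
#define RS_HASH_SIZE (1u << RS_HASH_BITS)

/* Hypothetical stand-in for struct ip_vs_service. */
struct vs_service {
    uint32_t vaddr;
    uint16_t vport;
};

/* Hypothetical stand-in for struct ip_vs_dest. */
struct vs_dest {
    uint32_t raddr;
    uint16_t rport;
    struct vs_service *svc;  /* back pointer to the owning virtual service */
    struct vs_dest *next;    /* chain inside one hash bucket */
};

/* Extra table keyed by raddr/rport, in addition to the existing ones. */
static struct vs_dest *rs_table[RS_HASH_SIZE];

static unsigned rs_hash(uint32_t addr, uint16_t port)
{
    uint32_t h = addr ^ (addr >> 16) ^ port;
    return (h ^ (h >> RS_HASH_BITS)) & (RS_HASH_SIZE - 1);
}

/* Called when a real server is added to a virtual service. */
void rs_add(struct vs_dest *d)
{
    unsigned b = rs_hash(d->raddr, d->rport);
    d->next = rs_table[b];
    rs_table[b] = d;
}

/*
 * For a packet coming from a real server (saddr=raddr, sport=rport),
 * return the virtual service it belongs to, or NULL if the source is
 * not a known real service.
 */
struct vs_service *rs_find_service(uint32_t saddr, uint16_t sport)
{
    struct vs_dest *d;

    for (d = rs_table[rs_hash(saddr, sport)]; d != NULL; d = d->next)
        if (d->raddr == saddr && d->rport == sport)
            return d->svc;
    return NULL;
}
```

A NULL result is the case where the packet does not belong to any
real service, so no entry can be recreated for it.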

        It can't work for ftp sessions, i.e. we can't
resurrect them if we can't find the service. When we resurrect
the entries in ip_fw_masquerade, if the packet doesn't belong
to a real service (or MASQ) we can drop it.

        In fact, the MASQ can't resurrect entries but LVS/NAT
can, if the v/r service still exists.
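
        As a sketch, the state of a resurrected entry can be read
from the NO column of the OUTPUT table quoted later in this message
(SS, TW, ES or CL according to the flags). The function name
resurrect_state and the struct tcp_flags are hypothetical, and the
order in which the flags are tested (rst, syn, fin, ack) is an
assumption of this sketch.

```c
/* State names as used in the MASQ state tables. */
enum masq_state { mNO, mES, mSS, mSR, mFW, mTW, mCL, mCW, mLA, mLI };

/* Hypothetical, reduced view of the TCP header flags. */
struct tcp_flags {
    unsigned syn:1, fin:1, ack:1, rst:1;
};

/*
 * State for an entry recreated from a packet sent by the real server,
 * i.e. the mNO column of the OUTPUT table.
 */
enum masq_state resurrect_state(struct tcp_flags f)
{
    if (f.rst) return mCL;  /* rst from NO -> CL */
    if (f.syn) return mSS;  /* syn from NO -> SS */
    if (f.fin) return mTW;  /* fin from NO -> TW */
    if (f.ack) return mES;  /* plain ack from NO -> ES */
    return mNO;             /* no usable flag: nothing to resurrect */
}
```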

>
>
> >     What is the consequence if we can resurrect the entries:
> > we can drop even entries in established state. But when?
> > Only when we expect reply from the real server from which we
> > create the new entry. Currently, we rely on real servers ACK
> > just after the handshake. This is true when the traffic is
> > started just after the connection (most of the services:
> > http, ftp, etc.). I.e. we think we drop entries in SYN_RECV
> > state but it can be established. In this case we rely on the
> > first ACK from the real server. Why not on the next ACKs too?
> > So, we can drop entries in ES state too.
> >
>
>
> We need to fix the INPUT table too, switching from SYN_RECV state to EST
> when receiving ACK from the client.

        If we can resurrect entries we can restore the old
behavior (before 0.9.9). Otherwise, we can't follow the client. We
can be flooded with two packets, SYN and ACK. This is worse:
we time out after 15 minutes, while the SYN attack times out after
1 minute. Yes, SR->ES on the client's ACK is correct, but we don't
know if the connection is accepted in the real server. We can
be fooled into setting the ES state. But currently, MASQ can be fooled
by a 3rd "client" to:

- set the state to SR or ES (flood attacks)
- set the state to CW/CL via FIN/RST (hijacking, even without being
a man-in-the-middle)

        This is because the MASQ box checks only the flags and
not the protocol data. We can at least check the flags from
the real server, but this leads to delayed transitions to the ES state:
we can stay in SR if there is no data transferred, e.g. the last
SR/SS->ES state changes.

> > For example (VS/NAT 0.9.9):
> >
> >     CLIENT          SERVER  SERVER STATE            NEW MASQ STATE
> >
> > 1.  SYN     ------>                                 => SR
> > 2.          <------ SYN+ACK SYN_RECV                => SR
> > 3.  ACK     ------>         EST                     => SR
>
> I think that we need to fix the INPUT table here.

        We can, but this is dangerous: an ACK after a SYN attack.

> I think that we better verify the whole INPUT and OUTPUT state transition
> tables for both IP Masquerading and IPVS.
>
> /*    INPUT */
> /*        mNO, mES, mSS, mSR, mFW, mTW, mCL, mCW, mLA, mLI      */
> /*syn*/ {{mSR, mES, mES, mSR, mSR, mSR, mSR, mSR, mSR, mSR }},
> /*fin*/ {{mCL, mCW, mSS, mTW, mTW, mTW, mCL, mCW, mLA, mLI }},
> /*ack*/ {{mCL, mES, mSS, mES, mFW, mTW, mCL, mCW, mCL, mLI }},
> /*rst*/ {{mCL, mCL, mCL, mSR, mCL, mCL, mCL, mCL, mLA, mLI }},
>
> /*    OUTPUT */
> /*        mNO, mES, mSS, mSR, mFW, mTW, mCL, mCW, mLA, mLI      */
> /*syn*/ {{mSS, mES, mSS, mSR, mSS, mSS, mSS, mSS, mSS, mLI }},
> /*fin*/ {{mTW, mFW, mSS, mTW, mFW, mTW, mCL, mTW, mLA, mLI }},
> /*ack*/ {{mES, mES, mSS, mES, mFW, mTW, mCL, mCW, mLA, mES }},
> /*rst*/ {{mCL, mCL, mSS, mCL, mCL, mTW, mCL, mCL, mCL, mCL }},
>
> Is it right now?

        It is right. The only change I see after 0.9.9 is
"SR->ES after client's ACK". But we are still vulnerable to a SYN
packet followed by an ACK packet from the client.

        In the next days I'll try to implement and test
resurrection for the VS/NAT entries (when not busy).
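
        To make the SYN+ACK problem concrete, here is a small
sketch that walks the INPUT table quoted above. The table values are
copied from that table. The names tcp_event, input_next and input_run
and the mSTATES/evCOUNT counters are hypothetical helpers added only
for this illustration.

```c
enum masq_state {
    mNO, mES, mSS, mSR, mFW, mTW, mCL, mCW, mLA, mLI,
    mSTATES  /* number of states, helper for array sizing */
};

enum tcp_event { evSYN, evFIN, evACK, evRST, evCOUNT };

/* INPUT table: packets from the client, rows by flag, columns by state. */
static const enum masq_state input_table[evCOUNT][mSTATES] = {
/*            mNO, mES, mSS, mSR, mFW, mTW, mCL, mCW, mLA, mLI */
/* syn */   { mSR, mES, mES, mSR, mSR, mSR, mSR, mSR, mSR, mSR },
/* fin */   { mCL, mCW, mSS, mTW, mTW, mTW, mCL, mCW, mLA, mLI },
/* ack */   { mCL, mES, mSS, mES, mFW, mTW, mCL, mCW, mCL, mLI },
/* rst */   { mCL, mCL, mCL, mSR, mCL, mCL, mCL, mCL, mLA, mLI },
};

/* One transition for a packet seen from the client side. */
enum masq_state input_next(enum masq_state cur, enum tcp_event ev)
{
    return input_table[ev][cur];
}

/* Apply a sequence of client packets starting from the NO state. */
enum masq_state input_run(const enum tcp_event *evs, int n)
{
    enum masq_state s = mNO;
    int i;

    for (i = 0; i < n; i++)
        s = input_next(s, evs[i]);
    return s;
}
```

With this table a client that sends only SYN and then ACK moves the
entry NO -> SR -> ES without any packet from the real server being
checked, which is the case described above.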


Regards

--
Julian Anastasov <uli@xxxxxxxxxxxxxxxxxxxxxx>
