On Sat, 25 Mar 2000, Wensong Zhang wrote:
> On Fri, 24 Mar 2000, Julian Anastasov wrote:
> > > We cannot resurrect entries for LVS/NAT, because we cannot get more
> > > information from packets from the real servers, we don't know which
> > > virtual service the services belong to (the IP address and port number of
> > > the virtual service).
> > Do we support the same raddr/rport for many virtual services?
> > If this is true, we really can't restore the virtual service. But
> > it is useless to add one real service to many virtual services
> I don't know whether there is such an application, but there is no problem
> adding one real service to many virtual services for VS/NAT. ;-)
Yes, currently it is possible. But I don't know why it would
be useful :)
> > for VS/NAT. For VS/DR and VS/TUN it is useful for a real service to
> > belong to many virtual services. Yep, there is a local node feature
> > too that must be considered when we drop entries. Maybe it is
> > better not to drop IP_MASQ_F_VS_LOCALNODE entries.
> Maybe that is not good: if we don't drop IP_MASQ_F_VS_LOCALNODE entries
> while under attack, there may be a large number of IP_MASQ_F_VS_LOCALNODE
> entries in SYN_RCV state.
> > But for VS/NAT we receive the packet from the real service,
> > i.e. saddr=raddr, sport=rport, daddr=caddr, dport=cport. We
> > can search for the real server by saddr/sport. If we reorganize
> > the tables we can achieve that.
> > It can't work for ftp sessions, i.e. we can't
> > resurrect them if we can't find the service. When we resurrect
> > the entries in ip_fw_masquerade, if the packet doesn't belong
> > to a real service (or MASQ) we can drop it.
> > In fact, the MASQ can't resurrect entries but LVS/NAT
> > can, if the v/r service still exists.
> Since the entries that need resurrecting in LVS/NAT may be just a tiny
> portion of the whole, maybe the load balancer can simply send an ICMP
> packet to the real server saying that the client is not reachable, and the
> server can reclaim resources quickly. For normal users of those entries, we
> are sorry that their connections are broken because we are under attack;
> they need to establish the connections again and they should still have a
> chance to access the service. It is simple to handle.
Yes, maybe we can notify the real server from ip_fw_masquerade.
But only if we drop entries after passing the packets. Read the
> > be fooled to set the ES state. But currently, MASQ can be fooled
> > by a third "client" to:
> > - set the state to SR or ES (flood attacks)
> > - set the state to CW/CL via FIN/RST (hijacking, not even from
> > a man-in-the-middle)
> > This is because the MASQ box checks only the flags and
> > not the protocol data. We can at least check the flags from
> > the real server but this leads to delayed transitions to ES state:
> > we can stay in SR if there is no data transferred, e.g. the last
> > SR/SS->ES state changes.
> Yeah, it is true. The MASQ box just checks the flags (SYN, ACK, RST and
> FIN) to do TCP state transition, without checking the sequence number. It
> is vulnerable to an attack in which a SYN is followed by an ACK.
> For VS/NAT, we may record the sequence number of the SYN+ACK packet from
> the real server, then check the sequence number of the ACK packet from the
> client before entering the ES state. Or, we may delay the transition to the
> ES state until data is transferred; it does not fit the TCP finite state
> machine, but it might work.
I think it is better to delay the transition to ES state.
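To make the two options concrete, here is a minimal userspace sketch in C. It is
not kernel code: the struct, the enum and all function names are hypothetical and
exist only for illustration. Variant 1 is the sequence-number check (the client's
ACK must acknowledge the real server's SYN+ACK), variant 2 is the delayed
transition (we ignore the client's ACK and wait for an ACK from the real server).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified states; not the kernel's MASQ state table. */
enum vs_state { VS_NONE, VS_SR, VS_ES };

struct vs_conn {
    enum vs_state state;
    bool have_synack;     /* SYN+ACK from the real server was seen */
    uint32_t synack_seq;  /* sequence number of that SYN+ACK */
};

/* SYN from the client: enter SR and forget any earlier SYN+ACK. */
void on_client_syn(struct vs_conn *c)
{
    assert(c != NULL);
    c->state = VS_SR;
    c->have_synack = false;
    c->synack_seq = 0;
}

/* SYN+ACK from the real server: remember its sequence number. */
void on_server_synack(struct vs_conn *c, uint32_t seq)
{
    assert(c != NULL);
    if (c->state == VS_SR) {
        c->have_synack = true;
        c->synack_seq = seq;
    }
}

/* Variant 1: the client's ACK must acknowledge synack_seq + 1. */
void on_client_ack_checked(struct vs_conn *c, uint32_t ack_seq)
{
    assert(c != NULL);
    if (c->state == VS_SR && c->have_synack &&
        ack_seq == (uint32_t)(c->synack_seq + 1u))
        c->state = VS_ES;
}

/* Variant 2: do not trust the client at all; enter ES only when the
 * real server sends an ACK after its SYN+ACK. */
void on_server_ack(struct vs_conn *c)
{
    assert(c != NULL);
    if (c->state == VS_SR && c->have_synack)
        c->state = VS_ES;
}
```

With either variant, a forged ACK that arrives before the real server has
answered leaves the entry in SR, so it still expires with the short SR timeout.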
> However, for VS/TUN and VS/DR, the load balancer is on the
> client-to-server half connection; it cannot get the sequence number of
> the SYN+ACK packet from the real server as in VS/NAT, and it cannot delay
> the transition to the ES state. So, it is still vulnerable to the attack in
> which a SYN is followed by an ACK.
Someone please help with VS/DR and VS/TUN. We can't :)
We can only drop SYN packets without passing them, I think.
> So, what solution applies to all of these situations? I think that
> maybe we can combine dropping entries and dropping 1/rate packets (as you
> proposed), just in order to let the system have memory for new
> connections. Anyway, the more memory the box has, the better. ;-) And, we
> can tell users to use "ipchains -M -S ..." to set the smallest possible
> values too. ;-)
It seems that we can't drop entries in SR state after passing
the SYN packet to the real server. Currently, when the real server answers
with SYN+ACK, ip_fw_masquerade() creates a new entry with ip_masq_new().
Is that good?
In fact, if we start to drop entries (of any kind), we have
to modify ip_fw_masquerade to check if the packet comes from our
real server. Otherwise, we start to create MASQ entries to the real
servers with mport=MASQ_port (first free>61000) which is never used.
I.e. we create zombie entries in SS or ES state. Maybe we can notify
the real server here, but only if we support unique real services under
VS/NAT, i.e. one raddr/rport to be used by one virtual service.
Maybe we can just drop packets coming from our real service.
Usually the SYN packets are retransmitted if not answered
soon. This is a very bad situation for the VS/DR and VS/TUN methods. If
we drop the SR entry for a real server after passing the packet, the next
SYN is sent to another real server and the client is confused by
two different SYN+ACK packets/cookies coming from two real servers.
Maybe the SYN packets must be dropped without passing them to
the real server, i.e. by using a drop rate.
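A minimal sketch of such a drop rate, in plain userspace C. The struct and
function names are hypothetical, and a simple counter (drop one SYN out of every
"rate") is only one possible policy; a random choice would work as well. The
decision is made before the SYN is passed on, so no SR entry is created for a
dropped packet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical drop-rate state: drop 1 of every `rate` SYN packets. */
struct syn_dropper {
    unsigned int rate;   /* 0 disables dropping, 1 drops every SYN */
    unsigned int count;  /* SYNs seen since the last drop */
};

/* Returns true if this SYN must be dropped without being passed
 * to a real server and without creating an entry. */
bool syn_should_drop(struct syn_dropper *d)
{
    assert(d != NULL);
    if (d->rate == 0)
        return false;
    if (++d->count >= d->rate) {
        d->count = 0;
        return true;
    }
    return false;
}
```

With rate=3 the third, sixth, ninth, ... SYN is dropped. A client that
retransmits its SYN still gets through, and it never sees SYN+ACK packets from
two different real servers.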
As for the resurrection of the entries, the only problem is that
we don't know when to drop the ES entries. We are not sure if the
real server will ACK soon. It is possible for the connection to freeze.
Currently, I see 3 working modes as useful for VS/NAT (for
example, via ip_vs_defense_level):
mode 0 - default mode
No packets are dropped.
Under load we switch automatically to mode 1 and then back
to mode 0 when the system is not busy
mode 1 - we are in dangerous area
A> We start to ACK the connection setup
May be when there is less than 10MB left (or configured
We have to use other timeouts and states (tables):
We have to wait 10 seconds for example in SR state.
When/if the real server replies with SYN+ACK we switch
to a new state SA (abbreviated from SYN+ACK). If the real
server doesn't use SYN cookie protection we don't see this
SYN+ACK and the entry is dropped after 10 seconds. So,
we expect SYN+ACK from the real server for 10 seconds. This
is our support for all kinds of OSes which don't
support SYN cookies, i.e. when they just ignore the extra
SYNs when their backlog is full. In fact, this is not a bad
mode for the real server if it is overloaded. But maybe
the SYN cookie support is still preferred.
The timeout for the new SA state can be 60 seconds,
same as the old SR state. Or 75? The rule here is that we
must stay in SA state until the ACK is received from the real
server to allow the transition to ES state. We can't trust the
client, so we can't switch to ES after its ACK. This is OK for
most of the services.
B> We start to drop SYN packets using rate and without passing
them to the real server.
Yep, if the above protection doesn't work it is
time to switch to a faster Director. Buy more RAM, to feed your
real servers. They accept more connections than the Director
mode 2 - This is same as mode 1 but when set from the user,
LVS can't return automatically to mode 0. Very useful when
the user thinks that he is permanently under attack or just
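The switching between the three modes can be sketched like this in userspace C.
All names are hypothetical (only the idea of a setting like ip_vs_defense_level
comes from the text above), and free memory is just a number passed in by the
caller. The 10MB threshold is the example value mentioned for mode 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for the three working modes. */
enum defense_mode {
    DEFENSE_OFF    = 0,  /* mode 0: no packets are dropped */
    DEFENSE_AUTO   = 1,  /* mode 1: entered and left automatically */
    DEFENSE_FORCED = 2   /* mode 2: set by the user, never left automatically */
};

struct defense {
    enum defense_mode mode;
    unsigned long low_mem_kb;  /* threshold, e.g. 10240 for 10MB */
};

/* Re-evaluate the mode from the amount of free memory (in KB). */
enum defense_mode defense_update(struct defense *d, unsigned long free_kb)
{
    assert(d != NULL);
    if (d->mode == DEFENSE_FORCED)
        return d->mode;  /* only the user can leave mode 2 */
    d->mode = (free_kb < d->low_mem_kb) ? DEFENSE_AUTO : DEFENSE_OFF;
    return d->mode;
}

/* True when the mode 1 protections (A> and B>) should be applied. */
bool defense_active(const struct defense *d)
{
    assert(d != NULL);
    return d->mode != DEFENSE_OFF;
}
```

A real implementation would probably want some hysteresis around the threshold
so that the box does not flap between mode 0 and mode 1.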
For the BUGS:
ip_fw_masquerade() incorrectly continues to send the packet after
ip_route_output() has failed. This is a recent MASQ bug. We must
return -1 and not fall back to the default gateway with
inet_select_addr(). We have to drop this packet; maybe the routing
cache needs tuning, so don't try to send this packet.
Julian Anastasov <uli@xxxxxxxxxxxxxxxxxxxxxx>