I'm forwarding the report from
https://bugzilla.netfilter.org/show_bug.cgi?id=1669 here, since it was
pointed out there that this list would be more appropriate.
When using an ipvs service in combination with SNAT and a NOTRACK rule,
specific circumstances can lead to the TCP ports of packets being changed
mid-stream. The result is a connection that is established successfully
but cannot carry any data.
Consider the following example:
root@router:~# sysctl net.ipv4.vs.conntrack
net.ipv4.vs.conntrack = 1
root@router:~# iptables -t raw -L -n -v
Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source
24 1296 CT tcp -- enp1s0 * tcp dpt:1234 NOTRACK
root@router:~# iptables -t nat -L -n -v
Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source
Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source
Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source
4 240 SNAT all -- * * to:
Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source
root@router:~# ipvsadm -L -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP rr
-> Masq 1 0 0
-> Masq 1 0 0
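For anyone who wants to recreate this state, the listings above correspond
roughly to the commands below. This is only a sketch: the addresses are
missing from the listings, so VIP, SNAT_IP, RS1 and RS2 are hypothetical
placeholders, and the unrestricted SNAT match and the real-server port 1234
are assumptions. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only: VIP, SNAT_IP, RS1 and RS2 are hypothetical placeholders,
# because the addresses are missing from the listings above.
VIP="${VIP:-192.0.2.1}"
SNAT_IP="${SNAT_IP:-192.0.2.1}"
RS1="${RS1:-198.51.100.11}"
RS2="${RS2:-198.51.100.12}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_router() {
    # Let conntrack see ipvs connections as well.
    run sysctl -w net.ipv4.vs.conntrack=1
    # Exempt client packets to the service port from connection tracking.
    run iptables -t raw -A PREROUTING -i enp1s0 -p tcp --dport 1234 -j CT --notrack
    # Source-NAT outgoing traffic (assumed to match everything).
    run iptables -t nat -A POSTROUTING -j SNAT --to-source "$SNAT_IP"
    # Round-robin virtual service with two masqueraded real servers.
    run ipvsadm -A -t "$VIP:1234" -s rr
    run ipvsadm -a -t "$VIP:1234" -r "$RS1:1234" -m -w 1
    run ipvsadm -a -t "$VIP:1234" -r "$RS2:1234" -m -w 1
}

setup_router
```

Running it with DRY_RUN=0 as root, and with the placeholders set to real
addresses, would apply the rules instead of printing them.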
The real servers are running
socat TCP4-LISTEN:1234,fork 'EXEC:sh -c
We dump the network traffic between the router and client on the ipvs
router as follows:
root@router:~# tcpdump -pXXni enp1s0 icmp or tcp -w
tcpdump: listening on enp1s0, link-type EN10MB (Ethernet), capture size
262144 bytes
^C16 packets captured
16 packets received by filter
0 packets dropped by kernel
While the capture is running, we run the following commands on a client
to trigger the buggy behavior:
root@debian:~# netcat -p 4321 -v 10.0.01 1234
Connection to 10.0.01 1234 port [tcp/*] succeeded!
root@debian:~# sleep 60
root@debian:~# netcat -p 4321 -v 10.0.01 1234
Connection to 10.0.01 1234 port [tcp/*] succeeded!
We can see that on the first connection attempt we successfully receive
a reply with payload from the server and then terminate the connection
with Ctrl+C. We then wait 60 seconds, which is necessary for the
previous connection to leave the TIME_WAIT state. Afterwards we open
another connection, reusing the same source port as for the first
connection, and do not receive a reply from the server. The captured
traffic shows that, after the three-way handshake of the second TCP
connection, packets from the router to the client use a different server
port than the one used when the connection was initiated.
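One way to pick the affected packets out of such a capture is a pcap
filter that matches TCP packets sent from the virtual address with a
source port other than the service port. The helper below is a minimal
sketch; the address 192.0.2.1 and the file name capture.pcap are
hypothetical placeholders.

```shell
#!/bin/sh
# Build a pcap filter matching TCP packets sent from the virtual address
# whose source port is not the service port.
# Arguments: $1 = virtual IP (hypothetical placeholder), $2 = service port.
mismatch_filter() {
    printf 'tcp and src host %s and not src port %s\n' "$1" "$2"
}

# Usage (capture.pcap is a hypothetical file name):
#   tcpdump -nr capture.pcap "$(mismatch_filter 192.0.2.1 1234)"
mismatch_filter 192.0.2.1 1234
```

Any packet printed by tcpdump with this filter was sent towards the
client from a port other than the one the client connected to.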
Description: application/vnd.tcpdump.pcap