Hello,
I am looking for help with an intermittent issue with IPVS source IP
selection in tunnel (TUN) mode. We run LVS directly in a Kubernetes cluster,
potentially on the same machines as the backend workloads, so in many cases
the LVS director is also a real server. Keepalived runs active/passive on a
pair of VMs and maintains a VIP on the public interface for some level of
HA. We periodically see an unexpected source IP on the IPinIP tunnelled
traffic leaving the LVS director when using source hashing (sh)
scheduling.
This problem has been noticed in the following environment:
- Virtual Machines running Red Hat Enterprise Linux
- Kernel version 3.10.0-1062.18.1.el7.x86_64
- IPVS version 1.2.1 with TUN and source hash (sh) scheduling.
- OpenShift 4.3
- OpenShift 3.11
An example from a 3 node Kubernetes cluster:
- 10.221.95.10
- 10.221.95.2
- 10.221.95.5
with Linux director running on 10.221.95.2 and virtual service directing
traffic to:
- localhost
- remote node 10.221.95.5
The local endpoint/real server uses direct routing (Route). The remote real
server uses IPinIP tunnelling (Tunnel).
LVS Configuration:
# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  169.46.4.90:80 sh
  -> 10.221.95.5:80               Tunnel  1      0          0
  -> 127.0.0.11:80                Route   1      1          0
/ # ipvsadm -Lnc
IPVS connection entries
pro expire state       source               virtual        destination
TCP 14:44  ESTABLISHED 128.92.120.147:56212 169.46.4.90:80 127.0.0.11:80
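
For anyone who wants to check the same thing on their own director, here is a minimal Python sketch that reads `ipvsadm -Ln` output and lists which real servers are reached by tunnel. The function names (`parse_ipvsadm`, `tunnel_reals`) are my own, and the parser only assumes the whitespace-separated column layout shown above. It is not a general ipvsadm parser.

```python
# Sample copied from the `ipvsadm -Ln` output above.
SAMPLE = """\
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 169.46.4.90:80 sh
  -> 10.221.95.5:80 Tunnel 1 0 0
  -> 127.0.0.11:80 Route 1 1 0
"""


def parse_ipvsadm(text):
    """Return a list of virtual services, each with its real servers."""
    services = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] in ("TCP", "UDP", "SCTP") and len(fields) >= 3:
            # Virtual service line: protocol, VIP:port, scheduler.
            services.append({
                "proto": fields[0],
                "vip": fields[1],
                "scheduler": fields[2],
                "reals": [],
            })
        elif (fields[0] == "->" and services and len(fields) >= 6
              and fields[1] != "RemoteAddress:Port"):
            # Real server line: address, forwarding method, weight, counters.
            services[-1]["reals"].append({
                "addr": fields[1],
                "forward": fields[2],
                "weight": int(fields[3]),
            })
    return services


def tunnel_reals(services):
    """Addresses of real servers that use the Tunnel forwarding method."""
    return [r["addr"] for s in services for r in s["reals"]
            if r["forward"] == "Tunnel"]


if __name__ == "__main__":
    print(tunnel_reals(parse_ipvsadm(SAMPLE)))
```

Only connections scheduled to one of the addresses this prints are encapsulated, so those are the ones to look at when hunting for the bad outer source IP.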
Traffic that passes through the LVS and stays on the same node establishes
connections successfully. The problem arises when traffic from a particular
source IP is scheduled to a remote real server and has to transit the
tunnel.
EXPECTED BEHAVIOR: IPVS encapsulates the traffic with IPinIP using the IP
address from the private interface of the VM (10.X.X.X). Example traffic
successfully balanced from LVS director VM 10.221.95.2 to remote real server
10.221.95.5:
# tcpdump -n -i eth0 host 10.221.95.2 and proto 4
13:58:28.151571 IP 10.221.95.2 > 10.221.95.5: IP 52.117.148.54.64369 >
169.46.4.90.80: Flags [S], seq 180302151, win 65535, options [mss
1460,sackOK,TS val 590414746 ecr 0,nop,wscale 9], length 0 (ipip-proto-4)
13:58:28.152447 IP 10.221.95.2 > 10.221.95.5: IP 52.117.148.54.64369 >
169.46.4.90.80: Flags [.], ack 2964164084, win 128, options [nop,nop,TS val
590414747 ecr 89050127], length 0 (ipip-proto-4)
13:58:28.152467 IP 10.221.95.2 > 10.221.95.5: IP 52.117.148.54.64369 >
169.46.4.90.80: Flags [P.], seq 0:75, ack 1, win 128, options [nop,nop,TS
val 590414747 ecr 89050127], length 75: HTTP: GET / HTTP/1.1 (ipip-proto-4)
13:58:28.154037 IP 10.221.95.2 > 10.221.95.5: IP 52.117.148.54.64369 >
169.46.4.90.80: Flags [.], ack 723, win 131, options [nop,nop,TS val
590414749 ecr 89050129], length 0 (ipip-proto-4)
NOTE: The above trace was captured after working around the issue (see
below) and shows only the inbound traffic from the LVS director. With DSR,
the response goes back to the client directly out eth1.
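
As a point of comparison, this minimal Python sketch asks the kernel which source address an ordinary socket would use to reach a given destination. The helper name `preferred_source` is my own. Connecting a UDP socket sends no packets; it only performs the route lookup. This is an assumption-laden diagnostic: it shows the result of normal socket routing, which may or may not match the lookup IPVS performs when it builds the outer IPinIP header.

```python
import socket


def preferred_source(dest_ip, port=9):
    """Return the local address the kernel selects for traffic to dest_ip.

    A UDP connect() does not transmit anything; it just binds the socket
    to the source address chosen by the routing table.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.connect((dest_ip, port))
        return s.getsockname()[0]


if __name__ == "__main__":
    # On the director from the example, I would expect the eth0 address
    # (10.221.95.2) here, but that depends on the local routing table.
    print(preferred_source("10.221.95.5"))
```

If this prints the private address while the tunnelled packets still carry 127.X.255.255, the routing table itself is probably not what is supplying the bad source.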
OBSERVED BEHAVIOR: IPVS encapsulates the traffic with a source IP of the
form 127.X.255.255. Running tcpdump on the remote real server
(10.221.95.5):
# tcpdump -n -i eth0 net 127.0.0.0/8 and proto 4
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
23:43:34.065782 IP 127.138.255.255 > 10.221.95.5: IP 52.117.148.54.3595 >
169.46.4.90.80: Flags [S], seq 146570019, win 65535, options [mss
1460,sackOK,TS val 539120382 ecr 0,nop,wscale 9], length 0 (ipip-proto-4)
23:43:35.065967 IP 127.138.255.255 > 10.221.95.5: IP 52.117.148.54.3595 >
169.46.4.90.80: Flags [S], seq 146570019, win 65535, options [mss
1460,sackOK,TS val 539121383 ecr 0,nop,wscale 9], length 0 (ipip-proto-4)
23:43:37.082042 IP 127.138.255.255 > 10.221.95.5: IP 52.117.148.54.3595 >
169.46.4.90.80: Flags [S], seq 146570019, win 65535, options [mss
1460,sackOK,TS val 539123399 ecr 0,nop,wscale 9], length 0 (ipip-proto-4)
23:43:41.306020 IP 127.138.255.255 > 10.221.95.5: IP 52.117.148.54.3595 >
169.46.4.90.80: Flags [S], seq 146570019, win 65535, options [mss
1460,sackOK,TS val 539127623 ecr 0,nop,wscale 9], length 0 (ipip-proto-4)
The arriving IPinIP tunnelled traffic has an outer source IP of
127.138.255.255, which is NOT expected, and the client SYN is only ever
retransmitted. This is accompanied by kernel logs like:
kernel: IPv4: martian source 10.X.X.X from 127.X.255.255, on dev eth0
Why is IPVS selecting this source IP for tunnelled traffic instead of the IP
from the private interface (eth0) of the VM?
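
To spot affected packets in a longer capture, a minimal Python sketch like the following can flag tcpdump lines whose outer IPinIP source is in 127.0.0.0/8. The names `outer_addresses` and `has_loopback_outer_source` are my own, and the regular expression assumes the `IP <outer src> > <outer dst>: IP ...` line format shown in the traces above.

```python
import ipaddress
import re

# Matches the outer header of an IPinIP line as printed by `tcpdump -n`.
OUTER = re.compile(r"IP (\d+\.\d+\.\d+\.\d+) > (\d+\.\d+\.\d+\.\d+): IP ")


def outer_addresses(line):
    """Return (outer_src, outer_dst) as IPv4Address objects, or None."""
    m = OUTER.search(line)
    if m is None:
        return None
    return ipaddress.ip_address(m.group(1)), ipaddress.ip_address(m.group(2))


def has_loopback_outer_source(line):
    """True if the outer source address is in 127.0.0.0/8."""
    addrs = outer_addresses(line)
    return addrs is not None and addrs[0].is_loopback


if __name__ == "__main__":
    good = "13:58:28.151571 IP 10.221.95.2 > 10.221.95.5: IP 52.117.148.54.64369 >"
    bad = "23:43:34.065782 IP 127.138.255.255 > 10.221.95.5: IP 52.117.148.54.3595 >"
    for line in (good, bad):
        print(has_loopback_outer_source(line), line)
```

A loopback source arriving on a non-loopback interface is what the receiving kernel reports as a martian source, which matches the log line above.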
We have noticed that the problem goes away after either of the following:
- triggering an LB failover (keepalived moves the VIP to the other node and
IPVS has to be reprogrammed)
- creating a new LB (again, IPVS has to be reprogrammed to include the new
virtual service)
I understand that several factors are at play here and I will continue to
try to isolate the cause, but any insight into how IPVS selects the source
IP would be much appreciated.
Best,
Calvin
_______________________________________________
Please read the documentation before posting - it's available at:
http://www.linuxvirtualserver.org/
LinuxVirtualServer.org mailing list - lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Send requests to lvs-users-request@xxxxxxxxxxxxxxxxxxxxxx
or go to http://lists.graemef.net/mailman/listinfo/lvs-users