Hi Mark,
Excellent problem report!
We have a load balancer (with the lvs kernel stuff) at 100.1.1.1, with a
second IP address 100.1.1.2.
We have two mail servers at 120.1.1.1 and 120.1.1.2. The load balancer
is supposed to balance connections between the two mailservers. We have
another load balancer at 130.1.1.1 which works fine, but the new load
balancer is set up seemingly the same and yet it just does not work.
Load balancer configuration (100.1.1.2)
===========================
net.ipv4.conf.all.forwarding = 1
eth0 has 100.1.1.1, eth0:0 has 100.1.1.2
And their netmasks are /24 and /32, respectively?
# ipvsadm --list -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 100.1.1.2:25 wlc
-> 120.1.1.1:25 Tunnel 1 0 0
-> 120.1.1.2:25 Tunnel 1 0 0
iptables has no rules and is default-to-accept. There is no firewall in
front of the box.
Mail server 1 (120.1.1.1)
=================
relevant iptables rules:
$IPTABLES -A INPUT -i eth0 -s 100.1.1.2 -p ipencap -j ACCEPT
$IPTABLES -A INPUT -i tunl0 -p tcp --dport smtp -j ACCEPT
Why do you need those rules if you have no other netfilter rules and
an ACCEPT policy?
Mail daemon listening on all IPs:
# netstat -natp |grep TEN |grep 25
tcp 0 0 0.0.0.0:25 0.0.0.0:*
LISTEN 14505/exim4
Excellent.
tunl0:0 is the tunnel interface for the existing load balancer (that works)
# ifconfig tunl0:0
tunl0:0 Link encap:IPIP Tunnel HWaddr
inet addr:130.1.1.2 Mask:255.255.255.255
UP RUNNING NOARP MTU:1480 Metric:1
tunl0:3 is the tunnel interface for the new load balancer that doesn't work
# ifconfig tunl0:3
tunl0:3 Link encap:IPIP Tunnel HWaddr
inet addr:100.1.1.2 Mask:255.255.255.255
UP RUNNING NOARP MTU:1480 Metric:1
Mail server 2 (120.1.1.2)
=================
Same as mailserver 1
The current load balancer at 130.1.1.1 uses 130.1.1.2 for load balancing
inbound smtp/25 connections to the two mailservers. If I telnet to
130.1.1.2 from my work machine at 140.1.1.1, this is the tcpdump sequence:
I'm a bit confused by your obfuscation technique :), what's the
designation for the servers regarding the obfuscated IP ranges in
100.x.x.x, the 120.x.x.x, the 130.x.x.x and the 140.x.x.x?
140: your test machine
130: working LVS tunnel
120: RS (mail server)
100: new (non-functional) LVS tunnel
Is my observation correct?
load 1
10:28:34.904382 IP 140.1.1.1.3948 > 130.1.1.2.25: S
3712043866:3712043866(0) win 65535 <mss 1260,nop,nop,sackOK>
10:28:34.909107 IP 140.1.1.1.3948 > 130.1.1.2.25: . ack 151923593 win 65535
10:28:55.134362 IP 140.1.1.1.3948 > 130.1.1.2.25: . ack 52 win 65484
mail 2 eth0 (this could be mail 1 here, in the example the connection
was passed to mail 2)
10:28:34.583491 IP 130.1.1.2.25 > 140.1.1.1.3948: S
151923592:151923592(0) ack 3712043867 win 5840 <mss 1460,nop,nop,sackOK>
10:28:54.608731 IP 130.1.1.2.25 > 140.1.1.1.3948: P 1:52(51) ack 1 win 5840
mail 2 tunl0
10:28:34.583459 IP 140.1.1.1.3948 > 130.1.1.2.25: S
3712043866:3712043866(0) win 65535 <mss 1260,nop,nop,sackOK>
10:28:34.588206 IP 140.1.1.1.3948 > 130.1.1.2.25: . ack 151923593 win 65535
10:28:54.813191 IP 140.1.1.1.3948 > 130.1.1.2.25: . ack 52 win 65484
This tcpdump shows a full tcp connection via the working load balancer
to one of the mail servers, in this case mail 2. The SMTP servers are
configured to pause for 20 seconds before showing their banner, which
accounts for the delay between the packets.
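So this works perfectly, as shown above, which indicates that you did
get LVS to work at one point. Side note: the clock on your LVS box seems
to be a bit out of sync with the mail servers (the SYN shows up on mail 2
about 0.3 s before the load balancer sees it); without that offset the
trace would look odd.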
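To put a rough number on that offset, one can subtract the timestamps of the same SYN (seq 3712043866) in the two traces. This is only a sketch: ts_diff is a made-up helper name, and it assumes both timestamps fall on the same day.

```shell
#!/bin/sh
# ts_diff: difference in seconds between two tcpdump timestamps (HH:MM:SS.usec).
# Hypothetical helper for this sketch; assumes both stamps are from the same day.
ts_diff() {
    printf '%s %s\n' "$1" "$2" | awk '{
        split($1, a, ":"); split($2, b, ":")
        t1 = a[1] * 3600 + a[2] * 60 + a[3]
        t2 = b[1] * 3600 + b[2] * 60 + b[3]
        printf "%.6f\n", t1 - t2
    }'
}

# SYN as seen on the load balancer vs. the same SYN on mail 2 tunl0
ts_diff 10:28:34.904382 10:28:34.583459   # prints 0.320923
```

A positive result means the first clock is ahead, so the load balancer is at least about 0.32 s ahead of mail 2 (more, once the transit time is added).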
Now, if I try the same thing but telnet to 100.1.1.2:25 (the new load
balancer), the connection times out. tcpdumps show:
Care to show the whole ipvsadm -L -n output? Or is the one above
representative enough to display the problem?
load balancer
=========
11:01:48.231327 IP 140.1.1.1.4042 > 100.1.1.2.25: S
1821058469:1821058469(0) win 65535 <mss 1260,nop,nop,sackOK>
11:01:51.195252 IP 140.1.1.1.4042 > 100.1.1.2.25: S
1821058469:1821058469(0) win 65535 <mss 1260,nop,nop,sackOK>
11:01:57.230423 IP 140.1.1.1.4042 > 100.1.1.2.25: S
1821058469:1821058469(0) win 65535 <mss 1260,nop,nop,sackOK>
Indicates a routing or network configuration issue.
Both of the mail servers show no traffic whatsoever on eth0 or the
tunnel interface.
Looks like the scheduler is not invoked or the packet does not match the
configuration.
On the load balancer, /proc/sys/net/ipv4/vs/debug_level is set to 9 and
the following messages were observed in syslog:
Excellent:
Mar 29 11:01:48 dev1 kernel: IPVS: lookup/in TCP
140.1.1.1:4042->100.1.1.2:25 not hit
Mar 29 11:01:48 dev1 kernel: IPVS: lookup service: fwm 0 TCP
100.1.1.2:25 hit
Now this is very very weird. The normal TCP service lookup did not
succeed, although it should have, but the FWM TCP service lookup did.
Are you sure that:
a) You have cleanly shut down IPVS (rmmod ip_vs if necessary) between
the working and the non-working test runs?
b) You have no iptables or iproute2 rules indicating firewall marks?
c) You have no port 0 service set up?
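A quick way to check b) and c) against the IPVS table is to filter the ipvsadm listing for firewall-mark and port 0 services. The following is only a sketch: suspect_services is a made-up helper name, and it assumes the usual "ipvsadm -L -n" layout, where fwmark services start with FWM and real servers with "->".

```shell
#!/bin/sh
# suspect_services: read "ipvsadm -L -n" output on stdin and print virtual
# services that are firewall-mark based (FWM) or defined on port 0.
# Hypothetical helper; header and real-server ("->") lines are ignored.
suspect_services() {
    awk '
        $1 == "FWM" { print "fwmark service: " $0; next }
        ($1 == "TCP" || $1 == "UDP") && $2 ~ /:0$/ { print "port 0 service: " $0 }
    '
}

# The listing from the report: nothing suspicious, so this prints nothing.
suspect_services <<'EOF'
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 100.1.1.2:25 wlc
-> 120.1.1.1:25 Tunnel 1 0 0
-> 120.1.1.2:25 Tunnel 1 0 0
EOF

# On the director itself (as root) one would rather run:
#   ipvsadm -L -n | suspect_services
#   iptables -t mangle -L -n     # any MARK rules?
#   ip rule show                 # any fwmark-based routing rules?
#   lsmod | grep ip_vs           # is the module still loaded between tests?
```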
Mar 29 11:01:48 dev1 kernel: IPVS: ip_vs_wlc_schedule(): Scheduling...
Mar 29 11:01:48 dev1 kernel: IPVS: WLC: server 120.1.1.1:25 activeconns
0 refcnt 1 weight 1 overhead 0
Mar 29 11:01:48 dev1 kernel: IPVS: Bind-dest TCP c:140.1.1.1:4042
v:100.1.1.2:25 d:120.1.1.1:25 fwd:T s:0 conn->flags:182 conn->refcnt:1
dest->refcnt:2
Mar 29 11:01:48 dev1 kernel: IPVS: Schedule fwd:T c:140.1.1.1:4042
v:100.1.1.2:25 d:120.1.1.1:25 conn->flags:1C2 conn->refcnt:2
This looks like it would happily send it.
Mar 29 11:01:48 dev1 kernel: IPVS: TCP input [S...]
120.1.1.1:25->140.1.1.1:4042 state: NONE->SYN_RECV conn->refcnt:2
Ok, we do the state transition indicating that we've allocated the
connection structure for the hash table entry.
Mar 29 11:01:51 dev1 kernel: IPVS: lookup/in TCP
140.1.1.1:4042->100.1.1.2:25 hit
Second SYN as seen in your non-functional tcpdump trace.
Mar 29 11:01:57 dev1 kernel: IPVS: lookup/in TCP
140.1.1.1:4042->100.1.1.2:25 hit
Third SYN as seen in your non-functional tcpdump trace.
Mar 29 11:02:04 dev1 kernel: IPVS: Unbind-dest TCP c:140.1.1.1:4039
v:100.1.1.2:25 d:120.1.1.2:25 fwd:T s:3 conn->flags:182 conn->refcnt:1
dest->refcnt:2
This does not belong to the trace above: it is for client port 4039,
which must have come from a test performed before you took the trace.
Most likely this one ran into the normal 60 sec timeout.
I really am at a loss as to why this doesn't work. The debug log seems
to show IPVS passing traffic to mail 1 (120.1.1.1); however, the tcpdump
for that server shows absolutely nothing. If anyone can point me in the
right direction here I would be very grateful.
Can you show your routing information on your LVS? As well as the tun*
device configuration in the proc-fs?
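If it helps, something along these lines would collect the proc-fs part in one go. It is a sketch only: dump_tun_conf is a made-up helper name, and it assumes the per-interface IPv4 settings under /proc/sys/net/ipv4/conf are the ones of interest.

```shell
#!/bin/sh
# dump_tun_conf: print every per-interface IPv4 setting of tun* devices as
# "iface/key = value". The directory is a parameter so the sketch can be
# tried on a copy of the tree; it defaults to the usual proc-fs location.
dump_tun_conf() {
    conf_dir=${1:-/proc/sys/net/ipv4/conf}
    for f in "$conf_dir"/tun*/*; do
        [ -f "$f" ] || continue          # skip unmatched globs and subdirs
        iface=$(basename "$(dirname "$f")")
        printf '%s/%s = %s\n' "$iface" "$(basename "$f")" "$(cat "$f" 2>/dev/null)"
    done
}
```

Run it as dump_tun_conf on the box in question. For the routing side, the output of "ip route show" and "ip route get 120.1.1.1" would be the interesting bits.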
Best regards,
Roberto Nibali, ratz
--
echo
'[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq' | dc