Graeme Fowler wrote:
On Fri, 7 Jan 2005, Horms wrote:
On Thu, Jan 06, 2005 at 03:17:11PM +0200, Johan van den Berg wrote:
<snip>
Upon setting /proc/sys/net/ipv4/vs/debug_level to 7, I noticed that
every now and again, I get a "lookup/out" entry that states that a
connection from the virtual ip on port 80 to a client port on the client
IP was "not hit". This seems to confirm that the original SYN from the
client to port 80 on the virtual ip is not stored in the IPVS connection
table, and therefore the reply to the client IP is not handled by IPVS,
but rather iptables, which causes havoc.
Yes, that does seem to confirm that the entry is not in the IPVS
connection table, and thus falls back to iptables.
Bingo. I sent a few emails in October (LVS-NAT: intermittent problem with
responses from realservers) about this exact problem but it appears that Johan
has explained the problem in slightly clearer terms than I did :)
For clarification, this is most likely occurring in ip_vs_out(),
which is a netfilter hook that reverses IPVS-NATed packets,
and thus is specific to NAT, though I don't think your problem
necessarily is.
I haven't used TUN or DR in this scenario; I am using NAT for a webserver farm.
What I saw, in very simple terms, was that every now and again traffic leaving
the cluster would miss the LVS and be NATted to the "default" address of the
load balancer - this address being different from the VIP. I worked around it
by using SNAT rules to force all packets from specific sets of hosts to be
coming from their VIP. If any packets fall through LVS and give a "not hit",
they hit the netfilter rule instead so the connection doesn't break. Whilst
clearly not that desirable, it does work so I left well alone and stopped
thinking about it - until Johan posted the same problem.
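A minimal sketch of what such a fallback SNAT rule could look like, written as a small Python helper that only builds the iptables command line and does not run it. The function name snat_rule, the subnet, the VIP and the interface name are all hypothetical placeholders, not values from the setup described above.

```python
def snat_rule(source_net, vip, out_iface):
    """Build an iptables command (as an argument list) that rewrites the
    source address of packets from source_net leaving via out_iface to vip."""
    return [
        "iptables", "-t", "nat", "-A", "POSTROUTING",
        "-s", source_net,          # realserver network (assumed)
        "-o", out_iface,           # outward-facing interface (assumed)
        "-j", "SNAT", "--to-source", vip,
    ]


# Hypothetical example values; substitute the real subnet, VIP and interface.
rule = snat_rule("192.168.1.0/24", "192.0.2.10", "eth0")
print(" ".join(rule))
```

A rule like this only catches replies that IPVS failed to match; packets that do hit the IPVS connection table are rewritten by ip_vs_out() as usual.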
Unfortunately I'm not really able to do a huge amount of debugging as the
system is in production, but I should be able to crank up
/proc/sys/net/ipv4/vs/debug_level to see what we get if necessary.
Graeme
Hi
I set the debug level to 12 and waited. After about 2 hours and about
60 MB of log, my connection showed the same problem. Here are the log
entries around my client IP and port. I'm afraid I do not
know which lines (other than the obvious ones containing other client
IPs) are really relevant, so here are roughly 20 lines of context on
either side, just to be safe that I didn't miss anything (I have marked
with **** the ones that I feel are related):
grep -A 20 -B 20 163.200.156.65:43393 kern.debug.log.save returns this:
Jan 7 10:28:27 ulweb2 kernel: IPVS: lookup/in TCP
196.31.212.138:22453->163.200.147.172:80 hit
Jan 7 10:28:27 ulweb2 kernel: IPVS: Incoming TCP
196.31.212.138:22453->163.200.147.172:80
Jan 7 10:28:27 ulweb2 kernel: Enter: ip_vs_nat_xmit, ip_vs_conn.c line 683
Jan 7 10:28:27 ulweb2 kernel: IPVS: new dst 192.168.1.1, refcnt=3, rtos=0
Jan 7 10:28:27 ulweb2 kernel: IPVS: NAT to 192.168.1.1:80
Jan 7 10:28:27 ulweb2 kernel: Leave: ip_vs_nat_xmit, ip_vs_conn.c line 818
Jan 7 10:28:27 ulweb2 kernel: Enter: ip_vs_out, ip_vs_core.c line 646
Jan 7 10:28:27 ulweb2 kernel: IPVS: lookup/out TCP
192.168.1.1:80->196.31.212.138:22453 hit
Jan 7 10:28:27 ulweb2 kernel: IPVS: O-pkt: TCP size=208
Jan 7 10:28:27 ulweb2 kernel: IPVS: Outgoing TCP
192.168.1.1:80->196.31.212.138:22453
Jan 7 10:28:27 ulweb2 kernel: Leave: ip_vs_out, ip_vs_core.c line 814
Jan 7 10:28:27 ulweb2 kernel: IPVS: lookup/in TCP
195.75.154.35:15132->163.200.147.19:80 hit
Jan 7 10:28:28 ulweb2 kernel: IPVS: Incoming TCP
195.75.154.35:15132->163.200.147.19:80
Jan 7 10:28:28 ulweb2 kernel: IPVS: TCP input [S...]
192.168.1.14:80->195.75.154.35:0 state: NONE->SYN_RECV cnt:2
Jan 7 10:28:28 ulweb2 kernel: Enter: ip_vs_nat_xmit, ip_vs_conn.c line 683
Jan 7 10:28:28 ulweb2 kernel: IPVS: NAT to 192.168.1.14:80
Jan 7 10:28:28 ulweb2 kernel: Leave: ip_vs_nat_xmit, ip_vs_conn.c line 818
Jan 7 10:28:28 ulweb2 kernel: Enter: ip_vs_out, ip_vs_core.c line 646
Jan 7 10:28:28 ulweb2 kernel: IPVS: lookup/out TCP
192.168.1.14:80->195.75.154.35:15132 not hit
Jan 7 10:28:28 ulweb2 kernel: IPVS: packet for TCP 195.75.154.35:15132
continue traversal as normal.
Jan 7 10:28:28 ulweb2 kernel: IPVS: lookup/in TCP
163.200.156.65:43393->163.200.147.172:80 hit **** Incoming SYN packet
Jan 7 10:28:28 ulweb2 kernel: IPVS: Incoming TCP
163.200.156.65:43393->163.200.147.172:80 ****
Jan 7 10:28:28 ulweb2 kernel: Enter: ip_vs_nat_xmit, ip_vs_conn.c line 683
****
Jan 7 10:28:28 ulweb2 kernel: IPVS: NAT to 192.168.1.1:80 **** Service
lookup, that NATs to the real ip
Jan 7 10:28:28 ulweb2 kernel: Leave: ip_vs_nat_xmit, ip_vs_conn.c line 818
****
Jan 7 10:28:28 ulweb2 kernel: Enter: ip_vs_out, ip_vs_core.c line 646 ****
Jan 7 10:28:28 ulweb2 kernel: IPVS: lookup/out TCP
192.168.1.1:80->163.200.156.65:43393 not hit **** and here is the server's
SYN/ACK response
Jan 7 10:28:28 ulweb2 kernel: IPVS: packet for TCP 163.200.156.65:43393
continue traversal as normal. **** IPVS passes on to netfilter
Jan 7 10:28:28 ulweb2 kernel: IPVS: lookup/in TCP
196.35.75.216:24735->163.200.147.172:80 hit
Jan 7 10:28:28 ulweb2 kernel: IPVS: Incoming TCP
196.35.75.216:24735->163.200.147.172:80
Jan 7 10:28:28 ulweb2 kernel: Enter: ip_vs_nat_xmit, ip_vs_conn.c line 683
Jan 7 10:28:28 ulweb2 kernel: IPVS: NAT to 192.168.1.1:80
Jan 7 10:28:28 ulweb2 kernel: Leave: ip_vs_nat_xmit, ip_vs_conn.c line 818
Jan 7 10:28:28 ulweb2 kernel: Enter: ip_vs_out, ip_vs_core.c line 646
Jan 7 10:28:28 ulweb2 kernel: IPVS: lookup/out TCP
192.168.1.1:80->196.35.75.216:24735 not hit
Jan 7 10:28:28 ulweb2 kernel: IPVS: packet for TCP 196.35.75.216:24735
continue traversal as normal.
Jan 7 10:28:28 ulweb2 kernel: IPVS: lookup/in TCP
196.15.202.161:26421->163.200.147.172:80 hit
Jan 7 10:28:28 ulweb2 kernel: IPVS: Incoming TCP
196.15.202.161:26421->163.200.147.172:80
Jan 7 10:28:28 ulweb2 kernel: Enter: ip_vs_nat_xmit, ip_vs_conn.c line 683
Jan 7 10:28:28 ulweb2 kernel: IPVS: NAT to 192.168.1.1:80
Jan 7 10:28:28 ulweb2 kernel: Leave: ip_vs_nat_xmit, ip_vs_conn.c line 818
Jan 7 10:28:28 ulweb2 kernel: Enter: ip_vs_out, ip_vs_core.c line 646
Jan 7 10:28:28 ulweb2 kernel: IPVS: lookup/out TCP
192.168.1.1:80->196.15.202.161:26421 not hit
Jan 7 10:28:28 ulweb2 kernel: IPVS: packet for TCP 196.15.202.161:26421
continue traversal as normal.
Jan 7 10:28:28 ulweb2 kernel: IPVS: lookup/in TCP
196.36.146.100:1157->163.200.147.19:443 hit
Jan 7 10:28:28 ulweb2 kernel: IPVS: Incoming TCP
196.36.146.100:1157->163.200.147.19:443
Jan 7 10:28:28 ulweb2 kernel: Enter: ip_vs_nat_xmit, ip_vs_conn.c line 683
Jan 7 10:28:29 ulweb2 kernel: IPVS: NAT to 192.168.1.13:443
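To pick the failing replies out of a log this size without reading it by eye, something along these lines may help. It is only a sketch, assuming the "lookup/in" and "lookup/out" line format shown above (including the wrapped address line); the function name find_missed_replies and the three-entry sample are made up for illustration.

```python
import re
from collections import Counter

# Matches e.g. "IPVS: lookup/out TCP 192.168.1.1:80->1.2.3.4:5678 not hit".
# \s+ also spans the line break where the address is wrapped onto the next line.
LOOKUP_RE = re.compile(
    r"IPVS: lookup/(in|out) TCP\s+(\S+)->(\S+)\s+(not hit|hit)"
)


def find_missed_replies(log_text):
    """Return (realserver, client) pairs whose lookup/out was 'not hit'."""
    missed = []
    for direction, src, dst, result in LOOKUP_RE.findall(log_text):
        if direction == "out" and result == "not hit":
            missed.append((src, dst))
    return missed


# Three entries copied from the log above, in its wrapped form.
sample = """\
Jan 7 10:28:28 ulweb2 kernel: IPVS: lookup/in TCP
163.200.156.65:43393->163.200.147.172:80 hit
Jan 7 10:28:28 ulweb2 kernel: IPVS: lookup/out TCP
192.168.1.1:80->163.200.156.65:43393 not hit
Jan 7 10:28:27 ulweb2 kernel: IPVS: lookup/out TCP
192.168.1.1:80->196.31.212.138:22453 hit
"""

missed = find_missed_replies(sample)
per_server = Counter(src for src, _ in missed)
print(missed)
print(per_server)
```

Counting the misses per realserver address is one way to see whether the problem is tied to a particular realserver or spread across all of them.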
Kind regards
Johan van den Berg