Re: LVS NAT SYN/ACK Packet Rewriting Problem

To: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: LVS NAT SYN/ACK Packet Rewriting Problem
Cc: Shaun Donovan <sdonovan@xxxxxxxxx>
From: Roberto Nibali <ratz@xxxxxxxxxxxx>
Date: Wed, 15 Dec 2004 23:45:15 +0100
Hello,

I have an LVS director listening on about 20 IPs and forwarding requests for HTTP, HTTPS, SSH, FTP, POP3, etc. to 7 different real servers, Linux and Microsoft alike.

What schedulers do you use? wrr? wlc?

I'm not quite sure if I am correct, but I really think it is load related, because it mostly happens when my websites take quite a knock.

Interesting.

Here is a normal tcp connection, using tcpdump -i any host [client ip] port 80 on the director :

Could you maybe either add -n to tcpdump or sed the output for me please, next time :). I'm lost with names in tcpdump output, too many characters to read it fluently.

[correct tcpdump interpretation]

And here is what happens every now and again when things go wrong:

16:55:35.406321 pc-2178249.unisa.ac.za.48704 > www2.unisa.ac.za.http: S 244216559:244216559(0) win 5840 <mss 1460,sackOK,timestamp 15490025 0,nop,wscale 7> (DF) # Client tries to connect to www2.unisa.ac.za by sending a SYN packet

16:55:35.406340 pc-2178249.unisa.ac.za.48704 > umweb2.cluster.unisa.ac.za.http: S 244216559:244216559(0) win 5840 <mss 1460,sackOK,timestamp 15490025 0,nop,wscale 7> (DF) # IPVS rewrites the packet destination to the real server, umweb2.cluster.unisa.ac.za

16:55:35.406424 umweb2.cluster.unisa.ac.za.http > pc-2178249.unisa.ac.za.48704: S 424874716:424874716(0) ack 244216560 win 65535 <mss 1460,nop,wscale 0,nop,nop,timestamp 0 0,nop,nop,sackOK> (DF) # Real server responds correctly with a SYN accompanied by an ACK for the original SYN

16:55:35.406521 ulweb4.unisa.ac.za.http > pc-2178249.unisa.ac.za.48704: S 424874716:424874716(0) ack 244216560 win 65535 <mss 1460,nop,wscale 0,nop,nop,timestamp 0 0,nop,nop,sackOK> (DF)

Odd! Looks almost like a bucket lookup bug.

# But IPVS rewrites the packet incorrectly, and now the packet seems to come from a different host ???? The VIP it uses here is a valid VIP on the director, but there is no reason why it should use this VIP, and not www2.unisa.ac.za, which was the originally requested VIP in the first place...

Would it be possible for you to capture this trace again, but this time with vs_debug enabled in proc-fs? Could you also send the output of /proc/net/ip_vs_conn and /proc/net/ip_vs for the relevant IPs in question?
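To pull only the relevant entries out of /proc/net/ip_vs_conn, something like the following rough sketch might help. It assumes the usual column layout (Pro FromIP FPrt ToIP TPrt DestIP DPrt State Expires) with addresses and ports printed as hexadecimal in human-readable byte order; please check that against the header line on your kernel. The function names and the sample addresses in the comments are made up for illustration.

```python
import ipaddress


def hex_to_ip(hex_addr):
    """Convert an 8-digit hex address such as 'C0A80164' to dotted-quad form.

    Assumption: the hex digits are in human-readable (big-endian) order.
    """
    return str(ipaddress.IPv4Address(int(hex_addr, 16)))


def parse_ip_vs_conn(text):
    """Parse the text of /proc/net/ip_vs_conn into a list of dicts."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header line and anything that does not look like an entry.
        if len(fields) < 9 or fields[0] not in ("TCP", "UDP"):
            continue
        proto, c_ip, c_port, v_ip, v_port, d_ip, d_port, state, expires = fields[:9]
        entries.append({
            "proto": proto,
            "client": (hex_to_ip(c_ip), int(c_port, 16)),
            "vip": (hex_to_ip(v_ip), int(v_port, 16)),
            "real": (hex_to_ip(d_ip), int(d_port, 16)),
            "state": state,
            "expires": int(expires),
        })
    return entries


def connections_for_client(text, client_ip):
    """Return only the entries whose client address equals client_ip."""
    return [e for e in parse_ip_vs_conn(text) if e["client"][0] == client_ip]


# Example use on the director (the client address is a placeholder):
#   with open("/proc/net/ip_vs_conn") as f:
#       for entry in connections_for_client(f.read(), "192.168.1.100"):
#           print(entry)
```

For a connection that shows the problem, the interesting part is whether the "vip" of the entry is the address the client originally connected to or the one that shows up in the bogus SYN/ACK.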

16:55:35.406731 pc-2178249.unisa.ac.za.48704 > ulweb4.unisa.ac.za.http: R 244216560:244216560(0) win 0 (DF) # Client gets a SYN/ACK from an unknown host, not related to the original request, and therefore sends the RESET back to the sending host

Exactly.

16:55:35.406760 pc-2178249.unisa.ac.za.48704 > umweb2.cluster.unisa.ac.za.http: R 244216560:244216560(0) win 0 (DF) # I'm guessing that iptables nat rewrites the RESET to route back to the original sender.

Hmmm, I don't know whether netfilter is involved in that part. You may be right, though: the lookup of a service template for ulweb4 is definitely going to fail for an established connection, so conntrack should take care of it and send the packet to umweb2. To be honest, however, I'm not sure here.
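If the problem only shows up now and again, it may be easier to let a script find the bad handshakes in a long capture. Here is a minimal sketch, assuming the old-style tcpdump text format shown in the trace above (it should also cope with -n output). The idea: remember every host a client's SYN was sent to (the VIP and, after rewriting, the real server), then report any SYN/ACK going back to that client from a host that is in neither role. The function names are hypothetical, and the sample lines are abbreviated from the trace above.

```python
import re

# "<time> <src>.<port> > <dst>.<port>: <flags> <rest>"
LINE_RE = re.compile(r"^(\S+)\s+(\S+)\s+>\s+(\S+):\s+(\S+)\s+(.*)$")
ACK_RE = re.compile(r"\back\b")  # matches "ack 123", but not "sackOK"


def split_endpoint(endpoint):
    """Split 'host.port' at the last dot into (host, port)."""
    host, _, port = endpoint.rpartition(".")
    return host, port


def find_unexpected_synacks(lines):
    """Return (timestamp, source_host, client_endpoint) for every SYN/ACK
    whose source host was never the target of that client's SYN."""
    syn_targets = {}  # client endpoint -> set of hosts its SYN was sent to
    unexpected = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        timestamp, src, dst, flags, rest = match.groups()
        if "S" not in flags:
            continue  # only the handshake packets matter here
        src_host, _ = split_endpoint(src)
        dst_host, _ = split_endpoint(dst)
        if ACK_RE.search(rest) is None:
            # Plain SYN: record the VIP and the rewritten real server.
            syn_targets.setdefault(src, set()).add(dst_host)
        elif dst in syn_targets and src_host not in syn_targets[dst]:
            # SYN/ACK from a host the client never addressed.
            unexpected.append((timestamp, src_host, dst))
    return unexpected


SAMPLE_TRACE = [
    "16:55:35.406321 pc-2178249.unisa.ac.za.48704 > www2.unisa.ac.za.http: S 244216559:244216559(0) win 5840 (DF)",
    "16:55:35.406340 pc-2178249.unisa.ac.za.48704 > umweb2.cluster.unisa.ac.za.http: S 244216559:244216559(0) win 5840 (DF)",
    "16:55:35.406424 umweb2.cluster.unisa.ac.za.http > pc-2178249.unisa.ac.za.48704: S 424874716:424874716(0) ack 244216560 win 65535 (DF)",
    "16:55:35.406521 ulweb4.unisa.ac.za.http > pc-2178249.unisa.ac.za.48704: S 424874716:424874716(0) ack 244216560 win 65535 (DF)",
    "16:55:35.406731 pc-2178249.unisa.ac.za.48704 > ulweb4.unisa.ac.za.http: R 244216560:244216560(0) win 0 (DF)",
]

if __name__ == "__main__":
    for hit in find_unexpected_synacks(SAMPLE_TRACE):
        print(hit)
```

On the sample lines only the packet from ulweb4 gets reported; the SYN/ACK from umweb2 is accepted because the rewritten SYN was sent there. This only works on a capture taken on the director with -i any, where both sides of the rewriting are visible.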

This is causing hanging and timeouts from the clients (the whole world). Really an urgent issue.

So you have persistency on your services? This of course adds to the hangs.

Does anyone have any advice?

Not yet, but something looks extremely fishy.

Best regards,
Roberto Nibali, ratz
--
echo '[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq' | dc
