
Re: [lvs-users] LVS-DR and scp

To: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: [lvs-users] LVS-DR and scp
From: Scooter Morris <scooter@xxxxxxxxxxxx>
Date: Wed, 02 Dec 2009 19:25:02 -0800
OK, I've spent a bunch of time looking at this in more detail, and it 
looks like I've got an MTU/ICMP problem.  Here is a tcpdump between a 
client and the cluster taken from the client:

19:09:42.102169 IP client.46508 > cluster.ssh: . 123430:126326(2896) ack 
2318 win 190 <nop,nop,timestamp 1066530312 96022114>
19:09:42.102538 IP cluster > client: ICMP cluster unreachable - need to 
frag (mtu 1500), length 556
19:09:42.302789 IP client.46508 > cluster.ssh: . 91574:93022(1448) ack 
2318 win 190 <nop,nop,timestamp 1066530513 96022114>
19:09:42.303138 IP cluster.ssh > client.46508: . ack 93022 win 479 
<nop,nop,timestamp 96022315 1066530513,nop,nop,sack 1 {94470:97366}>
19:09:42.303158 IP client.46508 > cluster.ssh: P 126326:129222(2896) ack 
2318 win 190 <nop,nop,timestamp 1066530513 96022315>
19:09:42.303533 IP cluster > client: ICMP cluster unreachable - need to 
frag (mtu 1500), length 556
19:09:42.503791 IP client.46508 > cluster.ssh: . 93022:94470(1448) ack 
2318 win 190 <nop,nop,timestamp 1066530714 96022315>
19:09:42.504147 IP cluster.ssh > client.46508: . ack 97366 win 479 
<nop,nop,timestamp 96022516 1066530714>
19:09:42.504168 IP client.46508 > cluster.ssh: . 97366:98814(1448) ack 
2318 win 190 <nop,nop,timestamp 1066530714 96022516>
19:09:42.504176 IP client.46508 > cluster.ssh: . 98814:100262(1448) ack 
2318 win 190 <nop,nop,timestamp 1066530714 96022516>
19:09:42.504528 IP cluster > client: ICMP cluster unreachable - need to 
frag (mtu 1500), length 556
19:09:42.704792 IP client.46508 > cluster.ssh: . 97366:98814(1448) ack 
2318 win 190 <nop,nop,timestamp 1066530915 96022516>
19:09:42.705142 IP cluster.ssh > client.46508: . ack 98814 win 501 
<nop,nop,timestamp 96022717 1066530915>
19:09:42.705162 IP client.46508 > cluster.ssh: . 98814:100262(1448) ack 
2318 win 190 <nop,nop,timestamp 1066530915 96022717>
19:09:42.705171 IP client.46508 > cluster.ssh: . 100262:101710(1448) ack 
2318 win 190 <nop,nop,timestamp 1066530915 96022717>
19:09:42.705528 IP cluster > client: ICMP cluster unreachable - need to 
frag (mtu 1500), length 556
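
A trace like the one above can be summarized mechanically. What follows is
a minimal sketch only: it assumes tcpdump text output in the format shown
above, with records wrapped across lines by the mail client, and the names
join_wrapped, summarize, and SAMPLE are made up for illustration. It
tallies TCP payload sizes and collects the MTU values reported in the ICMP
"need to frag" messages.

```python
import re
from collections import Counter

# Matches the "start:end(length)" field tcpdump prints for TCP data segments.
SEGMENT_RE = re.compile(r"\b\d+:\d+\((\d+)\)")
# Matches the ICMP "need to frag" notice and captures the advertised MTU.
NEED_FRAG_RE = re.compile(r"need to frag \(mtu (\d+)\)")
# A new packet record starts with an HH:MM:SS.micro timestamp.
TIMESTAMP_RE = re.compile(r"^\d{2}:\d{2}:\d{2}\.\d+ ")


def join_wrapped(lines):
    """Rejoin records that were wrapped across several lines."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if TIMESTAMP_RE.match(line) or not records:
            records.append(line)
        else:
            records[-1] += " " + line
    return records


def summarize(text):
    """Return (Counter of TCP payload sizes, list of MTUs from ICMP notices)."""
    sizes = Counter()
    mtus = []
    for record in join_wrapped(text.splitlines()):
        frag = NEED_FRAG_RE.search(record)
        if frag:
            mtus.append(int(frag.group(1)))
            continue
        seg = SEGMENT_RE.search(record)
        if seg:
            sizes[int(seg.group(1))] += 1
    return sizes, mtus


# First three records of the client-side trace, copied as wrapped in the mail.
SAMPLE = """\
19:09:42.102169 IP client.46508 > cluster.ssh: . 123430:126326(2896) ack 
2318 win 190 <nop,nop,timestamp 1066530312 96022114>
19:09:42.102538 IP cluster > client: ICMP cluster unreachable - need to 
frag (mtu 1500), length 556
19:09:42.302789 IP client.46508 > cluster.ssh: . 91574:93022(1448) ack 
2318 win 190 <nop,nop,timestamp 1066530513 96022114>
"""

if __name__ == "__main__":
    sizes, mtus = summarize(SAMPLE)
    print("payload sizes:", dict(sizes))
    print("need-to-frag MTUs:", mtus)
```

Run against a full trace, a tally like this makes it easier to check how
often the ICMP messages appear and which payload sizes accompany them.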

If I do the same on the redirector, I get:
19:14:45.659395 IP cluster > client: ICMP cluster unreachable - need to 
frag (mtu 1500), length 556
19:14:45.864008 IP client.37736 > cluster.ssh: . 98813:100261(1448) ack 
2318 win 190 <nop,nop,timestamp 1066834074 96325684>
19:14:45.864016 IP client.37736 > cluster.ssh: . 98813:100261(1448) ack 
2318 win 190 <nop,nop,timestamp 1066834074 96325684>
19:14:45.864432 IP client.37736 > cluster.ssh: . 101709:104605(2896) ack 
2318 win 190 <nop,nop,timestamp 1066834074 96325889>
19:14:45.864445 IP cluster > client: ICMP cluster unreachable - need to 
frag (mtu 1500), length 556
19:14:46.069007 IP client.37736 > cluster.ssh: . 100261:101709(1448) ack 
2318 win 190 <nop,nop,timestamp 1066834279 96325889>
19:14:46.069015 IP client.37736 > cluster.ssh: . 100261:101709(1448) ack 
2318 win 190 <nop,nop,timestamp 1066834279 96325889>
19:14:46.069381 IP client.37736 > cluster.ssh: . 101709:104605(2896) ack 
2318 win 190 <nop,nop,timestamp 1066834279 96326094>
19:14:46.069394 IP cluster > client: ICMP cluster unreachable - need to 
frag (mtu 1500), length 556

And on the real server, I get:
19:19:16.386707 IP client.37741 > cluster.ssh: . 149494:150942(1448) ack 
2366 win 190 <nop,nop,timestamp 1067104609 96596234>
19:19:16.386719 IP cluster.ssh > client.37741: . ack 150942 win 534 
<nop,nop,timestamp 96596437 1067104609>
19:19:16.588712 IP client.37741 > cluster.ssh: . 150942:152390(1448) ack 
2366 win 190 <nop,nop,timestamp 1067104811 96596437>
19:19:16.588726 IP cluster.ssh > client.37741: . ack 152390 win 534 
<nop,nop,timestamp 96596639 1067104811>
19:19:16.790747 IP client.37741 > cluster.ssh: . 152390:153838(1448) ack 
2366 win 190 <nop,nop,timestamp 1067105013 96596639>
19:19:16.790760 IP cluster.ssh > client.37741: . ack 153838 win 534 
<nop,nop,timestamp 96596841 1067105013>
19:19:16.993095 IP client.37741 > cluster.ssh: . 153838:155286(1448) ack 
2366 win 190 <nop,nop,timestamp 1067105215 96596841>
19:19:16.993107 IP cluster.ssh > client.37741: . ack 155286 win 534 
<nop,nop,timestamp 96597044 1067105215>
19:19:17.194740 IP client.37741 > cluster.ssh: . 155286:156734(1448) ack 
2366 win 190 <nop,nop,timestamp 1067105417 96597044>
19:19:17.194751 IP cluster.ssh > client.37741: . ack 156734 win 534 
<nop,nop,timestamp 96597245 1067105417>
19:19:17.396737 IP client.37741 > cluster.ssh: . 156734:158182(1448) ack 
2366 win 190 <nop,nop,timestamp 1067105619 96597245>
19:19:17.396749 IP cluster.ssh > client.37741: . ack 158182 win 534 
<nop,nop,timestamp 96597447 1067105619>
19:19:17.598733 IP client.37741 > cluster.ssh: . 158182:159630(1448) ack 
2366 win 190 <nop,nop,timestamp 1067105821 96597447>
19:19:17.598748 IP cluster.ssh > client.37741: . ack 159630 win 534 
<nop,nop,timestamp 96597649 1067105821>
19:19:17.800680 IP client.37741 > cluster.ssh: . 159630:161078(1448) ack 
2366 win 190 <nop,nop,timestamp 1067106023 96597649>
19:19:17.800692 IP cluster.ssh > client.37741: . ack 161078 win 534 
<nop,nop,timestamp 96597851 1067106023>
19:19:17.801015 IP client.37741 > cluster.ssh: . 161078:162526(1448) ack 
2366 win 190 <nop,nop,timestamp 1067106023 96597851>
19:19:17.801240 IP client.37741 > cluster.ssh: . 162526:163974(1448) ack 
2366 win 190 <nop,nop,timestamp 1067106023 96597851>
19:19:17.801251 IP cluster.ssh > client.37741: . ack 163974 win 534 
<nop,nop,timestamp 96597852 1067106023>
19:19:18.001682 IP client.37741 > cluster.ssh: . 163974:165422(1448) ack 
2366 win 190 <nop,nop,timestamp 1067106224 96597852>
19:19:18.041167 IP cluster.ssh > client.37741: . ack 165422 win 543 
<nop,nop,timestamp 96598092 1067106224>
19:19:18.246673 IP client.37741 > cluster.ssh: . 165422:166870(1448) ack 
2366 win 190 <nop,nop,timestamp 1067106469 96598092>
19:19:18.286152 IP cluster.ssh > client.37741: . ack 166870 win 543 
<nop,nop,timestamp 96598337 1067106469>

So, it looks to me like something is going wrong with ICMP or path 
MTU discovery between the client and the redirector, but this is using 
LVS-DR, so this shouldn't happen like it does with LVS-TUN, right?  I've 
pored over the HOWTO and done several Google searches, but the solution 
still eludes me.  As another data point, this only happens when 
I scp data to the cluster; when I pull data from the cluster using 
scp, I get great performance.
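
The segment sizes in the traces can be checked against the MTU reported in
the ICMP messages with some simple arithmetic. This is a sketch under
stated assumptions: IPv4 without IP options, and TCP carrying only the
timestamp option (nop, nop, timestamp) seen on the data segments above.
The helper names are made up for illustration.

```python
IPV4_HEADER = 20            # bytes, assuming no IP options
TCP_HEADER = 20             # bytes, base TCP header without options
TCP_TIMESTAMP_OPTION = 12   # bytes: nop + nop + 10-byte timestamp option


def ip_packet_size(payload):
    """Size of the IP packet carrying a TCP payload of the given length."""
    return payload + IPV4_HEADER + TCP_HEADER + TCP_TIMESTAMP_OPTION


def fits(payload, mtu=1500):
    """True if the packet fits in the MTU without fragmentation."""
    return ip_packet_size(payload) <= mtu


if __name__ == "__main__":
    for payload in (1448, 2896):
        print(payload, ip_packet_size(payload), fits(payload))
    # 1448 1500 True
    # 2896 2948 False
```

Under these assumptions a 1448-byte segment exactly fills a 1500-byte
packet, while a 2896-byte segment (two times 1448) would need 2948 bytes
and cannot be forwarded unfragmented over a 1500-byte MTU. One caveat:
tcpdump on a host with segmentation or receive offload enabled can display
segments larger than what actually crosses the wire, so the 2896-byte
entries are not by themselves proof of oversized packets on the network.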

Thanks in advance for any help anyone can offer!

-- scooter

On 12/02/2009 01:26 AM, Graeme Fowler wrote:
> On Tue, 2009-12-01 at 15:02 -0800, Scooter Morris wrote:
>    
>> I have a two-node LVS configuration that I'm using as a front-end
>> to a three-node cluster.  It works really well for ssh, http, https,
>> etc.   But, when a user tries to use scp to copy a large file or
>> directory to the cluster, it commonly stalls and doesn't complete.
>>      
> OK...
>
>    
>> scp directly to any node of the cluster works fine, either from outside
>> the network or from inside it.
>>      
> OK...
>
>    
>> Things only crawl when using scp to go through the redirectors.  Here
>> are the relevant parts of the configuration file:
>>      
> That all looks sane enough.
>
> Question: how long does it take before a stall occurs? If it's around
> about 10 minutes... you should be able to see what I'm getting at there.
>
> We might need you to do some tests while running tcpdump, to see where
> the stall occurs.
>
> Graeme
>
>
> _______________________________________________
> Please read the documentation before posting - it's available at:
> http://www.linuxvirtualserver.org/
>
> LinuxVirtualServer.org mailing list - lvs-users@xxxxxxxxxxxxxxxxxxxxxx
> Send requests to lvs-users-request@xxxxxxxxxxxxxxxxxxxxxx
> or go to http://lists.graemef.net/mailman/listinfo/lvs-users
>    


