
RE: path mtu discovery...

To: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Subject: RE: path mtu discovery...
From: DaP <dap@xxxxxxxxxxxxx>
Date: Thu, 17 May 2001 15:24:48 +0200 (CEST)
On Thu, 17 May 2001, Julian Anastasov wrote:
> >   does it work with kernel 2.4.2 and ipvs-0.2.6 for anyone?  lvm server
> > silently drops 'need to frag' replies for me..
>       lvm==lvs ?
 of course

> In which direction the messages are dropped?
>       tcpdump output? LVS should forward the related ICMP messages
> to the right real server. Can you explain more your setup? MTUs,
> topology, forwarding method, etc?


 real server
      |      eth2: 10.1.1.1/24 (sniff)
      |
      |      MTU 1500
      |
      |      eth0: 10.1.1.121/24 (sniff)
      |      eth0: 217.20.134.240/28
      |        TCP  217.20.134.241:80 rr
      |          -> 10.1.1.1:80                    Masq    1
      |        Chain POSTROUTING (policy ACCEPT)
      |          MASQUERADE  all  --  10.1.1.0/24          0.0.0.0/0
   director
      |      eth1: 217.20.130.15/24
      |
      |      MTU 1500
      |
      |      eth0: 217.20.130.10/24 (sniff)
    router
      |      eth1: 192.168.1.1/16
      |
      |      MTU 1000
      |
      |      eth0: 192.168.3.31/16 (sniff)
    client

 the VIPs are aliased to the director's eth0 interface, because we use a
special failover technology..  but this can't be the cause of the problem.


*** real server ***
14:24:21.201478 192.168.3.31.3438 > 10.1.1.1.443: S
        437781271:437781271(0) win 5840
        <mss 1460,sackOK,timestamp 1837394[|tcp]> (DF) [tos 0x10]
14:24:21.201529 10.1.1.1.443 > 192.168.3.31.3438: S
        432414437:432414437(0) ack 437781272 win 30660
        <mss 1460,sackOK,timestamp 392105[|tcp]> (DF)
14:24:21.202193 192.168.3.31.3438 > 10.1.1.1.443: . ack 1 win 5840
        <nop,nop,timestamp 1837394 392105> (DF) [tos 0x10]
14:24:24.518307 192.168.3.31.3438 > 10.1.1.1.443: P 1:17(16) ack 1 win
        5840 <nop,nop,timestamp 1837726 392105> (DF) [tos 0x10]
14:24:24.518324 10.1.1.1.443 > 192.168.3.31.3438: . ack 17 win 30660
        <nop,nop,timestamp 392436 1837726> (DF)
14:24:24.690675 192.168.3.31.3438 > 10.1.1.1.443: P 17:19(2) ack 1 win
        5840 <nop,nop,timestamp 1837743 392436> (DF) [tos 0x10]
14:24:24.693701 10.1.1.1.443 > 192.168.3.31.3438: P 1:1449(1448) ack 19
        win 31856 <nop,nop,timestamp 392454 1837743> (DF)
^^^ first big packet
14:24:24.693711 10.1.1.1.443 > 192.168.3.31.3438: P 1449:2897(1448) ack 19
        win 31856 <nop,nop,timestamp 392454 1837743> (DF)
14:24:24.892564 192.168.3.31.3438 > 10.1.1.1.443: P 17:19(2) ack 1 win
        5840 <nop,nop,timestamp 1837764 392436> (DF) [tos 0x10]
14:24:24.892588 10.1.1.1.443 > 192.168.3.31.3438: . ack 19 win 31856
        <nop,nop,timestamp 392474 1837764> (DF)
14:24:27.689263 10.1.1.1.443 > 192.168.3.31.3438: P 1:1449(1448) ack 19
        win 31856 <nop,nop,timestamp 392754 1837764> (DF)
^^^ resend

10_1_1_101:/home/dap# ip ro show table cache | grep -A 1 192.168.3.31
192.168.3.31 from 10.1.1.1 via 10.1.1.121 dev eth2 
    cache  mtu 1500 rtt 375ms
--
local 10.1.1.1 from 192.168.3.31 tos lowdelay dev lo  src 10.1.1.1 
    cache <local>  iif eth2


*** director eth0 ***
14:24:21.282107 192.168.3.31.3438 > 10.1.1.1.443: S
        437781271:437781271(0) win 5840
        <mss 1460,sackOK,timestamp 1837394[|tcp]> (DF) [tos 0x10]
14:24:21.282418 217.20.134.241.443 > 192.168.3.31.3438: S
        432414437:432414437(0) ack 437781272 win 30660
        <mss 1460,sackOK,timestamp 392105[|tcp]> (DF)
14:24:21.282876 192.168.3.31.3438 > 10.1.1.1.443: . ack 432414438 win 5840
        <nop,nop,timestamp 1837394 392105> (DF) [tos 0x10]
14:24:24.598962 192.168.3.31.3438 > 10.1.1.1.443: P 0:16(16) ack 1 win
        5840 <nop,nop,timestamp 1837726 392105> (DF) [tos 0x10]
14:24:24.599308 217.20.134.241.443 > 192.168.3.31.3438: . ack 17 win 30660
        <nop,nop,timestamp 392436 1837726> (DF)
14:24:24.771413 192.168.3.31.3438 > 10.1.1.1.443: P 16:18(2) ack 1 win
        5840 <nop,nop,timestamp 1837743 392436> (DF) [tos 0x10]
14:24:24.774786 217.20.134.241.443 > 192.168.3.31.3438: P 1:1449(1448) ack
        19 win 31856 <nop,nop,timestamp 392454 1837743> (DF)
14:24:24.774899 217.20.134.241.443 > 192.168.3.31.3438: P
        1449:2897(1448) ack 19 win 31856 <nop,nop,timestamp 392454 1837743> (DF)
14:24:24.775359 217.20.130.10 > 217.20.134.241: icmp: 192.168.3.31
        unreachable - need to frag (mtu 1024) (DF) [tos 0xc0]
14:24:24.775507 217.20.130.10 > 217.20.134.241: icmp: 192.168.3.31
        unreachable - need to frag (mtu 1024) (DF) [tos 0xc0]
^^^ got the 'need to frag'
14:24:24.973307 192.168.3.31.3438 > 10.1.1.1.443: P 16:18(2) ack 1 win
        5840 <nop,nop,timestamp 1837764 392436> (DF) [tos 0x10]
14:24:24.973497 217.20.134.241.443 > 192.168.3.31.3438: . ack 19 win 31856
        <nop,nop,timestamp 392474 1837764> (DF)
14:24:27.770389 217.20.134.241.443 > 192.168.3.31.3438: P 1:1449(1448) ack
        19 win 31856 <nop,nop,timestamp 392754 1837764> (DF)
^^^ received a big packet again
14:24:27.770970 217.20.130.10 > 217.20.134.241: icmp: 192.168.3.31
        unreachable - need to frag (mtu 1024) (DF) [tos 0xc0]

there is nothing interesting in the routing cache.  the 'need to frag'
messages are not passed on to the real server, while 'dest unreachable'
messages are:
14:24:16.688657 10.1.1.121 > 10.1.1.1: icmp: 195.228.210.26 tcp port 2560
        unreachable (DF) [tos 0xc0]
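
As a rough sanity check of the sizes in these dumps, the sketch below works out why the 1448-byte segments trigger 'need to frag' and how much payload would fit once the reported MTU of 1024 is honoured. It assumes a 20-byte IPv4 header without options and a TCP header carrying only the 12-byte timestamp option (nop,nop,timestamp) seen in the traces; the function names are made up for illustration.

```python
IP_HEADER = 20          # IPv4 header without options (assumption)
TCP_HEADER = 20         # TCP header without options
TCP_TIMESTAMP_OPT = 12  # nop,nop,timestamp as seen in the dumps


def ip_datagram_size(tcp_payload: int) -> int:
    """Total IP datagram size for a segment with the given payload."""
    return IP_HEADER + TCP_HEADER + TCP_TIMESTAMP_OPT + tcp_payload


def max_payload_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits into one unfragmented datagram."""
    return mtu - IP_HEADER - TCP_HEADER - TCP_TIMESTAMP_OPT


# The real server sends 1448-byte segments with DF set.
print(ip_datagram_size(1448))     # 1500, fills the local MTU exactly
# The router reports a next-hop MTU of 1024.
print(max_payload_for_mtu(1024))  # 972
```

If the real server had learned the smaller MTU, its resend at 14:24:27 would be expected to carry at most 972 bytes instead of 1448 again.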


*** router eth0 ***
14:24:21.234780 192.168.3.31.3438 > 217.20.134.241.443: S
        437781271:437781271(0) win 5840
        <mss 1460,sackOK,timestamp 1837394[|tcp]> (DF) [tos 0x10]
14:24:21.235207 217.20.134.241.443 > 192.168.3.31.3438: S
        432414437:432414437(0) ack 437781272 win 30660
        <mss 1460,sackOK,timestamp 392105[|tcp]> (DF)
14:24:21.235559 192.168.3.31.3438 > 217.20.134.241.443: . ack 1 win 5840
        <nop,nop,timestamp 1837394 392105> (DF) [tos 0x10]
14:24:24.551623 192.168.3.31.3438 > 217.20.134.241.443: P 1:17(16) ack 1
        win 5840 <nop,nop,timestamp 1837726 392105> (DF) [tos 0x10]
14:24:24.552074 217.20.134.241.443 > 192.168.3.31.3438: . ack 17 win 30660
        <nop,nop,timestamp 392436 1837726> (DF)
14:24:24.724078 192.168.3.31.3438 > 217.20.134.241.443: P 17:19(2) ack 1
        win 5840 <nop,nop,timestamp 1837743 392436> (DF) [tos 0x10]
14:24:24.727815 217.20.134.241.443 > 192.168.3.31.3438: P 1:1449(1448) ack
        19 win 31856 <nop,nop,timestamp 392454 1837743> (DF)
14:24:24.727957 217.20.134.241.443 > 192.168.3.31.3438: P
        1449:2897(1448) ack 19 win 31856 <nop,nop,timestamp 392454 1837743> (DF)
^^^ these big packets don't pass
14:24:24.925973 192.168.3.31.3438 > 217.20.134.241.443: P 17:19(2) ack 1
        win 5840 <nop,nop,timestamp 1837764 392436> (DF) [tos 0x10]
14:24:24.926274 217.20.134.241.443 > 192.168.3.31.3438: . ack 19 win 31856
        <nop,nop,timestamp 392474 1837764> (DF)
14:24:27.723410 217.20.134.241.443 > 192.168.3.31.3438: P 1:1449(1448) ack
        19 win 31856 <nop,nop,timestamp 392754 1837764> (DF)

 outgoing icmp packets are not shown, because of an error in the tcpdump
filter, but as they appear on the other end, there is no need to repeat
the test.  ;)
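
When checking captured packets for these replies, it can help to test the ICMP header directly rather than rely on the printed summary. The following is only a sketch: it assumes the raw ICMP bytes (after the IP header) have already been extracted by some other means, and it uses the RFC 1191 layout, where the next-hop MTU sits in the last two bytes of the 8-byte ICMP header. The function name and the hand-built sample are hypothetical.

```python
import struct

ICMP_DEST_UNREACH = 3  # ICMP type: destination unreachable
ICMP_FRAG_NEEDED = 4   # ICMP code: fragmentation needed and DF set


def next_hop_mtu(icmp_bytes: bytes):
    """Return the next-hop MTU of a 'need to frag' message, else None."""
    if len(icmp_bytes) < 8:
        return None
    # type, code, checksum, unused, next-hop MTU (all network byte order)
    icmp_type, code, _checksum, _unused, mtu = struct.unpack(
        "!BBHHH", icmp_bytes[:8]
    )
    if icmp_type == ICMP_DEST_UNREACH and code == ICMP_FRAG_NEEDED:
        return mtu
    return None


# Hand-built header for illustration only (checksum left at 0).
sample = struct.pack("!BBHHH", 3, 4, 0, 0, 1024)
print(next_hop_mtu(sample))
```

A port-unreachable message (type 3, code 3), like the 'dest unreachable' one that does get through, makes the function return None.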


*** client ***
14:24:21.270266 192.168.3.31.3438 > 217.20.134.241.443: S
        437781271:437781271(0) win 5840
        <mss 1460,sackOK,timestamp 1837394 0,nop,wscale 0> (DF) [tos 0x10]
14:24:21.271035 217.20.134.241.443 > 192.168.3.31.3438: S
        432414437:432414437(0) ack 437781272 win 30660
        <mss 1460,sackOK,timestamp 392105 1837394,nop,wscale 0> (DF)
14:24:21.271204 192.168.3.31.3438 > 217.20.134.241.443: . ack 1 win 5840
        <nop,nop,timestamp 1837394 392105> (DF) [tos 0x10] 
14:24:24.587285 192.168.3.31.3438 > 217.20.134.241.443: P 1:17(16) ack 1
        win 5840 <nop,nop,timestamp 1837726 392105> (DF) [tos 0x10] 
14:24:24.587889 217.20.134.241.443 > 192.168.3.31.3438: . ack 17 win 30660
        <nop,nop,timestamp 392436 1837726> (DF)
14:24:24.759747 192.168.3.31.3438 > 217.20.134.241.443: P 17:19(2) ack 1
        win 5840 <nop,nop,timestamp 1837743 392436> (DF) [tos 0x10] 
14:24:24.961628 192.168.3.31.3438 > 217.20.134.241.443: P 17:19(2) ack 1
        win 5840 <nop,nop,timestamp 1837764 392436> (DF) [tos 0x10] 
14:24:24.962167 217.20.134.241.443 > 192.168.3.31.3438: . ack 19 win 31856
        <nop,nop,timestamp 392474 1837764> (DF)


--
  DaP



