Re: [lvs-users] LVS-DR Cluster Some Real Servers Stuck in SYN_RECV

To: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: [lvs-users] LVS-DR Cluster Some Real Servers Stuck in SYN_RECV
From: Bruce Rudolph <brudolph@xxxxxxxxxxx>
Date: Mon, 03 Mar 2014 09:32:06 -0500
Julian,

Thanks for the suggestions. The following shows the results with the 
failing servers:

    # tcpdump -lennnvvv -i any port http
    tcpdump: listening on any, link-type LINUX_SLL (Linux cooked),
    capture size 65535 bytes
    18:21:12.346348  In 68:05:ca:18:61:c1 ethertype IPv4 (0x0800),
    length 80: (tos 0x28, ttl 53, id 52608, offset 0, flags [DF], proto
    TCP (6), length 64)
         <CIP>.62628 > <VIP>.80: Flags [S], cksum 0x3e62 (correct), seq
    4011092518, win 65535, options [mss 1460,nop,wscale 1,nop,nop,TS val
    3844971164 ecr 0,sackOK,eol], length 0
    18:21:12.346386 Out e4:11:5b:ae:f9:e5 ethertype IPv4 (0x0800),
    length 76: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP
    (6), length 60)
         <VIP>.80 > <CIP>.62628: Flags [S.], cksum 0xf2a9 (correct),
    seq 4207299083, ack 4011092519, win 14480, options [mss
    1460,sackOK,TS val 82369115 ecr 3844971164,nop,wscale 7], length 0
    18:21:13.478479 Out e4:11:5b:ae:f9:e5 ethertype IPv4 (0x0800),
    length 76: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP
    (6), length 60)
         <VIP>.80 > <CIP>.62628: Flags [S.], cksum 0xee3c (correct), seq
    4207299083, ack 4011092519, win 14480, options [mss 1460,sackOK,TS
    val 82370248 ecr 3844971164,nop,wscale 7], length 0
    18:21:13.550009  In 68:05:ca:18:61:c1 ethertype IPv4 (0x0800),
    length 80: (tos 0x28, ttl 53, id 21930, offset 0, flags [DF], proto
    TCP (6), length 64)
         <CIP>.62628 > <VIP>.80: Flags [S], cksum 0x39b5 (correct), seq
    4011092518, win 65535, options [mss 1460,nop,wscale 1,nop,nop,TS val
    3844972361 ecr 0,sackOK,eol], length 0
    18:21:13.550032 Out e4:11:5b:ae:f9:e5 ethertype IPv4 (0x0800),
    length 76: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP
    (6), length 60)
         <VIP>.80 > <CIP>.62628: Flags [S.], cksum 0xedf5 (correct), seq
    4207299083, ack 4011092519, win 14480, options [mss 1460,sackOK,TS
    val 82370319 ecr 3844971164,nop,wscale 7], length 0
    18:21:14.666596  In 68:05:ca:18:61:c1 ethertype IPv4 (0x0800),
    length 80: (tos 0x28, ttl 53, id 24982, offset 0, flags [DF], proto
    TCP (6), length 64)
         <CIP>.62628 > <VIP>.80: Flags [S], cksum 0x356e (correct), seq
    4011092518, win 65535, options [mss 1460,nop,wscale 1,nop,nop,TS val
    3844973456 ecr 0,sackOK,eol], length 0
    18:21:14.666626 Out e4:11:5b:ae:f9:e5 ethertype IPv4 (0x0800),
    length 76: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP
    (6), length 60)
         <VIP>.80 > <CIP>.62628: Flags [S.], cksum 0xe998 (correct), seq
    4207299083, ack 4011092519, win 14480, options [mss 1460,sackOK,TS
    val 82371436 ecr 3844971164,nop,wscale 7], length 0
    18:21:15.478479 Out e4:11:5b:ae:f9:e5 ethertype IPv4 (0x0800),
    length 76: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP
    (6), length 60)
         <VIP>.80 > <CIP>.62628: Flags [S.], cksum 0xe66c (correct), seq
    4207299083, ack 4011092519, win 14480, options [mss 1460,sackOK,TS
    val 82372248 ecr 3844971164,nop,wscale 7], length 0
    18:21:15.758857  In 68:05:ca:18:61:c1 ethertype IPv4 (0x0800),
    length 80: (tos 0x28, ttl 53, id 40934, offset 0, flags [DF], proto
    TCP (6), length 64)


The capture above shows the following cycle repeating:
             CIP                    VIP
             -----                   -----
             SYN   ---------->
                      <---------   SYN-ACK
                      <---------   SYN-ACK   (1+ seconds later)
             SYN   ---------->
                      <---------   SYN-ACK
                      <---------   SYN-ACK   (1+ seconds later)
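
The same check can be automated against saved tcpdump output. What follows
is a minimal Python sketch, not a definitive tool: the function names are
hypothetical, and it assumes the capture holds a single connection printed
in the "Flags [...]" format shown above. It reports a stalled handshake when
a SYN and a SYN-ACK are present but no later segment carries ACK without SYN
or RST.

```python
import re

# Matches the "Flags [...]" field that tcpdump prints for each TCP segment.
FLAGS_RE = re.compile(r"Flags \[([^\]]+)\]")


def flag_sequence(capture_text):
    """Return the TCP flag strings in the order they appear in the capture."""
    return FLAGS_RE.findall(capture_text)


def handshake_stalled(flags):
    """Return True if SYN and SYN-ACK were seen but the handshake never completed.

    A segment counts as completing the handshake when it carries ACK (".")
    without SYN ("S") or RST ("R"), e.g. ".", "P." or "F.".
    """
    saw_syn = "S" in flags
    saw_synack = "S." in flags
    completed = any(
        "." in f and "S" not in f and "R" not in f for f in flags
    )
    return saw_syn and saw_synack and not completed


if __name__ == "__main__":
    # Abbreviated lines in the style of the capture above (assumed sample).
    sample = """
    <CIP>.62628 > <VIP>.80: Flags [S], cksum 0x3e62 (correct), seq 4011092518
    <VIP>.80 > <CIP>.62628: Flags [S.], cksum 0xf2a9 (correct), seq 4207299083
    <VIP>.80 > <CIP>.62628: Flags [S.], cksum 0xee3c (correct), seq 4207299083
    <CIP>.62628 > <VIP>.80: Flags [S], cksum 0x39b5 (correct), seq 4011092518
    """
    print(handshake_stalled(flag_sequence(sample)))
```

A capture with several interleaved connections would need to be grouped by
address and port pair first; that step is left out here.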


In the same environment, a request sent directly to the RIP of one of the 
failing real servers succeeds. The tcpdump output follows:

    # tcpdump -lennnvvv -i any port http
    tcpdump: listening on any, link-type LINUX_SLL (Linux cooked),
    capture size 65535 bytes
    15:25:35.287886  In 00:23:9c:10:e0:41 ethertype IPv4 (0x0800),
    length 80: (tos 0x28, ttl 53, id 20068, offset 0, flags [DF], proto
    TCP (6), length 64)
         <CIP>.52747 > <RIP>.80: Flags [S], cksum 0x6bde (correct), seq
    2178856449, win 65535, options [mss 1460,nop,wscale 1,nop,nop,TS val
    3920435549 ecr 0,sackOK,eol], length 0
    15:25:35.287937 Out e4:11:5b:ae:f9:e5 ethertype IPv4 (0x0800),
    length 76: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP
    (6), length 60)
         <RIP>.80 > <CIP>.52747: Flags [S.], cksum 0xde4a (correct), seq
    242406834, ack 2178856450, win 14480, options [mss 1460,sackOK,TS
    val 56852073 ecr 3920435549,nop,wscale 7], length 0
    15:25:35.401916  In 00:23:9c:10:e0:41 ethertype IPv4 (0x0800),
    length 68: (tos 0x28, ttl 53, id 10796, offset 0, flags [DF], proto
    TCP (6), length 52)
         <CIP>.52747 > <RIP>.80: Flags [.], cksum 0xc321 (correct), seq
    1, ack 1, win 33304, options [nop,nop,TS val 3920435658 ecr
    56852073], length 0
    15:25:43.297092  In 00:23:9c:10:e0:41 ethertype IPv4 (0x0800),
    length 87: (tos 0x28, ttl 53, id 38439, offset 0, flags [DF], proto
    TCP (6), length 71)
         <CIP>.52747 > <RIP>.80: Flags [P.], cksum 0x9558 (correct), seq
    1:20, ack 1, win 33304, options [nop,nop,TS val 3920443505 ecr
    56852073], length 19
    15:25:43.297119 Out e4:11:5b:ae:f9:e5 ethertype IPv4 (0x0800),
    length 68: (tos 0x0, ttl 64, id 43880, offset 0, flags [DF], proto
    TCP (6), length 52)
         <RIP>.80 > <CIP>.52747: Flags [.], cksum 0x06c5 (correct), seq
    1, ack 20, win 114, options [nop,nop,TS val 56860082 ecr
    3920443505], length 0
    15:25:43.300061 Out e4:11:5b:ae:f9:e5 ethertype IPv4 (0x0800),
    length 72: (tos 0x0, ttl 64, id 43881, offset 0, flags [DF], proto
    TCP (6), length 56)
         <RIP>.80 > <CIP>.52747: Flags [P.], cksum 0xc206 (incorrect ->
    0x27df), seq 1:5, ack 20, win 114, options [nop,nop,TS val 56860085
    ecr 3920443505], length 4
    15:25:43.300077 Out e4:11:5b:ae:f9:e5 ethertype IPv4 (0x0800),
    length 68: (tos 0x0, ttl 64, id 43882, offset 0, flags [DF], proto
    TCP (6), length 52)
         <RIP>.80 > <CIP>.52747: Flags [F.], cksum 0x06bd (correct), seq
    5, ack 20, win 114, options [nop,nop,TS val 56860085 ecr
    3920443505], length 0
    15:25:43.414941  In 00:23:9c:10:e0:41 ethertype IPv4 (0x0800),
    length 68: (tos 0x28, ttl 53, id 36982, offset 0, flags [DF], proto
    TCP (6), length 52)
         <CIP>.52747 > <RIP>.80: Flags [.], cksum 0x84aa (correct), seq
    20, ack 5, win 33302, options [nop,nop,TS val 3920443616 ecr
    56860085], length 0
    15:25:43.414964  In 00:23:9c:10:e0:41 ethertype IPv4 (0x0800),
    length 68: (tos 0x28, ttl 53, id 11379, offset 0, flags [DF], proto
    TCP (6), length 52)
         <CIP>.52747 > <RIP>.80: Flags [.], cksum 0x84a6 (correct), seq
    20, ack 6, win 33304, options [nop,nop,TS val 3920443617 ecr
    56860085], length 0
    15:25:43.414974  In 00:23:9c:10:e0:41 ethertype IPv4 (0x0800),
    length 68: (tos 0x28, ttl 53, id 13828, offset 0, flags [DF], proto
    TCP (6), length 52)
         <CIP>.52747 > <RIP>.80: Flags [F.], cksum 0x84a5 (correct), seq
    20, ack 6, win 33304, options [nop,nop,TS val 3920443617 ecr
    56860085], length 0
    15:25:43.414986 Out e4:11:5b:ae:f9:e5 ethertype IPv4 (0x0800),
    length 68: (tos 0x0, ttl 64, id 43883, offset 0, flags [DF], proto
    TCP (6), length 52)
         <RIP>.80 > <CIP>.52747: Flags [.], cksum 0x05d9 (correct), seq
    6, ack 21, win 114, options [nop,nop,TS val 56860200 ecr
    3920443617], length 0

This at least shows that the NIC is working.
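
The two captures can also be compared on link-layer addresses. With
"-i any", tcpdump uses the Linux cooked header, which carries only one
link-layer address per packet (the sender's), so the destination MAC of the
outgoing SYN-ACK does not appear in the output above. Capturing on a specific
interface instead (for example "-i eth0", an assumed interface name) would
show both addresses. A minimal Python sketch, with a hypothetical function
name, that collects the addresses seen in each direction of a cooked capture:

```python
import re

# In a cooked (-i any) capture each packet starts with a timestamp, then
# "In" or "Out", then a single link-layer address.
PACKET_RE = re.compile(
    r"^\s*\d{2}:\d{2}:\d{2}\.\d+\s+(In|Out)\s+((?:[0-9a-f]{2}:){5}[0-9a-f]{2})",
    re.MULTILINE,
)


def macs_by_direction(capture_text):
    """Map each direction ("In"/"Out") to the set of MAC addresses seen."""
    result = {}
    for direction, mac in PACKET_RE.findall(capture_text):
        result.setdefault(direction, set()).add(mac)
    return result


if __name__ == "__main__":
    # Two packet header lines taken from the VIP capture above.
    sample = (
        "    18:21:12.346348  In 68:05:ca:18:61:c1 ethertype IPv4 (0x0800),\n"
        "    18:21:12.346386 Out e4:11:5b:ae:f9:e5 ethertype IPv4 (0x0800),\n"
    )
    for direction, macs in sorted(macs_by_direction(sample).items()):
        print(direction, sorted(macs))
```

In the captures above, inbound packets for the VIP arrive from
68:05:ca:18:61:c1, while inbound packets for the RIP arrive from
00:23:9c:10:e0:41.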

Again, the odd thing is that one real server works when requests are sent 
to the VIP. I am still trying to find a difference, but these were all set 
up the same way, and so far I have found nothing.

Thanks again for the additional insights.

Regards,
Bruce

On 3/1/14 2:01 PM, Julian Anastasov wrote:
>       Hello,
>
> On Sat, 1 Mar 2014, Bruce Rudolph wrote:
>
>> My current findings.
>>
>> The overall LVS cluster is working at a degraded performance because
>> four of the five real servers are failing. The failure is strange. When
>> a client sends a request to the VIP (Virtual IP address) the LVS
>> Director (load balancer) distributes it to one of the real servers based
>> on the scheduling algorithm (LC).
>>
>> Legend for the examples
>>
>>      VIP = Virtual IP Address for the LVS cluster
>>      DIR = the LVS Director or Load Balancer
>>      RS = Real Server - the web service we have running listening on port 80
>>
>>
>> The servers that are failing are doing so because of the following sequence:
>> ERROR SEQUENCE
>>
>>      Client sends SYN to VIP
>>      DIR forwards SYN to an available RS
>>      RS receives the SYN and responds to Client with SYN-ACK
>       If there is a response, check on the real server that
> it is correct:
>
> 1. It should contain VIP in saddr in IP header. This is expected
> because director should send the request to real server
> with VIP in daddr. Also, the client should see the same
> server port (vport) in the response.
>
> 2. 'tcpdump -lennn src host VIP' on the real server can show
> the destination MAC to which the response is sent
>
> 3. If it is going via director you can notice it with
> tcpdump also on director. I guess, DR setups do not use
> director for responses, otherwise they would use NAT mode
> to avoid the source spoofing checks. I guess all your
> real servers use same default gateway.
>
>>      Client does not receive the SYN-ACK so it never sends an ACK. It
>>      continues to send a SYN trying to establish a connection until the
>>      timeout. THIS IS THE FAILURE POINT.
> Regards
>
> --
> Julian Anastasov <ja@xxxxxx>
