Julian,
Thanks for the suggestions. The following shows the results with the
failing servers:
# tcpdump -lennnvvv -i any port http
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked),
capture size 65535 bytes
18:21:12.346348 In 68:05:ca:18:61:c1 ethertype IPv4 (0x0800),
length 80: (tos 0x28, ttl 53, id 52608, offset 0, flags [DF], proto
TCP (6), length 64)
<CIP>.62628 > <VIP>.80: Flags [S], cksum 0x3e62 (correct), seq
4011092518, win 65535, options [mss 1460,nop,wscale 1,nop,nop,TS val
3844971164 ecr 0,sackOK,eol], length 0
18:21:12.346386 Out e4:11:5b:ae:f9:e5 ethertype IPv4 (0x0800),
length 76: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP
(6), length 60)
<VIP>.80 > <CIP>.62628: Flags [S.], cksum 0xf2a9 (correct),
seq 4207299083, ack 4011092519, win 14480, options [mss
1460,sackOK,TS val 82369115 ecr 3844971164,nop,wscale 7], length 0
18:21:13.478479 Out e4:11:5b:ae:f9:e5 ethertype IPv4 (0x0800),
length 76: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP
(6), length 60)
<VIP>.80 > <CIP>.62628: Flags [S.], cksum 0xee3c (correct), seq
4207299083, ack 4011092519, win 14480, options [mss 1460,sackOK,TS
val 82370248 ecr 3844971164,nop,wscale 7], length 0
18:21:13.550009 In 68:05:ca:18:61:c1 ethertype IPv4 (0x0800),
length 80: (tos 0x28, ttl 53, id 21930, offset 0, flags [DF], proto
TCP (6), length 64)
<CIP>.62628 > <VIP>.80: Flags [S], cksum 0x39b5 (correct), seq
4011092518, win 65535, options [mss 1460,nop,wscale 1,nop,nop,TS val
3844972361 ecr 0,sackOK,eol], length 0
18:21:13.550032 Out e4:11:5b:ae:f9:e5 ethertype IPv4 (0x0800),
length 76: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP
(6), length 60)
<VIP>.80 > <CIP>.62628: Flags [S.], cksum 0xedf5 (correct), seq
4207299083, ack 4011092519, win 14480, options [mss 1460,sackOK,TS
val 82370319 ecr 3844971164,nop,wscale 7], length 0
18:21:14.666596 In 68:05:ca:18:61:c1 ethertype IPv4 (0x0800),
length 80: (tos 0x28, ttl 53, id 24982, offset 0, flags [DF], proto
TCP (6), length 64)
<CIP>.62628 > <VIP>.80: Flags [S], cksum 0x356e (correct), seq
4011092518, win 65535, options [mss 1460,nop,wscale 1,nop,nop,TS val
3844973456 ecr 0,sackOK,eol], length 0
18:21:14.666626 Out e4:11:5b:ae:f9:e5 ethertype IPv4 (0x0800),
length 76: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP
(6), length 60)
<VIP>.80 > <CIP>.62628: Flags [S.], cksum 0xe998 (correct), seq
4207299083, ack 4011092519, win 14480, options [mss 1460,sackOK,TS
val 82371436 ecr 3844971164,nop,wscale 7], length 0
18:21:15.478479 Out e4:11:5b:ae:f9:e5 ethertype IPv4 (0x0800),
length 76: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP
(6), length 60)
<VIP>.80 > <CIP>.62628: Flags [S.], cksum 0xe66c (correct), seq
4207299083, ack 4011092519, win 14480, options [mss 1460,sackOK,TS
val 82372248 ecr 3844971164,nop,wscale 7], length 0
18:21:15.758857 In 68:05:ca:18:61:c1 ethertype IPv4 (0x0800),
length 80: (tos 0x28, ttl 53, id 40934, offset 0, flags [DF], proto
TCP (6), length 64)
The pattern above shows the following cycle, which keeps repeating:
  CIP                      VIP
 -----                    -----
  SYN  ---------->
       <----------  SYN-ACK
       <----------  SYN-ACK (1+ seconds later)
  SYN  ---------->
       <----------  SYN-ACK
       <----------  SYN-ACK (1+ seconds later)
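The cycle above can also be checked mechanically against a saved capture. What follows is only a minimal sketch, assuming the tcpdump text has been saved to a string; the function names summarize_handshake and handshake_stalled are made up for illustration and are not part of any LVS or tcpdump tooling. It counts SYN and SYN-ACK segments and reports a stall when both repeat and no other segment (final ACK, data, FIN) ever shows up.

```python
import re

# Matches the "src > dst: Flags [X]" part of a tcpdump line.
FLOW_RE = re.compile(r"(\S+) > (\S+): Flags \[([^\]]*)\]")


def summarize_handshake(capture_text):
    """Count SYN, SYN-ACK and all other segments in tcpdump text."""
    counts = {"syn": 0, "syn_ack": 0, "other": 0}
    for _src, _dst, flags in FLOW_RE.findall(capture_text):
        if flags == "S":
            counts["syn"] += 1
        elif flags == "S.":
            counts["syn_ack"] += 1
        else:
            counts["other"] += 1
    return counts


def handshake_stalled(counts):
    """True when SYN and SYN-ACK both repeat and nothing else is seen."""
    return (
        counts["syn"] > 1
        and counts["syn_ack"] > 1
        and counts["other"] == 0
    )
```

Run against the failing capture, this should report a stall. A capture that contains the client's final ACK (Flags [.]) should not.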
In the same environment, I can successfully send the request directly to
the RIP (real IP) of a failing real server. The tcpdump output follows:
# tcpdump -lennnvvv -i any port http
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked),
capture size 65535 bytes
15:25:35.287886 In 00:23:9c:10:e0:41 ethertype IPv4 (0x0800),
length 80: (tos 0x28, ttl 53, id 20068, offset 0, flags [DF], proto
TCP (6), length 64)
<CIP>.52747 > <RIP>.80: Flags [S], cksum 0x6bde (correct), seq
2178856449, win 65535, options [mss 1460,nop,wscale 1,nop,nop,TS val
3920435549 ecr 0,sackOK,eol], length 0
15:25:35.287937 Out e4:11:5b:ae:f9:e5 ethertype IPv4 (0x0800),
length 76: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP
(6), length 60)
<RIP>.80 > <CIP>.52747: Flags [S.], cksum 0xde4a (correct), seq
242406834, ack 2178856450, win 14480, options [mss 1460,sackOK,TS
val 56852073 ecr 3920435549,nop,wscale 7], length 0
15:25:35.401916 In 00:23:9c:10:e0:41 ethertype IPv4 (0x0800),
length 68: (tos 0x28, ttl 53, id 10796, offset 0, flags [DF], proto
TCP (6), length 52)
<CIP>.52747 > <RIP>.80: Flags [.], cksum 0xc321 (correct), seq
1, ack 1, win 33304, options [nop,nop,TS val 3920435658 ecr
56852073], length 0
15:25:43.297092 In 00:23:9c:10:e0:41 ethertype IPv4 (0x0800),
length 87: (tos 0x28, ttl 53, id 38439, offset 0, flags [DF], proto
TCP (6), length 71)
<CIP>.52747 > <RIP>.80: Flags [P.], cksum 0x9558 (correct), seq
1:20, ack 1, win 33304, options [nop,nop,TS val 3920443505 ecr
56852073], length 19
15:25:43.297119 Out e4:11:5b:ae:f9:e5 ethertype IPv4 (0x0800),
length 68: (tos 0x0, ttl 64, id 43880, offset 0, flags [DF], proto
TCP (6), length 52)
<RIP>.80 > <CIP>.52747: Flags [.], cksum 0x06c5 (correct), seq
1, ack 20, win 114, options [nop,nop,TS val 56860082 ecr
3920443505], length 0
15:25:43.300061 Out e4:11:5b:ae:f9:e5 ethertype IPv4 (0x0800),
length 72: (tos 0x0, ttl 64, id 43881, offset 0, flags [DF], proto
TCP (6), length 56)
<RIP>.80 > <CIP>.52747: Flags [P.], cksum 0xc206 (incorrect ->
0x27df), seq 1:5, ack 20, win 114, options [nop,nop,TS val 56860085
ecr 3920443505], length 4
15:25:43.300077 Out e4:11:5b:ae:f9:e5 ethertype IPv4 (0x0800),
length 68: (tos 0x0, ttl 64, id 43882, offset 0, flags [DF], proto
TCP (6), length 52)
<RIP>.80 > <CIP>.52747: Flags [F.], cksum 0x06bd (correct), seq
5, ack 20, win 114, options [nop,nop,TS val 56860085 ecr
3920443505], length 0
15:25:43.414941 In 00:23:9c:10:e0:41 ethertype IPv4 (0x0800),
length 68: (tos 0x28, ttl 53, id 36982, offset 0, flags [DF], proto
TCP (6), length 52)
<CIP>.52747 > <RIP>.80: Flags [.], cksum 0x84aa (correct), seq
20, ack 5, win 33302, options [nop,nop,TS val 3920443616 ecr
56860085], length 0
15:25:43.414964 In 00:23:9c:10:e0:41 ethertype IPv4 (0x0800),
length 68: (tos 0x28, ttl 53, id 11379, offset 0, flags [DF], proto
TCP (6), length 52)
<CIP>.52747 > <RIP>.80: Flags [.], cksum 0x84a6 (correct), seq
20, ack 6, win 33304, options [nop,nop,TS val 3920443617 ecr
56860085], length 0
15:25:43.414974 In 00:23:9c:10:e0:41 ethertype IPv4 (0x0800),
length 68: (tos 0x28, ttl 53, id 13828, offset 0, flags [DF], proto
TCP (6), length 52)
<CIP>.52747 > <RIP>.80: Flags [F.], cksum 0x84a5 (correct), seq
20, ack 6, win 33304, options [nop,nop,TS val 3920443617 ecr
56860085], length 0
15:25:43.414986 Out e4:11:5b:ae:f9:e5 ethertype IPv4 (0x0800),
length 68: (tos 0x0, ttl 64, id 43883, offset 0, flags [DF], proto
TCP (6), length 52)
<RIP>.80 > <CIP>.52747: Flags [.], cksum 0x05d9 (correct), seq
6, ack 21, win 114, options [nop,nop,TS val 56860200 ecr
3920443617], length 0
This at least proves the NIC is working.
Again, the odd thing is that one real server works when sending to the
VIP. I am still trying to find a difference, but they were all set up the
same way and so far I have found nothing.
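One way to hunt for a difference between the working and the failing real servers is to dump the same "key = value" settings from each and compare them. The sketch below assumes the settings were saved as text in that format, for example from "sysctl -a" on each box. The function names parse_settings and diff_settings are hypothetical, and nothing here says which setting is actually responsible.

```python
def parse_settings(text):
    """Parse 'key = value' lines into a dict, skipping other lines."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def diff_settings(working_text, failing_text):
    """Return {key: (working_value, failing_value)} for differing keys.

    A key missing on one side is reported with None for that side.
    """
    working = parse_settings(working_text)
    failing = parse_settings(failing_text)
    return {
        key: (working.get(key), failing.get(key))
        for key in sorted(set(working) | set(failing))
        if working.get(key) != failing.get(key)
    }
```

An empty result means the two dumps agree on every parsed key, which would point the search elsewhere (routing, switch ports, firewall rules).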
Thanks again for the additional insights.
Regards,
Bruce
On 3/1/14 2:01 PM, Julian Anastasov wrote:
> Hello,
>
> On Sat, 1 Mar 2014, Bruce Rudolph wrote:
>
>> My current findings.
>>
>> The overall LVS cluster is working at a degraded performance because
>> four of the five real servers are failing. The failure is strange. When
>> a client sends a request to the VIP (Virtual IP address) the LVS
>> Director (load balancer) distributes it to one of the real servers based
>> on the scheduling algorithm (LC).
>>
>> Legend for the examples
>>
>> VIP = Virtual IP Address for the LVS cluster
>> DIR = the LVS Director or Load Balancer
>> RS = Real Server - the web service we have running listening on port 80
>>
>>
>> The servers that are failing are doing so because of the following sequence:
>> ERROR SEQUENCE
>>
>> Client sends SYN to VIP
>> DIR forwards SYN to an available RS
>> RS receives the SYN and responds to Client with SYN-ACK
> If there is a response, check on the real server that
> it is correct:
>
> 1. It should contain VIP in saddr in IP header. This is expected
> because director should send the request to real server
> with VIP in daddr. Also, the client should see the same
> server port (vport) in the response.
>
> 2. 'tcpdump -lennn src host VIP' on the real server can show
> to which destination MAC the response is sent
>
> 3. If it is going via director you can notice it with
> tcpdump also on director. I guess, DR setups do not use
> director for responses, otherwise they would use NAT mode
> to avoid the source spoofing checks. I guess all your
> real servers use same default gateway.
>
>> Client does not receive the SYN-ACK so it never sends an ACK. It
>> continues to send a SYN trying to establish a connection until the
>> timeout. THIS IS THE FAILURE POINT.
> Regards
>
> --
> Julian Anastasov <ja@xxxxxx>