Re: [lvs-users] LVS-DR Cluster Some Real Servers Stuck in SYN_RECV

To: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: [lvs-users] LVS-DR Cluster Some Real Servers Stuck in SYN_RECV
From: Bruce Rudolph <brudolph@xxxxxxxxxxx>
Date: Sat, 01 Mar 2014 12:38:28 -0500

My current findings.

The overall LVS cluster is running at degraded performance because
four of the five real servers are failing. The failure is strange. When
a client sends a request to the VIP (virtual IP address), the LVS
Director (load balancer) distributes it to one of the real servers based
on the scheduling algorithm (LC, least-connection).

Legend for the examples

    VIP = Virtual IP Address for the LVS cluster
    DIR = the LVS Director or Load Balancer
    RS = Real Server - the web service we have running listening on port 80


The servers that are failing are doing so because of the following sequence:
ERROR SEQUENCE

    Client sends SYN to VIP
    DIR forwards SYN to an available RS
    RS receives the SYN and responds to Client with SYN-ACK
    Client does not receive the SYN-ACK, so it never sends an ACK. It
    keeps retransmitting the SYN until the connection attempt times
    out. THIS IS THE FAILURE POINT.

The one working real server has the following sequence:
SUCCESS SEQUENCE

    Client sends SYN to VIP
    DIR forwards SYN to an available RS
    RS receives the SYN and responds to Client with SYN-ACK
    Client receives the SYN-ACK and sends an ACK
    Client sends data packet (service request)
    RS receives data packet
    RS pushes data to the application
    RS sends ACK to Client
    RS application sends response data packet to Client
    RS sends FIN to Client
    Client receives response data and sends ACK
    Client sends ACK to RS (for the FIN)
    Client sends FIN to RS
    RS sends ACK to Client (the connection is closed)
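
As a rough sketch, the difference between the two sequences above can be modeled as a tiny state machine seen from the real server's side. The function name, the event tuples, and the state labels are illustrative assumptions, not output from any LVS tool; the point is only that a retransmitted SYN never moves the RS out of SYN_RECV, while the Client's ACK does.

```python
# Hypothetical model of the handshake as seen on a real server (RS).
# Each event is (sender, flags); all names here are illustrative only.

def rs_state(events):
    """Return the RS-side TCP state after replaying the events in order."""
    state = "LISTEN"
    for sender, flags in events:
        if state == "LISTEN" and sender == "client" and flags == "SYN":
            state = "SYN_RECV"      # RS replies with SYN-ACK and waits for the ACK
        elif state == "SYN_RECV" and sender == "client" and flags == "ACK":
            state = "ESTABLISHED"   # three-way handshake completed
        # A retransmitted SYN while in SYN_RECV leaves the state unchanged.
    return state

# Error sequence: the SYN-ACK never reaches the Client, so it only resends SYNs.
error_sequence = [("client", "SYN"), ("rs", "SYN-ACK"),
                  ("client", "SYN"), ("client", "SYN")]

# Success sequence: the Client receives the SYN-ACK and answers with an ACK.
success_sequence = [("client", "SYN"), ("rs", "SYN-ACK"), ("client", "ACK")]

print(rs_state(error_sequence))    # SYN_RECV
print(rs_state(success_sequence))  # ESTABLISHED
```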

IMPORTANT: If I send the same request to a real server's public IP
address (RIP) rather than to the VIP, each real server responds correctly.
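
A minimal sketch of how the VIP-versus-RIP comparison could be scripted, assuming plain TCP reachability on port 80 is a good enough stand-in for "responds correctly". The helper name `can_connect` and the default timeout are assumptions for illustration; the addresses to probe are whatever the VIP and RIPs are in the actual cluster.

```python
import socket

def can_connect(host, port=80, timeout=5.0):
    """Return True if a TCP handshake to host:port completes within timeout."""
    try:
        # create_connection returns only after the three-way handshake succeeds.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and unreachable hosts.
        return False

def probe_all(targets, port=80, timeout=5.0):
    """Map each label in targets (label -> address) to a reachability flag."""
    return {label: can_connect(addr, port, timeout)
            for label, addr in targets.items()}
```

Calling `probe_all` with a dictionary of labels and addresses (the VIP plus each RIP) from an outside client gives a quick table of which addresses complete the handshake. A result of False for the VIP and True for every RIP would match the behavior described above.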

The working real server was set up the same way as the currently broken
real servers. I have not found why the broken real servers send a
SYN-ACK addressed to the Client that never arrives at the Client. Since
the Client doesn't receive the SYN-ACK, it keeps sending SYNs until a
timeout ends the attempt. The connection on the real server is stuck in
SYN_RECV until it times out.
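
One way to quantify the stuck connections on each real server is to count socket states in the output of `netstat -tan`. The sketch below assumes the usual column layout of that command (state in the sixth column); the function name and the sample rows are made up for illustration and are not taken from the servers in question.

```python
from collections import Counter

def count_tcp_states(netstat_output):
    """Count TCP socket states in the text printed by `netstat -tan`."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows start with "tcp" or "tcp6"; the state is the sixth column.
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            states[fields[5]] += 1
    return states

# Made-up sample rows for illustration only (documentation address ranges).
sample = """\
Proto Recv-Q Send-Q Local Address     Foreign Address     State
tcp        0      0 192.0.2.10:80     198.51.100.7:40312  SYN_RECV
tcp        0      0 192.0.2.10:80     198.51.100.7:40313  SYN_RECV
tcp        0      0 192.0.2.11:22     198.51.100.9:51514  ESTABLISHED
"""

print(count_tcp_states(sample)["SYN_RECV"])  # 2
```

Running the same count on the working real server and on a failing one would show whether SYN_RECV entries pile up only on the failing servers.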

Any ideas given this scenario?

Bruce

On 2/28/14 3:23 PM, Malcolm Turnbull wrote:
> Bruce,
>
> You definitely only need one, and personally I find the iptables method 
> easiest.
> NB. Your Apache instance must be configured to respond to the VIP as
> well as the RIP (health checks are on the RIP).
> If you use a local web browser on the real server, does it work when
> you connect to the VIP? i.e.
>
> links x.x.x.x
>
> If so, then great, but your routing is probably messed up by the lo:0 adapter.
>
> On 28 February 2014 20:01, Bruce Rudolph <brudolph@xxxxxxxxxxx> wrote:
>> I followed instructions from two sources
>>
>>        1)
>> http://www.centos.org/docs/5/html/Virtual_Server_Administration/s2-lvs-direct-iptables-VSA.html
>>
>>                  I updated iptables using the commands on this page.
>>
>>        2)
>> http://ptylr.com/2013/05/01/configuring-lvs-piranha-on-centos-for-direct-routing/
>>
>>                  This page had information on configuring lo:0 which was
>> the final step that I needed to get LVS-DR to work.
>>
>> The setup this way had been working since last August. It is still
>> working on one of the real servers but not on four other ones. Very odd.
>>
>>
>>
>> On 2/28/14 2:26 PM, Malcolm Turnbull wrote:
>>> snip --  "I have set up
>>> LVS-DR using IPTables."
>>>
>>> Then why are you using a loopback adapter as well?
>>>
>>> You only need to use one method: iptables REDIRECT ... or ...
>>> loopback adapter + arptables settings
>>>
>>> SYN_RECV means the real server is not replying when hit with a packet
>>> that says "Hi, are you the VIP?"
>>>
>>>
>>>
>>> On 28 February 2014 19:21, Bruce Rudolph <brudolph@xxxxxxxxxxx> wrote:
>>>> I have an LVS-DR cluster which has been running for seven months without
>>>> a hitch. Recently, the cluster started to time out on the majority of
>>>> connections. Some connections were passed through to a real server and
>>>> processed. I have tried for a week to figure out what happened. What I
>>>> found was that one real server out of five is connecting and servicing
>>>> the client request. The other four real servers have the HTTP connection
>>>> stuck in the SYN_RECV state until it times out (60 seconds).
>>>>
>>>> In summary, I have seven CentOS 6.4 servers (kernel
>>>> 2.6.32-358.18.1.el6.x86_64). Two servers are configured as load
>>>> balancers (a primary and a backup) and five as real servers. I have
>>>> set up LVS-DR using IPTables. The servers have a public IP bound to a
>>>> NIC device and an internal VLAN bound to a second NIC. The VIP is
>>>> configured on the real servers' local loopback (lo:0) device. The
>>>> /etc/sysconfig/ha/lvs.cf was set up properly and everything was
>>>> running successfully for seven months.
>>>>
>>>> We installed new versions of our software for the web service we are
>>>> running. Nothing network related. All five real servers were updated the
>>>> same way. I am comparing the one working real server against the four
>>>> that are not working. So far I have found nothing.
>>>>
>>>> Any ideas on troubleshooting points?
>>>>
>>>> --
>>>> Best Regards,
>>>> Bruce
>>>>
>>>>
>>>> _______________________________________________
>>>> Please read the documentation before posting - it's available at:
>>>> http://www.linuxvirtualserver.org/
>>>>
>>>> LinuxVirtualServer.org mailing list - lvs-users@xxxxxxxxxxxxxxxxxxxxxx
>>>> Send requests to lvs-users-request@xxxxxxxxxxxxxxxxxxxxxx
>>>> or go to http://lists.graemef.net/mailman/listinfo/lvs-users
>>>
>
>
