My current findings:
The LVS cluster is running in a degraded state because four of the five
real servers are failing, and the failure is strange. When a client
sends a request to the VIP (Virtual IP address), the LVS Director (load
balancer) forwards it to one of the real servers based on the
scheduling algorithm (LC, least-connection).
Legend for the examples
VIP = Virtual IP Address for the LVS cluster
DIR = the LVS Director or Load Balancer
RS = Real Server, running our web service, which listens on port 80
The failing servers all show the following sequence:
ERROR SEQUENCE
Client sends SYN to VIP
DIR forwards SYN to an available RS
RS receives the SYN and responds to Client with SYN-ACK
Client does not receive the SYN-ACK, so it never sends an ACK. It
keeps retransmitting the SYN, trying to establish a connection, until
the attempt times out. THIS IS THE FAILURE POINT.
The one working real server has the following sequence:
SUCCESS SEQUENCE
Client sends SYN to VIP
DIR forwards SYN to an available RS
RS receives the SYN and responds to Client with SYN-ACK
Client receives the SYN-ACK and sends an ACK
Client sends data packet (service request)
RS receives data packet
RS pushes data to the application
RS sends ACK to Client
RS application sends response data packet to Client
RS sends FIN to Client
Client receives response data and sends ACK
Client sends ACK to RS (for the FIN)
Client sends FIN to RS
RS sends ACK to Client (the connection is closed)
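To make the difference between the two sequences concrete, here is a
minimal sketch of the RS-side handshake states. It is only an
illustration, not a model of the kernel's TCP stack: the function name
and the event lists are my own hypothetical names, and only the
LISTEN, SYN_RECV and ESTABLISHED states are modeled.

```python
def rs_state_after(events):
    """Return the RS-side TCP state after the given segments arrive at the RS.

    `events` lists the TCP flags of segments the RS receives from the
    Client, in order. Only the handshake is modeled.
    """
    state = "LISTEN"
    for flag in events:
        if state == "LISTEN" and flag == "SYN":
            # RS answers with a SYN-ACK and waits for the Client's ACK.
            state = "SYN_RECV"
        elif state == "SYN_RECV" and flag == "ACK":
            # The SYN-ACK reached the Client; the handshake is complete.
            state = "ESTABLISHED"
        # A retransmitted SYN while in SYN_RECV leaves the state unchanged.
    return state


# Error sequence: the SYN-ACK never arrives, so the Client only retransmits SYNs.
error_sequence = ["SYN", "SYN", "SYN"]
# Success sequence: the Client receives the SYN-ACK and sends its ACK.
success_sequence = ["SYN", "ACK"]
```

With these inputs the error sequence leaves the RS in SYN_RECV, which
matches what I see on the four broken real servers, while the success
sequence reaches ESTABLISHED.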
IMPORTANT: If I send the same request to a real server's public IP
address (RIP) rather than the VIP, each real server responds correctly.
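The VIP versus RIP comparison can be repeated with a small connect
probe. This is a rough sketch, assuming Python is available on the
client; the helper name can_connect and the addresses in the comment
are hypothetical placeholders, not our real ones.

```python
import socket


def can_connect(host, port=80, timeout=5.0):
    """Return True if a TCP handshake with host:port completes within timeout."""
    try:
        # create_connection returns only after the three-way handshake succeeds.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts (no SYN-ACK received) and refused connections.
        return False


# Example use (placeholder addresses, replace with the real VIP and RIPs):
#   can_connect("192.0.2.10")   # VIP
#   can_connect("192.0.2.21")   # RIP of one real server
```

A False result for the VIP together with True for every RIP would be
consistent with the behaviour described above, but it cannot tell
which real server the Director picked for the failed attempt.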
The working real server was set up the same way as the currently broken
real servers. I have not found why the SYN-ACK that the broken real
servers send to the Client is never received there. Since the Client
doesn't receive the SYN-ACK, it keeps sending SYNs until a timeout ends
the attempt. The connection on the real server is stuck in SYN_RECV
until it times out.
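To watch the stuck connections on a real server, the text of
/proc/net/tcp can be scanned for state code 03, which is SYN_RECV.
Below is a minimal sketch; the function name count_syn_recv is my own,
and it only looks at IPv4 entries, since that is the only file it is
meant to be fed.

```python
SYN_RECV = "03"  # TCP state code for SYN_RECV in /proc/net/tcp


def count_syn_recv(proc_net_tcp_text, local_port=80):
    """Count SYN_RECV entries on local_port in the text of /proc/net/tcp."""
    count = 0
    # The first line is the column header, so skip it.
    for line in proc_net_tcp_text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 4:
            continue
        # fields[1] is "HEXIP:HEXPORT" for the local end, fields[3] is the state.
        local_addr, state = fields[1], fields[3]
        port = int(local_addr.rsplit(":", 1)[1], 16)
        if state == SYN_RECV and port == local_port:
            count += 1
    return count
```

On a real server this would be called with the contents of
/proc/net/tcp, for example read via open("/proc/net/tcp").read(). A
count that stays above zero while clients are timing out matches the
SYN_RECV behaviour described above.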
Any ideas given this scenario?
Bruce
On 2/28/14 3:23 PM, Malcolm Turnbull wrote:
> Bruce,
>
> You definitely only need one, and personally I find the iptables method
> easiest.
> NB. Your apache instance must be configured to respond to the VIP as
> well as the RIP (health checks are on the RIP).
> If you use a local web browser on the real server, does it work when
> you connect to the VIP? i.e.
>
> links x.x.x.x
>
> If so then great, but your routing is probably messed up by the lo:0 adapter.
>
>
>
>
>
>
> On 28 February 2014 20:01, Bruce Rudolph <brudolph@xxxxxxxxxxx> wrote:
>> I followed instructions from two sources
>>
>> 1)
>> http://www.centos.org/docs/5/html/Virtual_Server_Administration/s2-lvs-direct-iptables-VSA.html
>>
>> I updated iptables using the commands on this page.
>>
>> 2)
>> http://ptylr.com/2013/05/01/configuring-lvs-piranha-on-centos-for-direct-routing/
>>
>> This page had information on configuring lo:0 which was
>> the final step that I needed to get LVS-DR to work.
>>
>> The setup this way had been working since last August. It is still
>> working on one of the real servers but not on four other ones. Very odd.
>>
>>
>>
>> On 2/28/14 2:26 PM, Malcolm Turnbull wrote:
>>> snip -- "I have setup
>>> LVS-DR using IPTables."
>>>
>>> Then why are you using a loopback adapter as well?
>>>
>>> You only need to use one method: iptables REDIRECT .... or ...
>>> loopback adapter + arptables settings
>>>
>>> SYN_RECV means the real server is not replying when hit with a packet
>>> that says Hi are you the VIP?
>>>
>>>
>>>
>>> On 28 February 2014 19:21, Bruce Rudolph <brudolph@xxxxxxxxxxx> wrote:
>>>> I have an LVS-DR cluster which has been running for seven months without
>>>> a hitch. Recently, the cluster started to time out on the majority of
>>>> connections. Some connections were passed through to a real server and
>>>> processed. I have tried for a week to figure out what happened. What I
>>>> found was that one real server out of five is connecting and servicing
>>>> the client request. The other four real servers have the HTTP connection
>>>> stuck in the SYN_RECV state until it times out (60 seconds).
>>>>
>>>> In summary, I have seven CentOS 6.4 servers (kernel
>>>> 2.6.32-358.18.1.el6.x86_64). Two servers are configured as load
>>>> balancers (a primary and a backup) and five as real servers. I have
>>>> set up LVS-DR using IPTables. The servers have a public IP bound to a
>>>> NIC device and an internal VLAN bound to a second NIC. The VIP is
>>>> configured on the real servers' local loopback (lo:0) device. The
>>>> /etc/sysconfig/ha/lvs.cf was set up properly and everything was running
>>>> successfully for seven months.
>>>>
>>>> We installed new versions of our software for the web service we are
>>>> running. Nothing network related. All five real servers were updated the
>>>> same way. I am comparing the one working real server with the four that
>>>> are not working. So far I have found nothing.
>>>>
>>>> Any ideas on troubleshooting pointers?
>>>>
>>>> --
>>>> Best Regards,
>>>> Bruce
>>>>
>>>>
>>>> _______________________________________________
>>>> Please read the documentation before posting - it's available at:
>>>> http://www.linuxvirtualserver.org/
>>>>
>>>> LinuxVirtualServer.org mailing list - lvs-users@xxxxxxxxxxxxxxxxxxxxxx
>>>> Send requests to lvs-users-request@xxxxxxxxxxxxxxxxxxxxxx
>>>> or go to http://lists.graemef.net/mailman/listinfo/lvs-users
>>>
>
>