We are using CentOS 5.6 and the bundled piranha package (pulse, nanny,
ipvsadm) for load balancing. We would like to stay with the packages that
are bundled in the CentOS distribution.
We are running a two-arm configuration using NAT (masq). We are load
balancing multiple services that also need to communicate to other load
balanced services. We have LDAP servers, application servers, portal
servers, wiki servers, etc., all sitting behind different director clusters,
and each service is able to talk to other services. We are using round
robin scheduling. All of this was working great until we turned on
persistence. We have persistence set to 300 seconds. We are not seeing any
failures. However, we have come across a scenario that I am not sure how to
handle.
In this specific case we have 4 portal servers that make requests to the 4
application servers behind a director set. Let's call the 4 portal servers
p1-p4 and the 4 application servers a1-a4. A user comes into the system via
a reverse proxy which then directs the traffic to the portal cluster and
the portal cluster requests data from the application server cluster.
When the system starts let's assume no users so no LVS connections in any
tables. When the first user accesses our system we see the first portal
server connect to the first application server - say p1 -> a1. This
continues until we end up with each portal server connected to a
respective application server. For reference, we are a 24-hour system,
meaning there is traffic hitting us around the clock.
For ease of description, let's say the mapping ends up looking like this:
p1 -> a1
p2 -> a2
p3 -> a3
p4 -> a4
Since we have our persistence setting at 300 seconds and we have fairly
constant traffic, these associations rarely change since the persistence is
at layer 4 and not at layer 7.
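For concreteness, here is roughly what one of these virtual services looks like when expressed directly as ipvsadm commands. The addresses and ports are made up for illustration (our real setup is generated from the piranha lvs.cf config); I'm using a hypothetical application-cluster VIP of 192.168.10.100:8080 with real servers a1-a4 at 10.0.1.1-10.0.1.4:

```shell
# Application-server VIP: round robin, NAT (masq), 300 s persistence
ipvsadm -A -t 192.168.10.100:8080 -s rr -p 300
ipvsadm -a -t 192.168.10.100:8080 -r 10.0.1.1:8080 -m -w 1
ipvsadm -a -t 192.168.10.100:8080 -r 10.0.1.2:8080 -m -w 1
ipvsadm -a -t 192.168.10.100:8080 -r 10.0.1.3:8080 -m -w 1
ipvsadm -a -t 192.168.10.100:8080 -r 10.0.1.4:8080 -m -w 1

# The per-client persistence templates and active connections
# can be inspected with:
ipvsadm -L -c -n
```

Watching `ipvsadm -L -c -n` is how we see the p1 -> a1 ... p4 -> a4 template entries sticking around.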
Here is where my question comes in....
If we need to patch one of the application servers or if it dies (let's say
a4), it is removed from the application server farm just fine. When that
happens p4 then needs to connect to another application server. The
mapping could then end up looking like this:
p1 -> a1
p2 -> a2
p3 -> a3
p4 -> a1
Now, when a4 comes back online it is successfully added back to the
application server farm. However, we are never seeing it get associated
with a corresponding portal server. Servers p1-p4 stay connected to their
current application server (in this case a1-a3). We have talked about
killing all connections on p4 (for example) and changing the weighting to
force the server to reconnect to a4. This will, however, cause the end
users to have their connections dropped - which is, of course, not
acceptable.
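The weight-change idea we discussed would amount to something like the following ipvsadm commands (same hypothetical VIP and real-server addresses as above, a1 at 10.0.1.1):

```shell
# Set a1's weight to 0 so no NEW persistence templates are created
# for it; existing connections and templates keep being served:
ipvsadm -e -t 192.168.10.100:8080 -r 10.0.1.1:8080 -m -w 0

# Later, restore the weight so a1 takes new traffic again:
ipvsadm -e -t 192.168.10.100:8080 -r 10.0.1.1:8080 -m -w 1
```

The catch is the one described above: with constant traffic, p4 keeps refreshing its template for a1, so the template never expires on its own and we would have to forcibly kill the connections.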
Am I over-thinking this and missing something glaring on how to get a4
re-introduced? If p1-p4 never drop their connections, how would a4 ever
get back "in the loop"?
I have tried setting the persistence setting lower but some of the queries
take a while to execute and the user ends up going to another application
server which kills the session. We have spoken to the developers about
changing the application so persistence is not required but I am reluctant
to levy that requirement on them just because of possible LVS limitations.
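For reference, lowering the timeout is just an edit of the virtual service (again using the hypothetical VIP from the earlier example):

```shell
# Drop the persistence timeout from 300 s to 60 s on the fly.
# With our long-running queries this risks the client being
# rescheduled to a different real server mid-session, which
# kills the application session as described above.
ipvsadm -E -t 192.168.10.100:8080 -s rr -p 60
```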
I hope this makes some sense.
Thanks for any assistance.
Please read the documentation before posting - it's available at:
LinuxVirtualServer.org mailing list - lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Send requests to lvs-users-request@xxxxxxxxxxxxxxxxxxxxxx
or go to http://lists.graemef.net/mailman/listinfo/lvs-users