Hello Gustavo,
I suspect the clients scheduled for the sorry server never return
to the cluster, right (only if you use persistence, of course)?
That's right.
That's why I first wrote the hprio scheduler (search the list archives).
I'm working on a project for an airline company.
Sometimes they post promotional tickets for a short period of time
(only passengers who buy on the website can get them) and the load on
the servers goes high.
I wrote the server pool implementation for a ticket reseller company
that probably had the same problems as your airline company: normal
selling activity does not need high-end web servers, but from time to
time (in your case promotional tickets, in my case Christina Aguilera,
U2, Robbie Williams or World Soccer Championship tickets) there is a
selling peak where tickets need to be sold in the first 15 minutes,
with tens of thousands of requests per second, plus the illicit
traffic generated by scripters trying to game the event. These peaks,
however, do not justify the acquisition of high-end servers, and
on-demand servers cannot be organized/prepared quickly enough.
I need to manually limit each server's capacity, and the remaining
connections need to go to this sorry server.
That's exactly the purpose of my patch, plus you get to see how many
connections (persistent, as in sessions, and active/passive connections)
are forwarded either to the normal web servers (as long as they are
within u_thresh and l_thresh) or to the overflow (sorry server) pool.
As soon as one of the RS in the serving pool drops below l_thresh,
future connection requests are immediately sent to the serving pool again.
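
To illustrate the idea, here is a minimal sketch of the per-connection
decision such a threshold limitation makes. This is not the actual patch
code; the struct, field and function names (real_server, u_thresh,
l_thresh, schedule, ...) are made up for this example, it just shows the
upper/lower threshold hysteresis described above:

/* Sketch of threshold-based overflow scheduling (illustration only,
 * not the IPVS patch itself). */
#include <stdio.h>

struct real_server {
    const char *name;
    int conns;        /* current connection count             */
    int u_thresh;     /* upper threshold: stop new connections */
    int l_thresh;     /* lower threshold: accept again         */
    int overloaded;   /* set once u_thresh has been crossed    */
};

/* Pick a real server from the serving pool; fall back to the
 * sorry server only if every RS is above its thresholds. */
struct real_server *schedule(struct real_server *pool, int n,
                             struct real_server *sorry)
{
    for (int i = 0; i < n; i++) {
        struct real_server *rs = &pool[i];

        if (rs->overloaded && rs->conns < rs->l_thresh)
            rs->overloaded = 0;          /* recovered: back in the pool */
        else if (rs->conns >= rs->u_thresh)
            rs->overloaded = 1;          /* stop sending new sessions   */

        if (!rs->overloaded)
            return rs;                   /* serving pool handles it     */
    }
    return sorry;                        /* whole pool is saturated     */
}

int main(void)
{
    struct real_server pool[] = {
        { "rs1", 1000, 1000, 800, 0 },
        { "rs2",  950, 1000, 800, 0 },
    };
    struct real_server sorry = { "sorry", 0, 0, 0, 0 };

    struct real_server *rs = schedule(pool, 2, &sorry);
    printf("new connection goes to %s\n", rs->name);
    return 0;
}

The gap between u_thresh and l_thresh is what gives you hysteresis, so a
real server does not flap between the serving pool and the overflow pool
on every connection that comes and goes.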
I personally believe that the sorry-server feature is a big missing
piece of the IPVS framework, one that is implemented in all commercial
HW load balancers.
We have tried F5 Big-IP for a while and it worked perfectly, but it is
very expensive for us :(
Yep, about USD 20k-30k to have them in an HA pair.
So for the 2.4 kernel I have a patch that has been tested extensively
and has been running in production for a year now, having survived some
hype events. I don't know if I will find the time to sit down and do a
2.6 version.
Anyway, as has been suggested, you can also try the sorry server of
keepalived; however, I'm quite sure that this is not atomic (since
keepalived runs in user space) and works more like:
while true {
    for all RS {
        if RS.conns > u_thresh then quiesce RS
        if RS.isQuiesced and RS.conns < l_thresh then {
            if sorry server active then remove sorry server
            set RS.weight to old RS.weight
        }
    }
    if sum_weight of all RS == 0 then invoke sorry server with weight > 0
}
If this is the case, it will not work for our use cases with high
request peaks, since sessions are not switched to either service pool
atomically. This results in people being sent to the overflow pool even
though they would have had a legitimate session, while others get broken
pages back, because in the middle of a page view the LB's user-space
process gets a scheduler call to update its FSM, so further requests
(for HTTP 1.0, for example) will be broken. The browser hangs on your
customer's side and your management gets the angry phone calls from the
business users to whom you had promised B2B access.
This is roughly how I came around to implementing the server overflow
(spillover server, sorry server) functionality for IPVS.
HTH and best regards,
Roberto Nibali, ratz
--
echo '[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq' | dc