Re: Limit

To: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: Limit
From: Gustavo Mateus <gustavo@xxxxxxxxxxxxxx>
Date: Thu, 23 Nov 2006 08:39:15 -0200
Hello Roberto,

Thanks for your reply. I can see that you know how it is when a tsunami of requests hits your servers, and that I'm not alone with this problem :)

I'll study your solution and try it in a mini lab here. If I run into difficulties I may disturb you again in private.

Thanks again

[]'s

Gustavo Mateus

Roberto Nibali wrote:
Hello Gustavo,

I suspect the clients scheduled for the sorry server never return to the cluster, right (only if you use persistency, of course)?
That's right.

That's why I first wrote the hprio scheduler (search the list archives).

I'm working on a project for an airline company.
Sometimes they post promotional tickets for a short period of time (only passengers who buy on the website can get them) and the load on the servers goes way up.

I wrote the server pool implementation for a ticket reseller company that probably had the same problems as your airline company: normal selling activity does not need high-end web servers, but from time to time (in your case promotional tickets, in my case Christina Aguilera, U2, Robbie Williams or World Soccer Championship tickets) there is a selling peak where tickets need to be sold within the first 15 minutes, at tens of thousands of requests per second, plus the illicit traffic generated by scripters trying to game the event. These peaks, however, do not justify the acquisition of high-end servers, and on-demand servers cannot be organized and prepared that quickly.

I need to manually limit each server capacity and the remaining connections need to go to this sorry server.

That's exactly the purpose of my patch, plus you get to see how many connections (persistent as in session, and active/passive connections) are forwarded to either the normal web servers (so long as they are within their u_thresh and l_thresh limits) or the overflow (sorry server) pool. As soon as one of the RS in the serving pool drops below l_thresh, future connection requests are immediately sent to the serving pool again.
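
To make that decision logic concrete, here is a minimal standalone C sketch of the threshold/hysteresis behaviour described above. It is not the actual IPVS patch code; all names (struct rs, pick_dest, the example thresholds) are invented purely for illustration:

/*
 * Sketch only: a real server that climbs to u_thresh is taken out of
 * the serving pool and only becomes eligible again once its connection
 * count has dropped below l_thresh (hysteresis). If no real server is
 * eligible, the new connection goes to the overflow (sorry) server.
 */
#include <stdio.h>

struct rs {
        const char   *name;
        unsigned int conns;      /* current connections                        */
        unsigned int u_thresh;   /* upper threshold                            */
        unsigned int l_thresh;   /* lower threshold                            */
        int          overflowed; /* hit u_thresh, not yet back below l_thresh  */
};

static struct rs *pick_dest(struct rs *pool, int n, struct rs *sorry)
{
        struct rs *best = NULL;
        int i;

        for (i = 0; i < n; i++) {
                struct rs *r = &pool[i];

                if (!r->overflowed && r->conns >= r->u_thresh)
                        r->overflowed = 1;               /* quiesce            */
                else if (r->overflowed && r->conns < r->l_thresh)
                        r->overflowed = 0;               /* back in the pool   */

                if (r->overflowed)
                        continue;

                if (!best || r->conns < best->conns)     /* least connections  */
                        best = r;
        }
        return best ? best : sorry;                      /* all full -> sorry  */
}

int main(void)
{
        struct rs pool[] = {
                { "rs1",  998, 1000, 800, 0 },
                { "rs2", 1000, 1000, 800, 0 },
        };
        struct rs sorry = { "sorry", 0, 0, 0, 0 };
        int i;

        for (i = 0; i < 5; i++) {
                struct rs *dest = pick_dest(pool, 2, &sorry);
                printf("new connection -> %s\n", dest->name);
                dest->conns++;   /* pretend the connection was accepted */
        }
        return 0;
}

The point of the patch is that this kind of decision is taken inside the kernel for every new connection, which is what makes it atomic compared to the user-space polling loop sketched further below.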

I personally believe that the sorry-server feature is a big missing piece of the IPVS framework, one that is implemented in all commercial HW load balancers.
We have tried F5 Big-IP for a while and it worked perfectly, but it is very expensive for us :(

Yep, about USD 20k-30k to have them in an HA pair.

So for the 2.4 kernel I have a patch that has been tested extensively and has been running in production for a year now, having survived some hype events. I don't know if I will find the time to sit down and do a 2.6 version. Anyway, as has been suggested, you can also try the sorry server of keepalived; however, I'm quite sure that this is not atomic (since keepalived runs in user space) and works more like:

while true {
  for all RS {
    if RS.conns > u_thresh then quiesce RS
    if RS.isQuiesced and RS.conns < l_thresh then {
      if sorry server active then remove sorry server
      set RS.weight to old RS.weight
    }
  }
  if sum_weight of all RS == 0 then invoke sorry server with weight > 0
}

If this is the case, it will not work for our use cases with high peak request rates, since sessions are not switched to one service pool or the other atomically. This results in people being sent to the overflow pool even though they would have had a legitimate session, while others get broken pages back, because in the midst of a page view the LB's user-space process gets a scheduler call to update its FSM, so further requests (with HTTP 1.0, for example) will be broken. The browser hangs on your customer's side and your management gets the angry phone calls from the business users to whom you had promised B2B access.

This is roughly how I came around to implementing the server overflow (spillover server, sorry server) functionality for IPVS.

HTH and best regards,
Roberto Nibali, ratz

