Roberto Nibali <ratz@xxxxxx> wrote:
> Hello,
>
> In order to have an atomic failover from a real server pool with all
> services quiesced to a spillover pool one needs to instrument the
> kernel. This problem occurs when you have an application that cannot
> deal properly with short but slashdot-like hypes and you limit the
> destination of the services with the per real server thresholds
> available in 2.6.x (and since I've backported and enhanced it, also in
> 2.4.x) kernels.
>
> It works as an RR scheduler and simply selects the server with the highest
> weight that does not have the IP_VS_DEST_F_OVERLOAD flag set. The modus
> operandi from user space is to invoke the service session pool _and_
> also add the overflow or spillover pool, but of course with a lower weight.
>
> lb-lb0-phys:~# ipvsadm-2.4 -L -n; echo; ipvsadm-2.4 -L -n --thresholds
> IP Virtual Server version 1.0.12 (size=4096)
> Prot LocalAddress:Port Scheduler Flags
>   -> RemoteAddress:Port         Forward Weight ActiveConn InActConn
> TCP  112.2.13.20:80 hprio persistent 150
>   -> 112.2.13.13:80             Route   1      0          0
>   -> 112.2.13.23:80             Route   5      0          0
>   -> 112.2.13.22:80             Route   5      0          0
>   -> 112.2.13.21:80             Route   5      0          0
>
> IP Virtual Server version 1.0.12 (size=4096)
> Prot LocalAddress:Port          Uthreshold Lthreshold ActiveConn InActConn
>   -> RemoteAddress:Port
> TCP  112.2.13.20:80 hprio persistent 150
>   -> 112.2.13.13:80             0          0          0          0
>   -> 112.2.13.23:80             50         20         0          0
>   -> 112.2.13.22:80             50         20         0          0
>   -> 112.2.13.21:80             50         20         0          0
>
> As can easily be seen, ~21, ~22 and ~23 are in the service pool
> with an upper threshold of 50 and a weight of 5, while ~13 is
> the only spillover server, with weight 1 and no threshold limitation.
> Once the service bound to ~20 is "quiesced", meaning that all RS are
> quiesced, the hprio scheduler will automatically and atomically
> (initiated from ip_vs_bind_dest) switch to the overflow server pool, aka
> ~13.
>
> The current implementation of my original 2.2.x patch in the 2.6.x
> kernel is in my view unfinished and can hardly be used in production for
> sites with heaps of page views and business logic in application servers.
>
> Threshold limitation patch and user space patch follow.
>
> Please discuss,
Conceptually it seems like a reasonable idea, though if
you have per-real-server thresholds, is it really needed?
In terms of code, it looks good, though traversing the
real servers in ip_vs_hprio_max_weight() and again in
ip_vs_hprio_schedule() seems like it could be merged
into a single loop. But that's not a big deal.
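
To illustrate the single-loop idea, here is a rough standalone sketch in
plain C. It is not taken from the patch: struct dest, DEST_F_OVERLOAD and
hprio_pick() are made-up stand-ins for the kernel structures, and the pool
is a plain array rather than the kernel's destination list. It walks the
pool once, starting just after the previous pick, so servers of equal
weight are still handed out round-robin.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's overload flag. */
#define DEST_F_OVERLOAD 0x1u

/* Hypothetical, trimmed-down per-real-server state. */
struct dest {
    int weight;          /* 0 or less means "never schedule" */
    unsigned int flags;  /* DEST_F_OVERLOAD once the upper threshold is hit */
};

/*
 * Walk the pool exactly once, starting at the slot after the previous
 * pick. A candidate replaces the current best only if its weight is
 * strictly higher, so among servers sharing the highest weight the first
 * one in rotation order wins. Returns the chosen index, or -1 if every
 * server is overloaded or has no weight.
 */
static int hprio_pick(const struct dest *pool, size_t n, size_t *last)
{
    int best = -1;
    int best_weight = 0;

    for (size_t i = 1; i <= n; i++) {
        size_t idx = (*last + i) % n;
        const struct dest *d = &pool[idx];

        if ((d->flags & DEST_F_OVERLOAD) || d->weight <= best_weight)
            continue;
        best = (int)idx;
        best_weight = d->weight;
    }
    if (best >= 0)
        *last = (size_t)best;   /* remember the pick for the next call */
    return best;
}
```

With a pool laid out like the listing above (one weight-1 spillover server
and three weight-5 servers), repeated calls rotate over the weight-5
entries and fall through to the weight-1 entry only once all three carry
the overload flag.
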
I'm not sure if it's really material to go upstream.
If you think it is, then a 2.6 version will be the way to
go, as I believe 2.4 is only accepting bug fixes now.
--
Horms