LVS
lvs-devel
Google
 
Web LinuxVirtualServer.org

Re: [PATCH net-next] net: ipvs: random start for RR scheduler

To: menglong8.dong@xxxxxxxxx
Subject: Re: [PATCH net-next] net: ipvs: random start for RR scheduler
Cc: Simon Horman <horms@xxxxxxxxxxxx>, pablo@xxxxxxxxxxxxx, netdev@xxxxxxxxxxxxxxx, lvs-devel@xxxxxxxxxxxxxxx, linux-kernel <linux-kernel@xxxxxxxxxxxxxxx>, netfilter-devel@xxxxxxxxxxxxxxx, Menglong Dong <imagedong@xxxxxxxxxxx>
From: Julian Anastasov <ja@xxxxxx>
Date: Mon, 9 May 2022 21:17:25 +0300 (EEST)
        Hello,

On Mon, 9 May 2022, menglong8.dong@xxxxxxxxx wrote:

> From: Menglong Dong <imagedong@xxxxxxxxxxx>
> 
> For now, the start of the RR scheduler is in the order of dest
> service added, it will result in imbalance if the load balance

        ...order of added destinations,...

> is done in client side and long connect is used.

        ..."long connections are used". Is this a case where
small number of connections are used? And the two connections
relatively overload the real servers?

> For example, we have client1, client2, ..., client5 and real service
> service1, service2, service3. All clients have the same ipvs config,
> and each of them will create 2 long TCP connect to the virtual
> service. Therefore, all the clients will connect to service1 and
> service2, leaving service3 free.

        You mean, there are many IPVS directors with same
config and each director gets 2 connections? Third connection
will get real server #3, right ? Also, are the clients local
to the director?

> Fix this by randomize the start of dest service to RR scheduler when

        ..."randomizing the starting destination when"

> IP_VS_SVC_F_SCHED_RR_RANDOM is set.
> 
> Signed-off-by: Menglong Dong <imagedong@xxxxxxxxxxx>
> ---
>  include/uapi/linux/ip_vs.h    |  2 ++
>  net/netfilter/ipvs/ip_vs_rr.c | 25 ++++++++++++++++++++++++-
>  2 files changed, 26 insertions(+), 1 deletion(-)
> 
> diff --git a/include/uapi/linux/ip_vs.h b/include/uapi/linux/ip_vs.h
> index 4102ddcb4e14..7f74bafd3211 100644
> --- a/include/uapi/linux/ip_vs.h
> +++ b/include/uapi/linux/ip_vs.h
> @@ -28,6 +28,8 @@
>  #define IP_VS_SVC_F_SCHED_SH_FALLBACK        IP_VS_SVC_F_SCHED1 /* SH 
> fallback */
>  #define IP_VS_SVC_F_SCHED_SH_PORT    IP_VS_SVC_F_SCHED2 /* SH use port */
>  
> +#define IP_VS_SVC_F_SCHED_RR_RANDOM  IP_VS_SVC_F_SCHED1 /* random start */
> +
>  /*
>   *      Destination Server Flags
>   */
> diff --git a/net/netfilter/ipvs/ip_vs_rr.c b/net/netfilter/ipvs/ip_vs_rr.c
> index 38495c6f6c7c..e309d97bdd08 100644
> --- a/net/netfilter/ipvs/ip_vs_rr.c
> +++ b/net/netfilter/ipvs/ip_vs_rr.c
> @@ -22,13 +22,36 @@
>  
>  #include <net/ip_vs.h>
>  
> +static void ip_vs_rr_random_start(struct ip_vs_service *svc)
> +{
> +     struct list_head *cur;
> +     u32 start;
> +
> +     if (!(svc->flags | IP_VS_SVC_F_SCHED_RR_RANDOM) ||

        | -> &

> +         svc->num_dests <= 1)
> +             return;
> +
> +     spin_lock_bh(&svc->sched_lock);
> +     start = get_random_u32() % svc->num_dests;

        May be prandom is more appropriate for non-crypto purposes. 
Also, not sure if it is a good idea to limit the number of steps,
eg. to 128...

        start = prandom_u32_max(min(svc->num_dests, 128U));

        or just use

        start = prandom_u32_max(svc->num_dests);

        Also, this line can be before the spin_lock_bh.

> +     cur = &svc->destinations;

        cur = svc->sched_data;

        ... and to start from current svc->sched_data because
we are called for every added dest. Better to jump 0..127 steps
ahead, to avoid delay with long lists?

> +     while (start--)
> +             cur = cur->next;
> +     svc->sched_data = cur;
> +     spin_unlock_bh(&svc->sched_lock);
> +}
>  
>  static int ip_vs_rr_init_svc(struct ip_vs_service *svc)
>  {
>       svc->sched_data = &svc->destinations;
> +     ip_vs_rr_random_start(svc);
>       return 0;
>  }
>  
> +static int ip_vs_rr_add_dest(struct ip_vs_service *svc, struct ip_vs_dest 
> *dest)
> +{
> +     ip_vs_rr_random_start(svc);
> +     return 0;
> +}
>  
>  static int ip_vs_rr_del_dest(struct ip_vs_service *svc, struct ip_vs_dest 
> *dest)
>  {
> @@ -104,7 +127,7 @@ static struct ip_vs_scheduler ip_vs_rr_scheduler = {
>       .module =               THIS_MODULE,
>       .n_list =               LIST_HEAD_INIT(ip_vs_rr_scheduler.n_list),
>       .init_service =         ip_vs_rr_init_svc,
> -     .add_dest =             NULL,
> +     .add_dest =             ip_vs_rr_add_dest,
>       .del_dest =             ip_vs_rr_del_dest,
>       .schedule =             ip_vs_rr_schedule,
>  };
> -- 
> 2.36.0

Regards

--
Julian Anastasov <ja@xxxxxx>


<Prev in Thread] Current Thread [Next in Thread>