To: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: [PATCH 2.4] add per real server threshold limitation against ipvsadm-1.21-11
From: Roberto Nibali <ratz@xxxxxx>
Date: Tue, 08 Nov 2005 10:21:22 +0100
Hello,

New patches are on their way. I have stress-tested this so thoroughly
that we feel confident using it in production. Some small fixes
regarding the available-destination counting have been addressed.

>  #define IP_VS_DEST_F_AVAILABLE        0x0001    /* Available tag */
> +#define IP_VS_DEST_F_OVERLOAD         0x0002    /* server is overloaded */
> +#define IP_VS_DEST_F_OVERFLOW         0x0004    /* RS is overflow server */
> +#define IP_VS_DEST_F_PERSISTENT       0x0008    /* RS is persistent */

This will be implemented shortly (IP_VS_DEST_F_PERSISTENT, that is).

> +
> +     /* Update connection counters */
> +     if (!(cp->flags & IP_VS_CONN_F_TEMPLATE)) {
> +             /* It is a normal connection, so increase the inactive
> +                connection counter because it is in TCP SYNRECV
> +                state (inactive) or other protocol inactive state */
> +             atomic_inc(&dest->inactconns);
> +     } else {
> +             /* It is a persistent connection/template, so increase
> +                the persistent connection counter */
> +             atomic_inc(&dest->persistconns);
> +     }
> +
> +     IP_VS_DBG(3, "Bind-dest: Threshold handling: avail_dests=%d\n",
> +                     atomic_read(&dest->svc->avail_dests));
> +     if (dest->u_threshold != 0 &&
> +         ip_vs_dest_totalconns(dest) >= dest->u_threshold) {
> +             dest->flags |= IP_VS_DEST_F_OVERLOAD;
> +             if (atomic_dec_and_test(&dest->svc->avail_dests)) {
> +                     /* All RS for this service are overloaded */
> +                     dest->svc->flags |= IP_VS_SVC_F_OVERLOAD;
> +             }
> +     }
>  }

... and I still got it wrong :). Of course we need to check whether
IP_VS_DEST_F_OVERLOAD is already set, or else we get a
dest->svc->avail_dests smashing fest.
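The fix, in a minimal userspace sketch (not the kernel patch itself: plain ints stand in for atomic_t, a hardcoded totalconns replaces ip_vs_dest_totalconns(), and the struct names and the IP_VS_SVC_F_OVERLOAD value are placeholders for illustration):

```c
#include <assert.h>

#define IP_VS_DEST_F_OVERLOAD 0x0002
#define IP_VS_SVC_F_OVERLOAD  0x0001    /* placeholder value for this sketch */

struct svc_sketch  { int avail_dests; unsigned int flags; };
struct dest_sketch {
        unsigned int flags;
        unsigned int u_threshold;
        int totalconns;              /* stands in for ip_vs_dest_totalconns() */
        struct svc_sketch *svc;
};

/* Bind side: decrement avail_dests only on the *transition* into the
 * overloaded state, so repeated calls cannot smash the counter. */
static void bind_threshold(struct dest_sketch *dest)
{
        if (dest->u_threshold != 0 &&
            !(dest->flags & IP_VS_DEST_F_OVERLOAD) &&
            (unsigned int)dest->totalconns >= dest->u_threshold) {
                dest->flags |= IP_VS_DEST_F_OVERLOAD;
                if (--dest->svc->avail_dests == 0)
                        /* All RS for this service are overloaded */
                        dest->svc->flags |= IP_VS_SVC_F_OVERLOAD;
        }
}
```

Calling bind_threshold() a second time on an already-overloaded destination is now a no-op, which is exactly the property the missing flag check broke.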

> +     } else {
> +             /* It is a persistent connection/template, so decrease
> +                the persistent connection counter */
> +             atomic_dec(&dest->persistconns);
> +     }
> +
> +     IP_VS_DBG(3, "Unbind-dest: Threshold handling: avail_dests=%d\n",
> +                     atomic_read(&dest->svc->avail_dests));
> +     if (dest->l_threshold != 0) {
> +             /* This implies that the upper threshold is != 0 as well */
> +             if (ip_vs_dest_totalconns(dest) <= dest->l_threshold) {
> +                     dest->flags &= ~IP_VS_DEST_F_OVERLOAD;
> +                     atomic_inc(&dest->svc->avail_dests);
> +                     dest->svc->flags &= ~IP_VS_SVC_F_OVERLOAD;
> +             }
> +     } else {
> +             /* We drop in here if the upper threshold is != 0 and the
> +                lower threshold is == 0. */
> +             if (dest->flags & IP_VS_DEST_F_OVERLOAD) {
> +                     dest->flags &= ~IP_VS_DEST_F_OVERLOAD;
> +                     atomic_inc(&dest->svc->avail_dests);
> +                     dest->svc->flags &= ~IP_VS_SVC_F_OVERLOAD;
> +             }
>       }

Same smashing as above; this needs a check for !IP_VS_DEST_F_OVERLOAD.
I also wonder whether the second part of the if-block is really needed.
It actually wastes cycles and is a relic of the 2.6.x code, which does
the threshold handling in a really weird way.
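With the guard in place, the unbind side collapses so that the separate l_threshold == 0 branch disappears. Again a minimal userspace sketch (plain ints for atomic_t, a hardcoded totalconns for ip_vs_dest_totalconns(); the names and the IP_VS_SVC_F_OVERLOAD value are placeholders, not the kernel's):

```c
#include <assert.h>

#define IP_VS_DEST_F_OVERLOAD 0x0002
#define IP_VS_SVC_F_OVERLOAD  0x0001    /* placeholder value for this sketch */

struct svc_sk  { int avail_dests; unsigned int flags; };
struct dest_sk {
        unsigned int flags;
        unsigned int l_threshold;
        int totalconns;              /* stands in for ip_vs_dest_totalconns() */
        struct svc_sk *svc;
};

/* Unbind side: bail out unless the destination is actually marked
 * overloaded, so avail_dests can never be incremented twice.  With
 * that guard, the explicit l_threshold == 0 branch of the original
 * patch becomes the common fall-through path. */
static void unbind_threshold(struct dest_sk *dest)
{
        if (!(dest->flags & IP_VS_DEST_F_OVERLOAD))
                return;
        if (dest->l_threshold != 0 &&
            (unsigned int)dest->totalconns > dest->l_threshold)
                return;                 /* still above the lower mark */
        dest->flags &= ~IP_VS_DEST_F_OVERLOAD;
        dest->svc->avail_dests++;
        dest->svc->flags &= ~IP_VS_SVC_F_OVERLOAD;
}
```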

> -
> -             /*
> -              * Simply decrease the refcnt of the template,
> -              * don't restart its timer.
> -              */
> -             atomic_dec(&ct->refcnt);
> +             __ip_vs_conn_put(ct);
>               return 0;

I would like to push this cleanup to Marcelo with the next batch of
2.4.x stuff.

>       }
>       return 1;
> @@ -1270,7 +1318,7 @@
>       ip_vs_conn_hash(cp);
>  
>    expire_later:
> -     IP_VS_DBG(7, "delayed: refcnt-1=%d conn.n_control=%d\n",
> +     IP_VS_DBG(7, "delayed: conn->refcnt-1=%d conn.n_control=%d\n",
>                 atomic_read(&cp->refcnt)-1,
>                 atomic_read(&cp->n_control));

We should really enhance the debugging, so I hope to find some time to
prepare a cleanup for 2.6.x and 2.4.x.

>       /*
> -      *    Increase the inactive connection counter
> -      *    because it is in Syn-Received
> -      *    state (inactive) when the connection is created.
> -      */
> -     atomic_inc(&dest->inactconns);
> -
> -     /*
>        *    Add its control
>        */
>       ip_vs_control_add(cp, ct);
> @@ -369,14 +362,8 @@
>       if (cp == NULL)
>               return NULL;
>  
> -     /*
> -      *    Increase the inactive connection counter because it is in
> -      *    Syn-Received state (inactive) when the connection is created.
> -      */
> -     atomic_inc(&dest->inactconns);
> -

These changes have been tested widely and Julian ACKs them too, so I'd
like to push them for 2.4.x inclusion as well.

>               if (sysctl_ip_vs_expire_nodest_conn) {
>                       /* try to expire the connection immediately */
>                       ip_vs_conn_expire_now(cp);
> -             } else {
> -                     /* don't restart its timer, and silently
> -                        drop the packet. */
> -                     __ip_vs_conn_put(cp);
>               }
> +             /* don't restart its timer, and silently
> +                drop the packet. */
> +             __ip_vs_conn_put(cp);
>               return NF_DROP;

This has been tested thoroughly and was also sent to Marcelo, who will
include it one of these days.

Updates will follow.

Best regards,
Roberto Nibali, ratz
-- 
-------------------------------------------------------------
addr://Kasinostrasse 30, CH-5001 Aarau tel://++41 62 823 9355
http://www.terreactive.com             fax://++41 62 823 9356
-------------------------------------------------------------
terreActive AG                       Wir sichern Ihren Erfolg
-------------------------------------------------------------
