Hello,
Thank you for your comments.
On Sat, Mar 24, 2018 at 01:26:04PM +0200, Julian Anastasov wrote:
>
> Hello,
>
> Only few notes below. If all goes well, v4 can be
> the last version... :)
>
> On Wed, 21 Mar 2018, Inju Song wrote:
>
> > Implements the Google's Maglev hashing algorithm as a IPVS scheduler.
> >
> > Basically it provides consistent hashing but offers some special
> > features about disruption and load balancing.
> >
> > 1) minimal disruption: when the set of destinations changes,
> > a connection will likely be sent to the same destination
> > as it was before.
> >
> > 2) load balancing: each destination will receive an almost
> > equal number of connections.
> >
> > Seel also for detail: [3.4 Consistent Hasing] in
> > https://www.usenix.org/system/files/conference/nsdi16/nsdi16-paper-eisenbud.pdf
>
> Above commit message is properly indented but comments in
> patches 1 and 3 should not be indented, there should not be
> spaces at left.
>
ok. I will remove indentions in patch 1 and 3.
> > Signed-off-by: Inju Song <inju.song@xxxxxxxxxxxxx>
> > ---
> > net/netfilter/ipvs/ip_vs_mh.c | 532
> > ++++++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 532 insertions(+)
> > create mode 100644 net/netfilter/ipvs/ip_vs_mh.c
> >
> > diff --git a/net/netfilter/ipvs/ip_vs_mh.c b/net/netfilter/ipvs/ip_vs_mh.c
> > new file mode 100644
> > index 0000000..e531100
> > --- /dev/null
> > +++ b/net/netfilter/ipvs/ip_vs_mh.c
> > @@ -0,0 +1,532 @@
>
> > + IP_VS_DBG_BUF(6,
> > + "MH: selected unavailable server %s:%d (offset
> > %d), reselecting",
> > + IP_VS_DBG_ADDR(dest->af, &dest->addr),
> > + ntohs(dest->port), roffset);
>
> As some vars are unsigned int, make sure all %d in
> debug messages are changed to %u, for consistency. This
> also includes all ports. I guess, these %d come from the
> SH scheduler.
>
Yes, these printing format was from SH scheduler..
I will fix these properly.
> > +static void ip_vs_mh_state_free(struct rcu_head *head)
> > +{
> > + struct ip_vs_mh_state *s;
> > +
> > + s = container_of(head, struct ip_vs_mh_state, rcu_head);
> > + kfree(s->lookup);
> > + kfree(s);
> > +}
> > +
> > +static int ip_vs_mh_init_svc(struct ip_vs_service *svc)
> > +{
> > + struct ip_vs_mh_state *s;
> > +
> > + /* allocate the MH table for this service */
> > + s = kzalloc(sizeof(*s), GFP_KERNEL);
> > + if (!s)
> > + return -ENOMEM;
> > +
> > + s->lookup = kcalloc(IP_VS_MH_TAB_SIZE, sizeof(struct ip_vs_mh_lookup),
> > + GFP_KERNEL);
> > + if (!s->lookup) {
> > + kfree(s);
> > + return -ENOMEM;
> > + }
> > +
> > + generate_hash_secret(&s->hash1, &s->hash2);
> > + s->gcd = ip_vs_mh_gcd_weight(svc);
> > + s->rshift = ip_vs_mh_shift_weight(svc, s->gcd);
> > +
> > + svc->sched_data = s;
> > + IP_VS_DBG(6,
> > + "MH lookup table (memory=%zdbytes) allocated for current
> > service\n",
> > + sizeof(struct ip_vs_mh_lookup) * IP_VS_MH_TAB_SIZE);
> > +
> > + /* permutate & populate with current dests */
> > + return ip_vs_mh_reassign(s, svc);
>
> Looks like we must not set svc->sched_data on error.
> We can do it like this:
>
> int ret;
>
> ...
> IP_VS_DBG()
> ret = ip_vs_mh_reassign(s, svc);
> if (ret < 0) {
> ip_vs_mh_reset(s); /* This is here for completeness */
> ip_vs_mh_state_free(&s->rcu_head);
> return ret;
> }
> /* No more failures, attach state */
> svc->sched_data = s;
> return 0;
>
> It should be ok because we use svc->sched_data only
> from entry points, the scheduler does not use it internally.
>
OK. Above case will be handled.
> > +}
> > +
> > +static void ip_vs_mh_done_svc(struct ip_vs_service *svc)
> > +{
> > + struct ip_vs_mh_state *s = svc->sched_data;
> > +
> > + /* got to clean up hash buckets here */
> > + ip_vs_mh_reset(s);
> > +
> > + /* release the table itself */
> > + ip_vs_mh_state_free(&s->rcu_head);
>
> Above should be:
>
> call_rcu(&s->rcu_head, ip_vs_mh_state_free);
>
> As result, the RCU callback will be called after
> RCU grace period giving chance to RCU readers to stop
> using the svc->sched_data state.
>
Ok, I missed adding to schedule rcu callback for
freeing the state. It will be fixed. Thanks.
> > + IP_VS_DBG(6, "MH lookup table (memory=%zdbytes) released\n",
> > + sizeof(struct ip_vs_mh_lookup) * IP_VS_MH_TAB_SIZE);
>
Regards
--
Inju Song
NAVER Corporation
--
To unsubscribe from this list: send the line "unsubscribe lvs-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
|