Hi,
On Tue, Jun 30, 2020 at 07:10:06PM +0300, Julian Anastasov wrote:
>
> Hello,
>
> On Tue, 30 Jun 2020, Simon Horman wrote:
>
> > > diff --git a/net/netfilter/ipvs/ip_vs_conn.c
> > > b/net/netfilter/ipvs/ip_vs_conn.c
> > > index 02f2f636798d..b3921ae92740 100644
> > > --- a/net/netfilter/ipvs/ip_vs_conn.c
> > > +++ b/net/netfilter/ipvs/ip_vs_conn.c
>
> > > @@ -827,14 +852,17 @@ static void ip_vs_conn_expire(struct timer_list *t)
> > >
> > > /* does anybody control me? */
> > > if (ct) {
> > > + bool has_ref = !cp->timeout && __ip_vs_conn_get(ct);
> > > +
> > > ip_vs_control_del(cp);
> > > /* Drop CTL or non-assured TPL if not used anymore */
> > > - if (!cp->timeout && !atomic_read(&ct->n_control) &&
> > > + if (has_ref && !atomic_read(&ct->n_control) &&
> > > (!(ct->flags & IP_VS_CONN_F_TEMPLATE) ||
> > > !(ct->state & IP_VS_CTPL_S_ASSURED))) {
> > > IP_VS_DBG(4, "drop controlling connection\n");
> > > - ct->timeout = 0;
> > > - ip_vs_conn_expire_now(ct);
> > > + ip_vs_conn_del_put(ct);
> >
> > Previously this code did not put the ct, now it does.
> > Is that intentional.
>
> Yes, as ip_vs_conn_expire() now can be called both in
> timer and process context we need a reference for ct while
> calling del_timer() in ip_vs_conn_del_put(). As ct->n_control
> is 0 after our ip_vs_control_del(), ct can be expired by
> timer while we are trying to del it here.
>
> > > @@ -1341,19 +1368,15 @@ static void ip_vs_conn_flush(struct netns_ipvs
> > > *ipvs)
> > > hlist_for_each_entry_rcu(cp, &ip_vs_conn_tab[idx], c_list) {
> > > if (cp->ipvs != ipvs)
> > > continue;
> > > - /* As timers are expired in LIFO order, restart
> > > - * the timer of controlling connection first, so
> > > - * that it is expired after us.
> > > - */
> > > + if (atomic_read(&cp->n_control))
> > > + continue;
> > > cp_c = cp->control;
> > > - /* cp->control is valid only with reference to cp */
> > > - if (cp_c && __ip_vs_conn_get(cp)) {
> > > + IP_VS_DBG(4, "del connection\n");
> > > + ip_vs_conn_del(cp);
> > > + if (cp_c && !atomic_read(&cp_c->n_control)) {
> > > IP_VS_DBG(4, "del controlling connection\n");
> > > - ip_vs_conn_expire_now(cp_c);
> > > - __ip_vs_conn_put(cp);
> > > + ip_vs_conn_del(cp_c);
> >
> > Conversely, previously this code put the ct, now it doesn't.
> > Is that also intentional?
>
> Now we do not get reference to cp because in RCU
> section the cp structure can not go away. So, we have an
> implicit reference to cp. Same for its cp->control (ct).
> The conn structures are freed later in RCU callback.
>
> In this case we may run del_timer() after
> another CPU, eg. after ip_vs_conn_expire() was already
> called after timer expire or after ip_vs_conn_del*(). But
> for us del_timer will not succeed.
Thanks for the explanation. This now looks good to me.
Reviewed-by: Simon Horman <horms@xxxxxxxxxxxx>
Pablo, could you consider applying this to nf-next?
|