LVS
lvs-devel
Google
 
Web LinuxVirtualServer.org

Re: [RFC][PATCH v2 19/31] timers: net: Use del_timer_shutdown() before f

To: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Subject: Re: [RFC][PATCH v2 19/31] timers: net: Use del_timer_shutdown() before freeing timer
Cc: linux-kernel@xxxxxxxxxxxxxxx, Thomas Gleixner <tglx@xxxxxxxxxxxxx>, Stephen Boyd <sboyd@xxxxxxxxxx>, Guenter Roeck <linux@xxxxxxxxxxxx>, Jesse Brandeburg <jesse.brandeburg@xxxxxxxxx>, Tony Nguyen <anthony.l.nguyen@xxxxxxxxx>, "David S. Miller" <davem@xxxxxxxxxxxxx>, Eric Dumazet <edumazet@xxxxxxxxxx>, Jakub Kicinski <kuba@xxxxxxxxxx>, Paolo Abeni <pabeni@xxxxxxxxxx>, Mirko Lindner <mlindner@xxxxxxxxxxx>, Stephen Hemminger <stephen@xxxxxxxxxxxxxxxxxx>, Martin KaFai Lau <martin.lau@xxxxxxxxxx>, Alexei Starovoitov <ast@xxxxxxxxxx>, Kuniyuki Iwashima <kuniyu@xxxxxxxxxx>, Pavel Begunkov <asml.silence@xxxxxxxxx>, Menglong Dong <imagedong@xxxxxxxxxxx>, linux-usb@xxxxxxxxxxxxxxx, linux-wireless@xxxxxxxxxxxxxxx, bridge@xxxxxxxxxxxxxxxxxxxxxxxxxx, netfilter-devel@xxxxxxxxxxxxxxx, coreteam@xxxxxxxxxxxxx, lvs-devel@xxxxxxxxxxxxxxx, linux-afs@xxxxxxxxxxxxxxxxxxx, linux-nfs@xxxxxxxxxxxxxxx, tipc-discussion@xxxxxxxxxxxxxxxxxxxxx
From: Steven Rostedt <rostedt@xxxxxxxxxxx>
Date: Thu, 27 Oct 2022 17:07:20 -0400
On Thu, 27 Oct 2022 13:48:54 -0700
Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Thu, Oct 27, 2022 at 1:34 PM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> >
> > What about del_timer_try_shutdown(), that if it removes the timer, it sets
> > the function to NULL (making it equivalent to a successful shutdown),
> > otherwise it does nothing. Allowing the the timer to be rearmed.  
> 
> Sounds sane to me and should work, but as mentioned, I think the
> networking people need to say "yeah" too.
> 
> And maybe that function can also disallow any future re-arming even
> for the case where the timer couldn't be actively removed.

Well, I think this current use case will break if we prevent the timer from
being rearmed or run again if it's not found. But as you said, the
networking folks need to confirm or deny it.

The fact that it does the sock_put() when it removes the timer makes me
think that it can be called again, and we shouldn't prevent that from
happening.

The debug code will let us know too, as it only "frees" it for freeing if
it deactivated the timer and shut it down.

> 
> So any *currently* active timer wouldn't be waited for (either because
> locking may make that a deadlock situation, or simply due to
> performance issues), but at least it would guarantee that no new timer
> activations can happen.
> 
> Because I do like the whole notion of "timer has been shutdown and
> cannot be used as a timer any more without re-initializing it" being a
> real state - even for a timer that may be "currently in flight".
> 
> So this all sounds very worthwhile to me, but I'm not surprised that
> we have code that then knows about all the subtleties of "del_timer()
> might still have a running timer" and actually take advantage of it
> (where "advantage" is likely more of a "deal with the complexities"
> rather than anything really positive ;)

Good to hear. This has been a thorn in our side as we keep hitting these
crashes in the timer code that look like a timer was freed before it
triggered.

> 
> And those existing subtle users might want particular semantics to at
> least make said complexities easier.
> 

Yeah, as someone told me recently, "If you let them play long enough without
setting out the rules, they will take advantage of everything and it will be
extremely hard to get them back in order".

-- Steve


<Prev in Thread] Current Thread [Next in Thread>