On Thu, May 03, 2018 at 10:02:18PM +0300, Julian Anastasov wrote:
> Local clients are not properly synchronized on 32-bit CPUs when
> updating stats (3.10+). Now it is possible estimation_timer (timer),
> a stats reader, to interrupt the local client in the middle of
> write_seqcount_{begin,end} sequence leading to loop (DEADLOCK).
> The same interrupt can happen from received packet (SoftIRQ)
> which updates the same per-CPU stats.
>
> Fix it by disabling BH while updating stats.
>
> Found with debug:
>
> WARNING: inconsistent lock state
> 4.17.0-rc2-00105-g35cb6d7-dirty #2 Not tainted
> --------------------------------
> inconsistent {IN-SOFTIRQ-R} -> {SOFTIRQ-ON-W} usage.
> ftp/2545 [HC0[0]:SC0[0]:HE1:SE1] takes:
> 86845479 (&syncp->seq#6){+.+-}, at: ip_vs_schedule+0x1c5/0x59e [ip_vs]
> {IN-SOFTIRQ-R} state was registered at:
> lock_acquire+0x44/0x5b
> estimation_timer+0x1b3/0x341 [ip_vs]
> call_timer_fn+0x54/0xcd
> run_timer_softirq+0x10c/0x12b
> __do_softirq+0xc1/0x1a9
> do_softirq_own_stack+0x1d/0x23
> irq_exit+0x4a/0x64
> smp_apic_timer_interrupt+0x63/0x71
> apic_timer_interrupt+0x3a/0x40
> default_idle+0xa/0xc
> arch_cpu_idle+0x9/0xb
> default_idle_call+0x21/0x23
> do_idle+0xa0/0x167
> cpu_startup_entry+0x19/0x1b
> start_secondary+0x133/0x182
> startup_32_smp+0x164/0x168
> irq event stamp: 42213
>
> other info that might help us debug this:
> Possible unsafe locking scenario:
>
> CPU0
> ----
> lock(&syncp->seq#6);
> <Interrupt>
> lock(&syncp->seq#6);
>
> *** DEADLOCK ***
>
> Fixes: ac69269a45e8 ("ipvs: do not disable bh for long time")
> Signed-off-by: Julian Anastasov <ja@xxxxxx>
Acked-by: Simon Horman <horms@xxxxxxxxxxxx>
Pablo, can you take this into nf?
--
To unsubscribe from this list: send the line "unsubscribe lvs-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
|