Hello,
On Thu, 16 Apr 2020, yunhong-cgl jiang wrote:
> Hi, Simon & Julian,
> We noticed that on our kubernetes node utilizing IPVS, the
> estimation_timer() takes very long (>200ms as shown below). Such long delay
> on timer softirq causes long packet latency.
>
> <idle>-0 [007] dNH. 25652945.670814: softirq_raise: vec=1 [action=TIMER]
> .....
> <idle>-0 [007] .Ns. 25652945.992273: softirq_exit: vec=1 [action=TIMER]
>
> The long latency is caused by the large number of services (>50k) and
> the large number of CPUs (>80).
>
> We tried to move the timer function into a kernel thread so that it
> will not block the system, and this seems to solve our problem. Is this the
> right direction? If yes, we will do more testing and send out the RFC patch.
> If not, can you give us some suggestions?

Using a kernel thread is a good idea. For this to work, we can
also remove the est_lock and use RCU for est_list.
The writers ip_vs_start_estimator() and ip_vs_stop_estimator() already
run under the common mutex __ip_vs_mutex, so they do not need any
additional synchronization. We need _bh lock usage in estimation_timer().
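
To illustrate the locking idea only, here is a minimal userspace model, not
kernel code and not the actual patch. It assumes C11 atomics and a pthread
mutex as stand-ins: the mutex plays the role of the common writer mutex, and
the release/acquire pair imitates what helpers such as list_add_rcu() and
list_for_each_entry_rcu() would presumably provide in the kernel. The names
est_node, est_add and est_sum are hypothetical. Removal is left out on
purpose, because a real RCU list must wait for a grace period before freeing
an entry, and this model does not implement that.

```c
/* Userspace model only. All names below are hypothetical and are not
 * taken from the IPVS sources. */
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

struct est_node {
	uint64_t packets;               /* stand-in for the estimator counters */
	struct est_node *_Atomic next;  /* link read by the lock-free reader */
};

static struct est_node *_Atomic est_head;

/* Plays the role of the common mutex that already serializes the writers. */
static pthread_mutex_t writer_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Writer side: only the common mutex is taken, no list-specific lock. */
static void est_add(struct est_node *n)
{
	pthread_mutex_lock(&writer_mutex);
	atomic_store_explicit(&n->next,
			      atomic_load_explicit(&est_head, memory_order_relaxed),
			      memory_order_relaxed);
	/* Release store: a reader that sees n also sees its initialized fields. */
	atomic_store_explicit(&est_head, n, memory_order_release);
	pthread_mutex_unlock(&writer_mutex);
}

/* Reader side (the estimation pass): walks the list without taking a lock. */
static uint64_t est_sum(void)
{
	uint64_t total = 0;
	struct est_node *n = atomic_load_explicit(&est_head, memory_order_acquire);

	while (n) {
		total += n->packets;
		n = atomic_load_explicit(&n->next, memory_order_acquire);
	}
	return total;
}
```

The point of the model is that the reader never contends with the writers, so
a long pass over many entries does not hold a lock that the add/remove paths
need.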
Let me know if you need any help with the patch.
Regards
--
Julian Anastasov <ja@xxxxxx>