On Fri, Sep 09, 2022 at 01:21:05AM +0300, Julian Anastasov wrote:
> It is interesting to know what value for
> IPVS_EST_TICK_CHAINS to use, it is used for the
> IPVS_EST_MAX_COUNT calculation. We should determine
> it from tests once the loops are in final form.
> Now the limit increased a little bit to 38400.
> Tomorrow I'll check again the patches for possible
> problems.
I couldn't wait, so I ran tests on various machines and used the
sched_switch tracepoint to measure the time needed to process one chain. The
table lists the median time for processing one chain, the maximum time
measured, the median divided by the number of CPUs, and the projected time to
process one chain if there were 1024 CPUs of that type in a machine:
 NR  CPU                                 Time(ms)  Max(ms)  Time/CPU(ms)  1024 CPUs(ms)
 48  Intel Xeon CPU E5-2670 v3, 2 nodes     1.220    1.343         0.025         26.027
 64  Intel Xeon Gold 6326, 2 nodes          0.920    1.494         0.014         14.720
192  Intel Xeon Gold 6330H, 4 nodes         3.957    4.153         0.021         21.104
256  AMD EPYC 7713, 2 NUMA nodes            3.927    5.464         0.015         15.708
 80  ARM Neoverse-N1, 1 NUMA node           1.833    2.502         0.023         23.462
128  ARM Kunpeng 920, 4 NUMA nodes          3.822    4.635         0.030         30.576
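
For anyone who wants to re-derive the last two columns, here is a minimal
sketch. It assumes the projection is a plain linear extrapolation (median time
divided by the CPU count, multiplied by 1024), which is what the numbers are
consistent with. The names MEDIAN_MS, per_cpu_ms() and projected_ms() are made
up for this illustration and are not part of the IPVS code.

```python
# Median time (ms) to process one chain, copied from the table:
# (number of CPUs, CPU model, median time in ms).
MEDIAN_MS = [
    (48, "Intel Xeon CPU E5-2670 v3", 1.220),
    (64, "Intel Xeon Gold 6326", 0.920),
    (192, "Intel Xeon Gold 6330H", 3.957),
    (256, "AMD EPYC 7713", 3.927),
    (80, "ARM Neoverse-N1", 1.833),
    (128, "ARM Kunpeng 920", 3.822),
]


def per_cpu_ms(nr_cpus, chain_ms):
    """Time spent per CPU while processing one chain."""
    return chain_ms / nr_cpus


def projected_ms(nr_cpus, chain_ms, target_cpus=1024):
    """Linear extrapolation to target_cpus (assumes cost scales with CPU count)."""
    return per_cpu_ms(nr_cpus, chain_ms) * target_cpus


for nr_cpus, model, chain_ms in MEDIAN_MS:
    print(f"{nr_cpus:4d}  {model:26s}  "
          f"{per_cpu_ms(nr_cpus, chain_ms):.3f}  "
          f"{projected_ms(nr_cpus, chain_ms):.3f}")
```

The linear model ignores NUMA and cache effects, so the projected values are
only a rough guide.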
I have to admit I was hoping the current IPVS_EST_CHAIN_DEPTH would work on
machines with more than 1024 CPUs. If the max time values are used, the time
needed to process one chain on a 1024 CPU machine gets even closer to 40 ms,
which it must not reach lest the estimates become inaccurate. I also have
profiling data, so I intend to look at the disassembly of
ip_vs_estimation_kthread() to see which instructions take the most time. I will
take a look at v2 of the code on Monday.
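
To put a number on "closer to 40 ms", here is a rough sketch that applies the
same linear scaling to the Max(ms) column. It assumes the per-chain cost grows
in proportion to the CPU count, and takes 40 ms as the limit mentioned above.
LIMIT_MS, MAX_MS and the helper names are hypothetical, chosen only for this
illustration.

```python
LIMIT_MS = 40.0  # per-chain time that must not be reached

# Maximum measured time (ms) to process one chain, from the Max(ms) column:
# (number of CPUs, CPU model, max time in ms).
MAX_MS = [
    (48, "Intel Xeon CPU E5-2670 v3", 1.343),
    (64, "Intel Xeon Gold 6326", 1.494),
    (192, "Intel Xeon Gold 6330H", 4.153),
    (256, "AMD EPYC 7713", 5.464),
    (80, "ARM Neoverse-N1", 2.502),
    (128, "ARM Kunpeng 920", 4.635),
]


def projected_max_ms(nr_cpus, max_ms, target_cpus=1024):
    """Max time scaled linearly to target_cpus (an assumption, not measured)."""
    return max_ms / nr_cpus * target_cpus


def cpus_at_limit(nr_cpus, max_ms, limit_ms=LIMIT_MS):
    """CPU count at which the linearly scaled max time would hit limit_ms."""
    return limit_ms / (max_ms / nr_cpus)


# Pick the machine type with the largest projected time on 1024 CPUs.
worst = max(MAX_MS, key=lambda row: projected_max_ms(row[0], row[2]))
worst_ms = projected_max_ms(worst[0], worst[2])
worst_cpus = cpus_at_limit(worst[0], worst[2])

print(f"worst case: {worst[1]}, {worst_ms:.3f} ms on 1024 CPUs")
print(f"limit of {LIMIT_MS:.0f} ms reached at about {worst_cpus:.0f} CPUs")
```

Under these assumptions the slowest machine type leaves only a few
milliseconds of headroom at 1024 CPUs.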
--
Jiri Wiesner
SUSE Labs