On Mon, Oct 31, 2022 at 04:56:43PM +0200, Julian Anastasov wrote:
> Use the provided u64_stats_t type to avoid
> load/store tearing.
>
> Fixes: 316580b69d0a ("u64_stats: provide u64_stats_t type")
> Signed-off-by: Julian Anastasov <ja@xxxxxx>
> ---
> include/net/ip_vs.h | 10 +++++-----
> net/netfilter/ipvs/ip_vs_core.c | 30 +++++++++++++++---------------
> net/netfilter/ipvs/ip_vs_ctl.c | 10 +++++-----
> net/netfilter/ipvs/ip_vs_est.c | 20 ++++++++++----------
> 4 files changed, 35 insertions(+), 35 deletions(-)
>
> diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
> index e5582c01a4a3..a4d44138c2a8 100644
> --- a/include/net/ip_vs.h
> +++ b/include/net/ip_vs.h
> @@ -351,11 +351,11 @@ struct ip_vs_seq {
>
> /* counters per cpu */
> struct ip_vs_counters {
> - __u64 conns; /* connections scheduled */
> - __u64 inpkts; /* incoming packets */
> - __u64 outpkts; /* outgoing packets */
> - __u64 inbytes; /* incoming bytes */
> - __u64 outbytes; /* outgoing bytes */
> + u64_stats_t conns; /* connections scheduled */
> + u64_stats_t inpkts; /* incoming packets */
> + u64_stats_t outpkts; /* outgoing packets */
> + u64_stats_t inbytes; /* incoming bytes */
> + u64_stats_t outbytes; /* outgoing bytes */
> };
> /* Stats per cpu */
> struct ip_vs_cpu_stats {
Converting the per cpu stat to u64_stats_t means that the compiler cannot
optimize the memory access and addition on x86_64. Previously, the summation of
per cpu counters in ip_vs_chain_estimation() looked like:
15b65: add (%rsi),%r14
15b68: add 0x8(%rsi),%r15
15b6c: add 0x10(%rsi),%r13
15b70: add 0x18(%rsi),%r12
15b74: add 0x20(%rsi),%rbp
The u64_stats_read() calls in ip_vs_chain_estimation() turned it into:
159d5: mov (%rcx),%r11
159d8: mov 0x8(%rcx),%r10
159dc: mov 0x10(%rcx),%r9
159e0: mov 0x18(%rcx),%rdi
159e4: mov 0x20(%rcx),%rdx
159e8: add %r11,%r14
159eb: add %r10,%r13
159ee: add %r9,%r12
159f1: add %rdi,%rbp
159f4: add %rdx,%rbx
I guess that is not a big deal because the mov should be the instruction taking
the most time on account of accessing per cpu regions of other CPUs. The add
will be fast.
--
Jiri Wiesner
SUSE Labs
|