[PATCH 0/4] sysctl: Remove sentinel elements from networking

To: "David S. Miller" <davem@xxxxxxxxxxxxx>, Eric Dumazet <edumazet@xxxxxxxxxx>, Jakub Kicinski <kuba@xxxxxxxxxx>, Paolo Abeni <pabeni@xxxxxxxxxx>, Alexander Aring <alex.aring@xxxxxxxxx>, Stefan Schmidt <stefan@xxxxxxxxxxxxxxxxxx>, Miquel Raynal <miquel.raynal@xxxxxxxxxxx>, David Ahern <dsahern@xxxxxxxxxx>, Steffen Klassert <steffen.klassert@xxxxxxxxxxx>, Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>, Matthieu Baerts <matttbe@xxxxxxxxxx>, Mat Martineau <martineau@xxxxxxxxxx>, Geliang Tang <geliang@xxxxxxxxxx>, Ralf Baechle <ralf@xxxxxxxxxxxxxx>, Remi Denis-Courmont <courmisch@xxxxxxxxx>, Allison Henderson <allison.henderson@xxxxxxxxxx>, David Howells <dhowells@xxxxxxxxxx>, Marc Dionne <marc.dionne@xxxxxxxxxxxx>, Marcelo Ricardo Leitner <marcelo.leitner@xxxxxxxxx>, Xin Long <lucien.xin@xxxxxxxxx>, Wenjia Zhang <wenjia@xxxxxxxxxxxxx>, Jan Karcher <jaka@xxxxxxxxxxxxx>, "D. Wythe" <alibuda@xxxxxxxxxxxxxxxxx>, Tony Lu <tonylu@xxxxxxxxxxxxxxxxx>, Wen Gu <guwen@xxxxxxxxxxxxxxxxx>, Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>, Anna Schumaker <anna@xxxxxxxxxx>, Chuck Lever <chuck.lever@xxxxxxxxxx>, Jeff Layton <jlayton@xxxxxxxxxx>, Neil Brown <neilb@xxxxxxx>, Olga Kornievskaia <kolga@xxxxxxxxxx>, Dai Ngo <Dai.Ngo@xxxxxxxxxx>, Tom Talpey <tom@xxxxxxxxxx>, Jon Maloy <jmaloy@xxxxxxxxxx>, Ying Xue <ying.xue@xxxxxxxxxxxxx>, Martin Schiller <ms@xxxxxxxxxx>, Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx>, Jozsef Kadlecsik <kadlec@xxxxxxxxxxxxx>, Florian Westphal <fw@xxxxxxxxx>, Roopa Prabhu <roopa@xxxxxxxxxx>, Nikolay Aleksandrov <razor@xxxxxxxxxxxxx>, Simon Horman <horms@xxxxxxxxxxxx>, Julian Anastasov <ja@xxxxxx>, Joerg Reuter <jreuter@xxxxxxxx>, Luis Chamberlain <mcgrof@xxxxxxxxxx>, Kees Cook <keescook@xxxxxxxxxxxx>
Subject: [PATCH 0/4] sysctl: Remove sentinel elements from networking
Cc: netdev@xxxxxxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx, dccp@xxxxxxxxxxxxxxx, linux-wpan@xxxxxxxxxxxxxxx, mptcp@xxxxxxxxxxxxxxx, linux-hams@xxxxxxxxxxxxxxx, linux-rdma@xxxxxxxxxxxxxxx, rds-devel@xxxxxxxxxxxxxx, linux-afs@xxxxxxxxxxxxxxxxxxx, linux-sctp@xxxxxxxxxxxxxxx, linux-s390@xxxxxxxxxxxxxxx, linux-nfs@xxxxxxxxxxxxxxx, tipc-discussion@xxxxxxxxxxxxxxxxxxxxx, linux-x25@xxxxxxxxxxxxxxx, netfilter-devel@xxxxxxxxxxxxxxx, coreteam@xxxxxxxxxxxxx, bridge@xxxxxxxxxxxxxxx, lvs-devel@xxxxxxxxxxxxxxx, Joel Granados <j.granados@xxxxxxxxxxx>
From: Joel Granados via B4 Relay <devnull+j.granados.samsung.com@xxxxxxxxxx>
Date: Thu, 14 Mar 2024 20:20:40 +0100
From: Joel Granados <j.granados@xxxxxxxxxxx>

What?
These commits remove the sentinel element (last empty element) from the
sysctl arrays of all the files under the "net/" directory that register
a sysctl array. The merging of the preparation patches [4] to mainline
allows us to just remove sentinel elements without changing behavior.
This is safe because the sysctl registration code (register_sysctl() and
friends) uses the array size in addition to checking for a sentinel [1].
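
To illustrate the point above, here is a minimal userspace sketch (not
kernel code) of why a sentinel becomes redundant once the array size is
passed along. struct mock_ctl_table, MOCK_ARRAY_SIZE and both count_*
helpers are hypothetical names made up for this example; the real
registration code lives in fs/proc/proc_sysctl.c and differs in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct ctl_table; only the
 * fields needed for the illustration are kept. */
struct mock_ctl_table {
	const char *procname;
	int *data;
};

#define MOCK_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

static int val_a, val_b;

/* Old style: a trailing empty element marks the end of the array. */
static struct mock_ctl_table with_sentinel[] = {
	{ .procname = "a", .data = &val_a },
	{ .procname = "b", .data = &val_b },
	{ .procname = NULL }
};

/* New style: no sentinel; the caller supplies the size instead. */
static struct mock_ctl_table without_sentinel[] = {
	{ .procname = "a", .data = &val_a },
	{ .procname = "b", .data = &val_b },
};

/* Walk that relies only on the sentinel to know where to stop. */
static size_t count_until_sentinel(const struct mock_ctl_table *t)
{
	size_t n = 0;

	assert(t != NULL);
	while (t[n].procname)
		n++;
	return n;
}

/* Walk bounded by the array size that still stops early at a sentinel,
 * so it accepts both array styles. */
static size_t count_bounded(const struct mock_ctl_table *t, size_t size)
{
	size_t n = 0;

	assert(t != NULL);
	while (n < size && t[n].procname)
		n++;
	return n;
}
```

With the size available, count_bounded() visits the same two entries in
both arrays, and the array without the sentinel is exactly one element
smaller.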

Why?
By removing the sysctl sentinel elements we avoid kernel bloat as
ctl_table arrays get moved out of kernel/sysctl.c into their own
respective subsystems. This move was started long ago to avoid merge
conflicts; the sentinel removal came later, after Matthew Wilcox suggested
it to avoid bloating the kernel by one element as arrays moved out. This
patchset will reduce the overall build-time size of the kernel and the
run-time memory bloat by about 64 bytes per declared ctl_table array (more
info in [5]).
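
The 64-byte figure is the size of one ctl_table element on x86_64. The
sketch below shows where such a size can come from. The field list in
struct mock_ctl_table is an assumption, loosely modelled on the kernel
structure at the time of this series, and sentinel_bytes_saved() is a
hypothetical helper; check include/linux/sysctl.h for the real
definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed field list; an illustration, not the kernel's definition. */
struct mock_ctl_table {
	const char *procname;
	void *data;
	int maxlen;
	unsigned short mode;        /* stands in for umode_t */
	int type;                   /* stands in for the table type enum */
	int (*proc_handler)(void);  /* simplified handler signature */
	void *poll;
	void *extra1;
	void *extra2;
};

/* Bytes saved when n_arrays arrays each drop one trailing sentinel. */
static size_t sentinel_bytes_saved(size_t n_arrays)
{
	/* Guard against overflow in the multiplication below. */
	assert(n_arrays <= SIZE_MAX / sizeof(struct mock_ctl_table));
	return n_arrays * sizeof(struct mock_ctl_table);
}
```

On a typical LP64 target, six 8-byte pointers plus the padded integer
fields bring this mock structure to 64 bytes, so each removed sentinel
accounts for one such element.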

When are we done?
There are 4 patchsets (25 commits [2]) still outstanding before the
sentinels are completely removed: files under "net/" (this patchset),
files under "kernel/", misc dirs (files under mm/, security/ and
others), and the final set that removes the now-unneeded check for
->procname == NULL.

Testing:
* Ran sysctl selftests (./tools/testing/selftests/sysctl/sysctl.sh)
* Ran this through 0-day with no errors or warnings

Savings in vmlinux:
  A total of 64 bytes per sentinel is saved after removal; I measured on
  x86_64 to give an idea of the aggregated savings. The actual savings
  will depend on individual kernel configuration.
    * bloat-o-meter
        - The "yesall" config saves 3976 bytes (bloat-o-meter output [6])
        - A reduced config [3] saves 1255 bytes net (1263 bytes shrunk,
          8 bytes grown; bloat-o-meter output [7])
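
The bloat-o-meter summary line reports growth and shrinkage separately
and prints the net change in parentheses. As a small sanity check, the
sketch below parses such a line and verifies that the net value equals
"up" plus "down". struct bloat_summary, parse_bloat_summary() and
summary_is_consistent() are hypothetical names for this example and are
not part of scripts/bloat-o-meter.

```c
#include <assert.h>
#include <stdio.h>

/* Fields of a bloat-o-meter summary line, as printed in [6] and [7]. */
struct bloat_summary {
	int added, removed;  /* symbols added / removed */
	int grown, shrunk;   /* symbols that grew / shrank */
	long up, down;       /* bytes gained / bytes lost (down is <= 0) */
	long net;            /* value printed in parentheses */
};

/* Parse one summary line; returns 1 on success, 0 on a malformed line. */
static int parse_bloat_summary(const char *line, struct bloat_summary *s)
{
	assert(line && s);
	return sscanf(line,
		      "add/remove: %d/%d grow/shrink: %d/%d up/down: %ld/%ld (%ld)",
		      &s->added, &s->removed, &s->grown, &s->shrunk,
		      &s->up, &s->down, &s->net) == 7;
}

/* The net change should equal growth plus the (negative) shrinkage. */
static int summary_is_consistent(const struct bloat_summary *s)
{
	return s->up + s->down == s->net;
}
```

Feeding it the two summary lines from [6] and [7] gives 76 + (-4052) =
-3976 and 8 + (-1263) = -1255 respectively.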

Savings in allocated memory:
  None in this set; they will come once the superfluous allocations are
  removed from proc_sysctl.c. I include the figure here for context: the
  estimated savings during boot for config [3] are 6272 bytes. See [8]
  for how to measure it.

Comments/feedback greatly appreciated

Best
Joel

[1] https://lore.kernel.org/all/20230809105006.1198165-1-j.granados@xxxxxxxxxxx/
[2] https://git.kernel.org/pub/scm/linux/kernel/git/joel.granados/linux.git/tag/?h=sysctl_remove_empty_elem_v5
[3] https://gist.github.com/Joelgranados/feaca7af5537156ca9b73aeaec093171
[4] https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@xxxxxxxxxxxxxxxxxxxxxx/

[5]
Links Related to the ctl_table sentinel removal:
* Good summaries from Luis:
  https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@xxxxxxxxxxxxxxxxxxxxxx/
  https://lore.kernel.org/all/ZMFizKFkVxUFtSqa@xxxxxxxxxxxxxxxxxxxxxx/
* Patches adjusting sysctl register calls:
  https://lore.kernel.org/all/20230302204612.782387-1-mcgrof@xxxxxxxxxx/
  https://lore.kernel.org/all/20230302202826.776286-1-mcgrof@xxxxxxxxxx/
* Discussions about expectations and approach
  https://lore.kernel.org/all/20230321130908.6972-1-frank.li@xxxxxxxx
  https://lore.kernel.org/all/20220220060626.15885-1-tangmeng@xxxxxxxxxxxxx

[6]
add/remove: 0/1 grow/shrink: 2/67 up/down: 76/-4052 (-3976)
Function                                     old     new   delta
llc_sysctl_init                              306     377     +71
nf_log_net_init                              866     871      +5
sysctl_core_net_init                         375     366      -9
lowpan_frags_init_net                        618     598     -20
ip_vs_control_net_init_sysctl               2446    2422     -24
sysctl_route_net_init                        521     493     -28
__addrconf_sysctl_register                   678     650     -28
xfrm_sysctl_init                             405     374     -31
mpls_net_init                                367     334     -33
sctp_sysctl_net_register                     386     346     -40
__ip_vs_lblcr_init                           546     501     -45
__ip_vs_lblc_init                            546     501     -45
neigh_sysctl_register                       1011     958     -53
mpls_dev_sysctl_register                     475     419     -56
ipv6_route_sysctl_init                       450     394     -56
xs_tunables_table                            448     384     -64
xr_tunables_table                            448     384     -64
xfrm_table                                   320     256     -64
xfrm6_policy_table                           128      64     -64
xfrm4_policy_table                           128      64     -64
x25_table                                    448     384     -64
vs_vars                                     1984    1920     -64
unix_table                                   128      64     -64
tipc_table                                   448     384     -64
svcrdma_parm_table                           832     768     -64
smc_table                                    512     448     -64
sctp_table                                   256     192     -64
sctp_net_table                              2304    2240     -64
rxrpc_sysctl_table                           704     640     -64
rose_table                                   704     640     -64
rds_tcp_sysctl_table                         192     128     -64
rds_sysctl_rds_table                         384     320     -64
rds_ib_sysctl_table                          384     320     -64
phonet_table                                 128      64     -64
nr_table                                     832     768     -64
nf_log_sysctl_table                          768     704     -64
nf_log_sysctl_ftable                         128      64     -64
nf_ct_sysctl_table                          3200    3136     -64
nf_ct_netfilter_table                        128      64     -64
nf_ct_frag6_sysctl_table                     256     192     -64
netns_core_table                             320     256     -64
net_core_table                              2176    2112     -64
neigh_sysctl_template                       1416    1352     -64
mptcp_sysctl_table                           576     512     -64
mpls_dev_table                               128      64     -64
lowpan_frags_ns_ctl_table                    256     192     -64
lowpan_frags_ctl_table                       128      64     -64
llc_station_table                             64       -     -64
llc2_timeout_table                           320     256     -64
ipv6_table_template                         1344    1280     -64
ipv6_route_table_template                    768     704     -64
ipv6_rotable                                 320     256     -64
ipv6_icmp_table_template                     448     384     -64
ipv4_table                                  1024     960     -64
ipv4_route_table                             832     768     -64
ipv4_route_netns_table                       320     256     -64
ipv4_net_table                              7552    7488     -64
ip6_frags_ns_ctl_table                       256     192     -64
ip6_frags_ctl_table                          128      64     -64
ip4_frags_ns_ctl_table                       320     256     -64
ip4_frags_ctl_table                          128      64     -64
devinet_sysctl                              2184    2120     -64
debug_table                                  384     320     -64
dccp_default_table                           576     512     -64
ctl_forward_entry                            128      64     -64
brnf_table                                   448     384     -64
ax25_param_table                             960     896     -64
atalk_table                                  320     256     -64
addrconf_sysctl                             3904    3840     -64
vs_vars_table                                256     128    -128
Total: Before=440631035, After=440627059, chg -0.00%

[7]
add/remove: 0/0 grow/shrink: 1/22 up/down: 8/-1263 (-1255)
Function                                     old     new   delta
sysctl_route_net_init                        189     197      +8
__addrconf_sysctl_register                   306     294     -12
ipv6_route_sysctl_init                       201     185     -16
neigh_sysctl_register                        385     366     -19
unix_table                                   128      64     -64
netns_core_table                             256     192     -64
net_core_table                              1664    1600     -64
neigh_sysctl_template                       1416    1352     -64
ipv6_table_template                         1344    1280     -64
ipv6_route_table_template                    768     704     -64
ipv6_rotable                                 192     128     -64
ipv6_icmp_table_template                     448     384     -64
ipv4_table                                   768     704     -64
ipv4_route_table                             832     768     -64
ipv4_route_netns_table                       320     256     -64
ipv4_net_table                              7040    6976     -64
ip6_frags_ns_ctl_table                       256     192     -64
ip6_frags_ctl_table                          128      64     -64
ip4_frags_ns_ctl_table                       320     256     -64
ip4_frags_ctl_table                          128      64     -64
devinet_sysctl                              2184    2120     -64
ctl_forward_entry                            128      64     -64
addrconf_sysctl                             3392    3328     -64
Total: Before=8523801, After=8522546, chg -0.01%

[8]
To measure the in-memory savings, apply this diff on top of this patchset.

"
diff --git i/fs/proc/proc_sysctl.c w/fs/proc/proc_sysctl.c
index 37cde0efee57..896c498600e8 100644
--- i/fs/proc/proc_sysctl.c
+++ w/fs/proc/proc_sysctl.c
@@ -966,6 +966,7 @@ static struct ctl_dir *new_dir(struct ctl_table_set *set,
        table[0].procname = new_name;
        table[0].mode = S_IFDIR|S_IRUGO|S_IXUGO;
        init_header(&new->header, set->dir.header.root, set, node, table, 1);
+       printk("%zu sysctl saved mem kzalloc\n", sizeof(struct ctl_table));

        return new;
 }
@@ -1189,6 +1190,7 @@ static struct ctl_table_header *new_links(struct ctl_dir *dir, s>
                link_name += len;
                link++;
        }
+       printk("%zu sysctl saved mem kzalloc\n", sizeof(struct ctl_table));
        init_header(links, dir->header.root, dir->header.set, node, link_table,
                    head->ctl_table_size);
        links->nreg = nr_entries;
"
and then run the following bash script on the booted kernel:

accum=0
for n in $(dmesg | grep kzalloc | awk '{print $3}') ; do
    accum=$((accum + n))
done
echo $accum

Signed-off-by: Joel Granados <j.granados@xxxxxxxxxxx>

--

---
Joel Granados (4):
      networking: Remove the now superfluous sentinel elements from ctl_table array
      netfilter: Remove the now superfluous sentinel elements from ctl_table array
      appletalk: Remove the now superfluous sentinel elements from ctl_table array
      ax.25: Remove the now superfluous sentinel elements from ctl_table array

 net/appletalk/sysctl_net_atalk.c        | 1 -
 net/ax25/sysctl_net_ax25.c              | 5 ++---
 net/bridge/br_netfilter_hooks.c         | 1 -
 net/core/neighbour.c                    | 5 +----
 net/core/sysctl_net_core.c              | 9 ++++-----
 net/dccp/sysctl.c                       | 2 --
 net/ieee802154/6lowpan/reassembly.c     | 6 +-----
 net/ipv4/devinet.c                      | 5 ++---
 net/ipv4/ip_fragment.c                  | 2 --
 net/ipv4/route.c                        | 8 ++------
 net/ipv4/sysctl_net_ipv4.c              | 7 +++----
 net/ipv4/xfrm4_policy.c                 | 1 -
 net/ipv6/addrconf.c                     | 5 +----
 net/ipv6/icmp.c                         | 1 -
 net/ipv6/netfilter/nf_conntrack_reasm.c | 1 -
 net/ipv6/reassembly.c                   | 2 --
 net/ipv6/route.c                        | 5 -----
 net/ipv6/sysctl_net_ipv6.c              | 4 +---
 net/ipv6/xfrm6_policy.c                 | 1 -
 net/llc/sysctl_net_llc.c                | 8 ++------
 net/mpls/af_mpls.c                      | 3 +--
 net/mptcp/ctrl.c                        | 1 -
 net/netfilter/ipvs/ip_vs_ctl.c          | 5 +----
 net/netfilter/ipvs/ip_vs_lblc.c         | 5 +----
 net/netfilter/ipvs/ip_vs_lblcr.c        | 5 +----
 net/netfilter/nf_conntrack_standalone.c | 6 +-----
 net/netfilter/nf_log.c                  | 3 +--
 net/netrom/sysctl_net_netrom.c          | 1 -
 net/phonet/sysctl.c                     | 1 -
 net/rds/ib_sysctl.c                     | 1 -
 net/rds/sysctl.c                        | 1 -
 net/rds/tcp.c                           | 1 -
 net/rose/sysctl_net_rose.c              | 1 -
 net/rxrpc/sysctl.c                      | 1 -
 net/sctp/sysctl.c                       | 6 +-----
 net/smc/smc_sysctl.c                    | 1 -
 net/sunrpc/sysctl.c                     | 1 -
 net/sunrpc/xprtrdma/svc_rdma.c          | 1 -
 net/sunrpc/xprtrdma/transport.c         | 1 -
 net/sunrpc/xprtsock.c                   | 1 -
 net/tipc/sysctl.c                       | 1 -
 net/unix/sysctl_net_unix.c              | 1 -
 net/x25/sysctl_net_x25.c                | 1 -
 net/xfrm/xfrm_sysctl.c                  | 5 +----
 44 files changed, 27 insertions(+), 106 deletions(-)
---
base-commit: e8f897f4afef0031fe618a8e94127a0934896aba
change-id: 20240311-jag-sysctl_remset_net-d403a1a93d6b

Best regards,
-- 
Joel Granados <j.granados@xxxxxxxxxxx>