Hi Peter,
Peter Mueller wrote:
You guys have been busy. I'm glad I got to sleep ;)
Good for you, then you can take over, because I'm busy conducting
high speed packet filtering tests and hacking procps :)
Ratz here is choice cuts from IPaddr.
Ouch, who wrote that? And why is this person not using ip? :)
Someday you will have to mail me an explanation of why /proc/slabinfo is so
useful and uh.. what it is. It makes me want to BBQ.
Ok, since I've had a hard time reading the code and then finding out
that slabinfo(5) is just about what I wanted, I give it a whirl:
Let's start with the man page:
DESCRIPTION
Frequently used objects in the Linux kernel (buffer heads,
inodes, dentries, etc.) have their own cache. The file
/proc/slabinfo gives statistics. For example:
% cat /proc/slabinfo
slabinfo - version: 1.1
kmem_cache 60 78 100 2 2 1
blkdev_requests 5120 5120 96 128 128 1
mnt_cache 20 40 96 1 1 1
inode_cache 7005 14792 480 1598 1849 1
dentry_cache 5469 5880 128 183 196 1
filp 726 760 96 19 19 1
buffer_head 67131 71240 96 1776 1781 1
vm_area_struct 1204 1652 64 23 28 1
...
size-8192 1 17 8192 1 17 2
size-4096 41 73 4096 41 73 1
...
For each slab cache, the cache name, the number of currently
active objects, the total number of available
objects, the size of each object in bytes, the number of
pages with at least one active object, the total number of
allocated pages, and the number of pages per slab are
given.
Ok, what does it mean for us easy people? Let's say I would like to know
the memory usage of a nifty new packet filter tool like nf-hipac
(www.hipac.org). I suspect a packet filter rule entry to be 64 bytes
because I've read the struct my_cool_fw_packet {};. Good so what do I
do? I load a few rules, let's say 1000 and check the before and the
after status:
bloodyhell:/var/FWTEST/nf-hipac # egrep "size-64 |size-128 |size-256 " < /proc/slabinfo
size-256 7 15 256 1 1 1
size-128 467 510 128 16 17 1
size-64 123 4130 64 3 70 1
bloodyhell:/var/FWTEST/nf-hipac # cat hipac.rules_1000 | sh
bloodyhell:/var/FWTEST/nf-hipac # egrep "size-64 |size-128 |size-256 " < /proc/slabinfo
size-256 7 15 256 1 1 1
size-128 467 510 128 16 17 1
size-64 2123 4130 64 36 70 1
bloodyhell:/var/FWTEST/nf-hipac #
Ok, as you can see, the number of active size-64 objects (cache objects
with a size of 64 bytes) has increased from 123 to 2123. Hmmm, strange,
that is 2000 and not 1000. Let's do it again:
bloodyhell:/var/FWTEST/nf-hipac # egrep "size-64 |size-128 |size-256 " < /proc/slabinfo
size-256 7 15 256 1 1 1
size-128 467 510 128 16 17 1
size-64 2123 4130 64 36 70 1
bloodyhell:/var/FWTEST/nf-hipac # cat hipac.rules_1000 | sh
bloodyhell:/var/FWTEST/nf-hipac # egrep "size-64 |size-128 |size-256 " < /proc/slabinfo
size-256 7 15 256 1 1 1
size-128 467 510 128 16 17 1
size-64 3123 4130 64 53 70 1
bloodyhell:/var/FWTEST/nf-hipac #
Aha, now we have the 1000 slab objects as predicted. Why was it 2000
the first time? After checking the source I found out that they store
the entries in a btree structure. This is of course very nice, and for
the first 1000 entries they needed one leaf each and thus had to
allocate twice the amount of memory.
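To put numbers on the two runs above, here is a minimal sketch of the
arithmetic. The helper names slab_active and slab_growth_bytes are made up
for illustration, and the input is assumed to be in the slabinfo 1.1 layout
shown in the listings:

```shell
# Print the number of active objects for one cache.
# Reads slabinfo text on stdin; $1 is the cache name (first column).
slab_active() {
    awk -v name="$1" '$1 == name { print $2 }'
}

# Bytes added between two snapshots of the same cache.
# Arguments: active_before active_after object_size
slab_growth_bytes() {
    echo $(( ($2 - $1) * $3 ))
}

# Rows copied from the size-64 lines in the transcripts above.
before=$(printf '%s\n' 'size-64 123 4130 64 3 70 1' | slab_active size-64)
after=$(printf '%s\n' 'size-64 2123 4130 64 36 70 1' | slab_active size-64)

slab_growth_bytes "$before" "$after" 64   # first load:  2000 objects, 128000 bytes
slab_growth_bytes 2123 3123 64            # second load: 1000 objects, 64000 bytes
```

On a live box you would feed slab_active from /proc/slabinfo instead of
printf.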
Now let's check if the implementation has some obvious kfree() bug :)
bloodyhell:/var/FWTEST/nf-hipac # egrep "size-64 |size-128 |size-256 " < /proc/slabinfo
size-256 7 15 256 1 1 1
size-128 467 510 128 16 17 1
size-64 3123 4130 64 53 70 1
bloodyhell:/var/FWTEST/nf-hipac # nf-hipac -F
bloodyhell:/var/FWTEST/nf-hipac # egrep "size-64 |size-128 |size-256 " < /proc/slabinfo
size-256 7 15 256 1 1 1
size-128 467 510 128 16 17 1
size-64 123 4130 64 3 70 1
bloodyhell:/var/FWTEST/nf-hipac #
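Nope, it doesn't look like it: after the flush we are back at 123 active
objects. So I'm already very pleased with the implementation. As you can
see, /proc/slabinfo shows you the details of cached objects in the kernel.
If you multiply the second column (active objects) by the fourth column
(object size) per row and sum it up, you get roughly the memory the kernel
currently uses for slab objects. Memory the kernel allocates page-wise
outside the slab caches is not included.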
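The column arithmetic described above can be sketched like this. It is
only an illustration: slab_active_bytes is a made-up name, and the awk
program assumes the slabinfo 1.1 layout from the listings (name, active
objects, total objects, object size, ...):

```shell
# Sum (active objects * object size) over all caches.
# Reads slabinfo text on stdin and skips the "slabinfo - version" header.
slab_active_bytes() {
    awk '$1 != "slabinfo" && NF >= 4 { total += $2 * $4 }
         END { printf "%d\n", total }'
}

# Three rows copied from the first transcript above.
slab_active_bytes <<'EOF'
slabinfo - version: 1.1
size-256 7 15 256 1 1 1
size-128 467 510 128 16 17 1
size-64 123 4130 64 3 70 1
EOF
# prints 69440  (7*256 + 467*128 + 123*64)
```

Against the real file this would be 'slab_active_bytes < /proc/slabinfo'.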
To put a header above the /proc/slabinfo for a second:
1) 2) 3) 4) 5) 6) 7)
------------------------------------------------------
[...]
kmem_cache 58 72 108 2 2 1
ip_fib_hash 13 113 32 1 1 1
tcp_tw_bucket 0 40 96 0 1 1
[...]
1) the name of the cached object
2) the number of active objects
3) the total number of objects
4) size of an object in bytes
5) the number of pages (a page=4kb on x86) with at least one active obj.
6) the total number of allocated pages (a page=4kb on x86)
7) the number of pages per slab
You get pages with '__get_free_pages(gfp_mask, order);', where
gfp_mask =
o GFP_ATOMIC: __GFP_WAIT=0 && __GFP_IO=0
o GFP_KERNEL: __GFP_WAIT=1 && __GFP_IO=1
o GFP_USER : __GFP_WAIT=1 && __GFP_IO=1
(GFP_KERNEL and GFP_USER agree on these two flags; they differ in
allocation priority.)
__GFP_WAIT == 1: the kernel is allowed to discard contents of page
frames in order to free memory
__GFP_IO == 1: the kernel is allowed to write pages to disk in order
to free corresponding page frames.
order = the base-2 logarithm of the number of pages to allocate, i.e.
2^order pages are requested. Example: if order=2, then 2^2 (2**2) = 4
pages will be requested, each with a size of 4kb on x86.
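The order arithmetic is easy to check in the shell. A small sketch, where
order_to_bytes is a made-up helper and the 4096 byte default is the x86
page size mentioned above:

```shell
# Bytes requested by an allocation of the given order.
# $1 = order, $2 = page size in bytes (defaults to 4096, as on x86).
order_to_bytes() {
    echo $(( (1 << $1) * ${2:-4096} ))
}

order_to_bytes 0   # 1 page  -> 4096
order_to_bytes 2   # 4 pages -> 16384
```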
Maybe you've seen kernel messages like '0-order allocation of ...
failed'. Those are printed when such a page allocation doesn't succeed.
I hope this helps you understand it a little bit better.
IFCONFIG=/sbin/ifconfig
ROUTE=/sbin/route
SENDARP=$HA_BIN/send_arp
FINDIF=$HA_BIN/findif
USAGE="usage: $0 ip-address {start|stop|status}";
IP=/sbin/ip
find_interface() {
^^^^^^^^^^^^^^^^^^
This can't work as quoted, you probably cut out too much :). Where is this function? I need it.
ip_stop() {
BASEIP=`echo $1 | sed s'%/.*%%'`
BASEIP="$1"
IF=`find_interface $BASEIP`
IF=`find_interface ${BASEIP%/*}`
if
[ -z "$IF" ]
then
: Requested interface not in use
exit 0
fi
if
[ -x $HA_RCDIR/local_giveip ]
then
$HA_RCDIR/local_giveip $*
fi
Ok.
$ROUTE del -host $BASEIP
Why? Drop that thing.
$IFCONFIG $IF down
${IP} link set dev ${IF} down
ha_log "info: IP Address $BASEIP released"
That's actually not what the above command did!
}
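For reference, the ${BASEIP%/*} form I use in place of the sed call is
plain POSIX parameter expansion. A small sketch; the function names and the
sample values are made up for illustration and are not part of IPaddr:

```shell
# Strip a trailing "/prefixlen" from an address, if there is one.
strip_prefixlen() {
    echo "${1%/*}"
}

# Strip an alias suffix such as ":1" from an interface label.
base_device() {
    echo "${1%:*}"
}

strip_prefixlen 192.168.1.10/24   # prints 192.168.1.10
strip_prefixlen 192.168.1.10      # no slash, printed unchanged
base_device eth0:1                # prints eth0

# The sed variant gives the same result but needs an extra process:
echo 192.168.1.10/24 | sed 's%/.*%%'
```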
ip_start() {
#
# Do we already service this IP address?
#
if
$IFCONFIG | grep "inet addr:$1 " >/dev/null 2>&1
WTF!! Please, whoever wrote this, what about consulting the ifconfig man
page? Without -a this only shows interfaces which are up. You can have
interfaces which are down && have a defined IP address.
Better:
BASEIP="$1"
IF=$(find_interface ${BASEIP%/*})
tmp=$(${IP} addr show to "${BASEIP}" dev ${IF})
[ -n "${tmp}" ]
then
exit 0 # We already own this IP address
fi
if
IFINFO=`find_free_interface $1`
Now what is this? Only the instantiated physical interfaces can be
free. I suspect the author counts eth0:1 as an interface too.
Why not just bloody take one?
then
: OK got interface [$IFINFO] for $1
else
exit 1
fi
Drop it.
IF=`echo "$IFINFO" | cut -f1`
IFEXTRA=`echo "$IFINFO" | cut -f2-`
BASEIP=`echo $1 | sed s'%/.*%%'`
Inconsistent programming: why is BASEIP evaluated so late here, while in
stop it is evaluated first?
if
[ -x $HA_RCDIR/local_takeip ]
then
$HA_RCDIR/local_takeip $*
fi
ha_log "info: ifconfig $IF $BASEIP $IFEXTRA"
$IFCONFIG $IF $BASEIP $IFEXTRA
${IP} addr add ${BASEIP} brd + dev ${IF%:*} label ${IF}
${IP} link set dev ${IF%:*} up
$ROUTE add -host $BASEIP dev $IF
Not needed unless you run a 2.0.x kernel!
TARGET_INTERFACE=`echo $IF | sed 's%:.*%%'`
MACADDR=$($IFCONFIG $TARGET_INTERFACE | \
fgrep $TARGET_INTERFACE | \
sed \
's/^.*HWaddr \(..\):\(..\):\(..\):\(..\):\(..\):\(..\).*$/\1\2\3\4\5\6/')
if [ "${MACADDR:=NULL}" = "NULL" ]; then
ha_log "ERROR: Could not obtain hardware address for $TARGET_INTERFACE"
fi
ha_log "info: Sending Gratuitous Arp for $BASEIP on $IF [$TARGET_INTERFACE]"
Unfixable but I guess it works.
for j in 1 2 3 4 5
do
$SENDARP $TARGET_INTERFACE ${BASEIP} ${MACADDR} ${BASEIP} ffffffffffff \
|| ha_log "ERROR: Could not send gratuitous arp"
sleep 2
done &
}
Ok, this might work.
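If someone wants to tidy up the MAC address extraction, here is a rough
sketch. It assumes the old net-tools ifconfig output with a "HWaddr" field;
mac_from_ifconfig and the sample line are made up for illustration:

```shell
# Print the MAC address as 12 hex digits, without colons.
# Reads ifconfig-style text on stdin; lines without "HWaddr" are ignored,
# so no separate fgrep is needed.
mac_from_ifconfig() {
    sed -n 's/^.*HWaddr \(..\):\(..\):\(..\):\(..\):\(..\):\(..\).*$/\1\2\3\4\5\6/p' |
        head -n 1
}

# Sample line in net-tools format (not taken from a real machine).
sample='eth0      Link encap:Ethernet  HWaddr 00:50:56:AB:CD:EF'
MACADDR=$(printf '%s\n' "$sample" | mac_from_ifconfig)
echo "${MACADDR:-NULL}"   # prints 005056ABCDEF
```

In the script this would be fed by '$IFCONFIG $TARGET_INTERFACE', and an
empty result means no hardware address was found.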
Best regards,
Roberto Nibali, ratz
--
echo '[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq'|dc