Hello,
Yes, this is with the cronjob removed. So yes, I think we are fixed until
the problem shows up again. :) Lets keep our fingers crossed!
Yes, well what remains is:
o check if /32 will solve your problems too, without the proc-fs in your
rc.local
o see if it works reliably for 1 week
o update the analysis part I've started so we can put it into the howto
for further reference.
o explain to everyone what we did, so people can do that at home (yes,
I'm living in a free country where kids can do everything at home :))
o fix your other two 'problems'
btw, you asked earlier how often I was running the cronjob that brought eth2
up and down. At first I was running it 2 times an hour, after the last
problem yesterday I changed it to run every minute.
:) What kind of failover/failback solution is that, when the heartbeat
interface is down every minute for 2 seconds? Well, we seemed to have
cured it a bit.
root@director:~# ./diag2.sh
grep cache /proc/slabinfo
kmem_cache 80 80 244 5 5 1 : 252 126
inet_peer_cache 408 1416 64 23 24 1 : 252 126
ip_dst_cache 18990 22140 192 1008 1107 1 : 252 126
^^^^^
Ok, matches the cache entries.
arp_cache 1650 1650 128 55 55 1 : 252 126
As long as it doesn't exceed the threshold (either gc_thresh2 or
gc_thresh3, I'm not sure anymore) we're safe.
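For reference, here is a rough sketch of how one could check the neighbour
table against those thresholds. It assumes the usual semantics (gc_thresh2
is the soft limit, gc_thresh3 the hard limit) and that the sysctls live
under /proc/sys/net/ipv4/neigh/default on your kernel. classify_arp is a
made-up helper name and is not part of diag2.sh.

```shell
#!/bin/sh
# Sketch only: classify a neighbour (ARP) entry count against the soft
# (gc_thresh2) and hard (gc_thresh3) limits. classify_arp is a
# hypothetical helper, not something from diag2.sh.
classify_arp() {
    count=$1    # current number of neighbour entries
    soft=$2     # value of gc_thresh2
    hard=$3     # value of gc_thresh3
    if [ "$count" -ge "$hard" ]; then
        echo "at or over hard limit"
    elif [ "$count" -ge "$soft" ]; then
        echo "at or over soft limit"
    else
        echo "ok"
    fi
}

# Possible use on a live box (paths assumed, adjust for your kernel):
#   dir=/proc/sys/net/ipv4/neigh/default
#   classify_arp "$(ip -o neigh show | wc -l)" \
#       "$(cat $dir/gc_thresh2)" "$(cat $dir/gc_thresh3)"
```

Note that the count should come from the neighbour table itself; the slab
numbers above count allocated objects and may not map one to one.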
inode_cache 114149 114149 512 16307 16307 1 : 124 62
dentry_cache 116850 116850 128 3895 3895 1 : 252 126
ratz@laphish:~/procps > echo "Usage: $(echo "(116850*128+114149*512)/1024/1024" | bc) MBytes"
Usage: 70 MBytes
ratz@laphish:~/procps >
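The same arithmetic can be done directly on the slabinfo lines instead of
copying the numbers by hand. This is only a sketch: it assumes the column
layout shown above (name, active objects, total objects, object size, ...)
and slab_mbytes is a name I made up for illustration.

```shell
#!/bin/sh
# Sketch only: sum (total objects * object size) for inode_cache and
# dentry_cache from slabinfo-style lines on stdin and print whole MBytes.
# Assumes column 3 is the total object count and column 4 the object
# size in bytes, as in the output pasted above.
slab_mbytes() {
    awk '$1 == "inode_cache" || $1 == "dentry_cache" { bytes += $3 * $4 }
         END { printf "%d\n", bytes / 1024 / 1024 }'
}

# Possible use on a live box:
#   echo "Usage: $(slab_mbytes < /proc/slabinfo) MBytes"
```

Fed with the two lines from the output above it gives the same 70 MBytes.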
names_cache 57 57 4096 57 57 1 : 60 30
fs_cache 228 354 64 6 6 1 : 252 126
files_cache 173 297 416 27 33 1 : 124 62
-------------------------------------------------
ip -o -s route show cache | wc -l
18226
I haven't changed bitmask yet though.
I'm sure you will but let's also hope that this night you can stay with
your wife. In a few hours I will depart to a conference in Luxembourg.
They expect me to tell people a few things, so my replies could once
again get sparse and rare.
Cheers,
Roberto Nibali, ratz
--
echo '[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq'|dc