Thanks for the script. I will try to catch the spike when it recurs in about 10 hours and run your script against it. The LVS is moving into production gradually, so it does not carry a full load yet; the most connections we have seen so far is 146.
Thanks again.
Mike Radomski
SUNY - ITEC
Information Technology Exchange Center
Systems Programmer/Analyst
E-mail: Mike.Radomski@xxxxxxxxxxxxxxxxxx
Systems E-Mail: scsys@xxxxxxxxxxxxxxxxxx
Phone: (716)878-4832
Cellular: (716)866-7039
Fax: (716)878-4235
-----Original Message-----
From: Paul Lantinga [mailto:prl@xxxxxx]
Sent: Wednesday, March 13, 2002 2:20 PM
To: 'lvs-users@xxxxxxxxxxxxxxxxxxxxxx'
Subject: RE: CPU Spike every 10 hours
> On Wed, 13 Mar 2002, Radomski, Mike wrote:
>
> > Ok,
> > It is now doing it on my primary system. Here is the top output.
> >
> > Mike Radomski
>
> That is just about the goll-darndest thing that I've seen in a while!
Indeed.
Mike, for next steps, I would take a look at what LVS and the host NICs
are doing while the spike is happening.
Heck, pull out all the tools - check /var/log/messages, check dmesg,
check to see if anything crazy is happening with ipchains or iptables,
see if it happens when you are only logged on the console instead of
over ssh, run chkconfig --list to see if anything bizarro is going on...
For general networking, ifconfig and netstat -s|-a should tell you some
basics. Even better would be to have mrtg poll your lvs server for the
NIC stats.
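If mrtg is not set up yet, a rough stand-in is to read the interface
counters twice and work out packets per second yourself. Below is a
minimal Python sketch, assuming a Linux /proc/net/dev layout. The
function names are made up for illustration, and the optional wrap
value (for example 2**32) is only an assumption about 32-bit counters
rolling over on this kind of box.

```python
import time


def parse_proc_net_dev(text):
    """Return {iface: (rx_packets, tx_packets)} parsed from /proc/net/dev text."""
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # header lines carry no "iface:" prefix
        name, _, rest = line.partition(":")
        fields = rest.split()
        # Field 1 is RX packets and field 9 is TX packets in this layout.
        if len(fields) < 10 or not (fields[1].isdigit() and fields[9].isdigit()):
            continue
        counters[name.strip()] = (int(fields[1]), int(fields[9]))
    return counters


def packet_rates(before, after, seconds, wrap=None):
    """Return {iface: (rx_pps, tx_pps)} between two counter snapshots.

    wrap is an assumed counter modulus (e.g. 2**32); leave it as None
    if the counters are not expected to roll over between samples.
    """
    rates = {}
    for iface, (rx1, tx1) in after.items():
        if iface not in before:
            continue
        rx0, tx0 = before[iface]
        drx, dtx = rx1 - rx0, tx1 - tx0
        if wrap:
            drx, dtx = drx % wrap, dtx % wrap
        rates[iface] = (drx / seconds, dtx / seconds)
    return rates


def sample_rates(seconds=2, path="/proc/net/dev", wrap=None):
    """Read the counters twice, `seconds` apart, and return the rates."""
    with open(path) as f:
        before = parse_proc_net_dev(f.read())
    time.sleep(seconds)
    with open(path) as f:
        after = parse_proc_net_dev(f.read())
    return packet_rates(before, after, seconds, wrap)
```

Calling sample_rates(2) on the director returns one (rx, tx) pair per
interface, which can be compared against the InPPS and OutPPS figures
that ipvsadm reports.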
This is the script that I use on my lvs server to get a feel for what
it's up to:
#!/bin/tcsh
# Redraw the LVS tables, counters, rates, and NIC stats every 2 seconds.
while (1)
    clear
    ipvsadm -L
    ipvsadm -L --stats
    ipvsadm -L --rate
    netstat -i
    sleep 2
end
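Since the spike only turns up every 10 hours or so, it may be easier
to append timestamped snapshots to a file than to watch the screen. A
rough Python sketch of that idea follows, assuming ipvsadm and netstat
are on the PATH and the script runs as root. The function names and
the log path are hypothetical.

```python
import subprocess
import time
from datetime import datetime

# The same commands the interactive loop runs.
COMMANDS = [
    ["ipvsadm", "-L"],
    ["ipvsadm", "-L", "--stats"],
    ["ipvsadm", "-L", "--rate"],
    ["netstat", "-i"],
]


def snapshot(commands=COMMANDS):
    """Run each command once and return one timestamped block of text."""
    parts = ["=== " + datetime.now().strftime("%Y-%m-%d %H:%M:%S") + " ==="]
    for cmd in commands:
        parts.append("$ " + " ".join(cmd))
        try:
            result = subprocess.run(
                cmd,
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,
                universal_newlines=True,
            )
            parts.append(result.stdout.rstrip())
        except OSError as exc:
            # Command missing or not executable: note it and carry on.
            parts.append("(could not run: %s)" % exc)
    return "\n".join(parts) + "\n"


def log_forever(path="/tmp/lvs-watch.log", interval=2):
    """Append a snapshot to `path` every `interval` seconds until killed."""
    while True:
        with open(path, "a") as log:
            log.write(snapshot())
        time.sleep(interval)
```

Start log_forever() before the spike is due, then search the log for
the timestamps around the time top showed the load. A longer interval
keeps the file smaller over a 10-hour wait.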
For perspective, when top on my lvs box shows your kind of load...
  2:02pm  up 11 days, 21:22,  5 users,  load average: 0.74, 0.72, 0.55
36 processes: 33 sleeping, 3 running, 0 zombie, 0 stopped
CPU states:  0.0% user, 88.6% system,  0.0% nice, 11.3% idle
Mem:   255360K av,  247204K used,    8156K free,       0K shrd,   24152K buff
Swap:  522072K av,     492K used,  521580K free                  104384K cached
  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
    3 root      19  19     0    0     0 RWN  26.8  0.0 837:52 ksoftirqd_CPU0
31950 root       9   0   876  784   700 S     2.9  0.3  36:13 heartbeat
My ipvsadm script shows this:
IP Virtual Server version 1.0.0 (size=524288)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  http-vip:http wlc
  -> iis5:http                    Masq    6      111        27
  -> nt4-iis1:http                Masq    6      42         1899
  -> secondapache:http            Masq    6      132        35
  -> flog:http                    Masq    6      59         1376
IP Virtual Server version 1.0.0 (size=524288)
Prot LocalAddress:Port               Conns   InPkts  OutPkts  InBytes OutBytes
  -> RemoteAddress:Port
TCP  http-vip:http                27669747    1445M    1924M   86295M    2742G
  -> iis5:http                    49154235    2062M    2945M     132G    4190G
  -> nt4-iis1:http                46941318    1814M    2604M   92080M    3639G
  -> secondapache:http            49401039    2640M    3656M     168G    5136G
  -> flog:http                     6785585  330472K  447713K   20611M     643G
IP Virtual Server version 1.0.0 (size=524288)
Prot LocalAddress:Port                 CPS    InPPS   OutPPS    InBPS   OutBPS
  -> RemoteAddress:Port
TCP  http-vip:http                     319    15524    20768   972668 29612824
  -> iis5:http                           4     5578     8206   334174 12114227
  -> nt4-iis1:http                     168     1793     1875   120785  2134719
  -> secondapache:http                  10     6074     8240   371529 12116796
  -> flog:http                         137     2079     2447   146181  3247080
Kernel Interface table
Iface   MTU Met      RX-OK RX-ERR RX-DRP RX-OVR      TX-OK TX-ERR TX-DRP TX-OVR Flg
bond0  1500   0 3109547635      4      0      0 1856882784      0      0    556 BMmRU
bond0  1500   0   - no statistics available -                                   BMmRU
eth1   1500   0 3813245747      4      0      0  330898319      0      0    132 BMsU
eth2   1500   0 1203035158      0      0      0 3523782935      0      0    114 BMsRU
eth3   1500   0 2334981058      0      0      0 3296062471      0      0    307 BMsRU
eth4   1500   0   53252968      0      0      0 3296073654      0      0      3 BMsRU
eth6   1500   0 1887853198      0      0      0 3111776691      0      0      0 BMRU
eth6:  1500   0   - no statistics available -                                   BMRU
-paul.