Hi
On Fri, 2008-10-31 at 14:57 -0700, Robinson, Eric wrote:
> > What sort of packet throughput are you getting?
>
> How would you like that measured?
Packets/sec in and packets/sec out on the director is usually a good
bet :)
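If you don't have a tool to hand, here's a rough sketch of one way to get those numbers on a Linux director: sample the per-interface packet counters in /proc/net/dev twice and divide by the elapsed time. The function names and the 10 second sampling window are just my choices for illustration, and this counts all traffic on each interface, not only LVS traffic.

```python
import time


def parse_proc_net_dev(text):
    """Return {interface: (rx_packets, tx_packets)} from /proc/net/dev text."""
    counters = {}
    for line in text.splitlines()[2:]:  # the first two lines are headers
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 10:
            continue
        # Field 1 is received packets, field 9 is transmitted packets.
        counters[name.strip()] = (int(fields[1]), int(fields[9]))
    return counters


def packet_rates(before, after, seconds):
    """Return {interface: (packets/sec in, packets/sec out)}."""
    rates = {}
    for name, (rx1, tx1) in after.items():
        if name in before:
            rx0, tx0 = before[name]
            rates[name] = ((rx1 - rx0) / seconds, (tx1 - tx0) / seconds)
    return rates


def sample(interval=10.0, path="/proc/net/dev"):
    """Read the counters, wait `interval` seconds, read again, return rates."""
    with open(path) as f:
        before = parse_proc_net_dev(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_proc_net_dev(f.read())
    return packet_rates(before, after, interval)
```

Calling sample() on the director and printing the result for the inside and outside interfaces should give a usable first estimate.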
> > Are you using LVS-DR or LVS-NAT?
>
> LVS-NAT
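Right... NAT makes the CPU work harder than DR because the director has
to rewrite packets in both directions; with DR the replies go straight
from the realservers to the clients and never touch the director. If
that needs more detail, say so, and I'll explain further.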
> Aside from running heartbeat and ldirectord with 100+ virtual servers,
> not too much. Here's the output from top:
>
> top - 13:43:47 up 81 days, 9:12, 1 user, load average: 1.40, 1.42,
> 1.38
> Tasks: 60 total, 1 running, 59 sleeping, 0 stopped, 0 zombie
> Cpu(s): 46.8% us, 3.0% sy, 0.0% ni, 48.8% id, 0.0% wa, 1.3% hi,
> 0.0% si
> Mem: 516304k total, 506348k used, 9956k free, 45448k buffers
> Swap: 1048568k total, 4k used, 1048564k free, 369656k cached
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 2762 root 17 0 13708 9884 1744 S 50.4 1.9 13386:29 ldirectord
Whoa there, horsey!
81 days uptime is 116640 minutes, and ldirectord's TIME+ above is about
13386 minutes; that means ldirectord has consumed > 10% of the CPU in
the time the server's been up. What's the health check interval here?
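For anyone who wants to check that arithmetic, here is a small sketch using the TIME+ and uptime figures from the top output quoted above. The helper names are made up for illustration, and I'm assuming TIME+ is in top's usual minutes:seconds form.

```python
def parse_top_time(time_plus):
    """Convert a top TIME+ value such as '13386:29' (minutes:seconds) to minutes."""
    minutes, seconds = time_plus.split(":")
    return int(minutes) + float(seconds) / 60.0


def cpu_share(time_plus, up_days, up_hours=0, up_minutes=0):
    """Fraction of one CPU a process has used over the machine's whole uptime."""
    uptime_minutes = up_days * 24 * 60 + up_hours * 60 + up_minutes
    return parse_top_time(time_plus) / uptime_minutes


# Figures from the quoted top output: TIME+ 13386:29, up 81 days.
share = cpu_share("13386:29", up_days=81)
print(f"ldirectord share of CPU since boot: {share:.1%}")
```

Adding the extra 9 hours 12 minutes of uptime lowers the figure slightly, but it stays above 10%.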
With (say) 100 virtual servers, 2 realservers each, an interval of 10
seconds means 200 checks every ten seconds (nominally). Assuming a 0.1
second latency for each check, that's 20 seconds of checking if the
checks run one after another, so you're talking overlapping checks: a
given check run is only half way through when it's due to start again.
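To make that estimate easy to redo with real numbers, here's a minimal sketch. The 100 x 2 layout, the 0.1 second latency and the 10 second interval are the guesses from above, not measured values, and it assumes the checks run strictly one after another, which may not match how ldirectord actually schedules them.

```python
def check_load(virtual_servers, realservers_each, check_latency, interval):
    """Estimate health-check load, assuming checks run one after another.

    Returns (checks per interval, seconds needed for one full pass,
    ratio of pass time to interval). A ratio above 1.0 means a pass
    cannot finish before the next one is due.
    """
    checks = virtual_servers * realservers_each
    pass_seconds = checks * check_latency
    return checks, pass_seconds, pass_seconds / interval


# Guessed figures: 100 virtual servers, 2 realservers each,
# 0.1 s per check, 10 s check interval.
checks, pass_seconds, ratio = check_load(100, 2, 0.1, 10)
print(f"{checks} checks, {pass_seconds:.0f} s per pass, ratio {ratio:.1f}")
```

With those guesses a full pass takes about twice the interval, so either the interval goes up, the per-check latency comes down, or the checks have to run in parallel.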
I can see some tuning being required here - or trying to make ldirectord
fork and thread correctly (if it doesn't already). Horms, can you
comment here?
Graeme