Hello,
On Mon, 3 Jul 2000, Chetan Ahuja wrote:
>
>
> On Sun, 2 Jul 2000, Julian Anastasov wrote:
>
> >
> > Hello,
> >
> > On Sat, 1 Jul 2000, Chetan Ahuja wrote:
>
> > > What exactly constitutes "Load" on a real server and how often should it
> > > be measured?
> >
> > Only the LVS user can decide what is "Load" and what
> > is the preferred period to update its values. The different
> > real services change different parameters in the real hosts.
> > We have to allow the user to select which parameters to
> > monitor for each real service.
>
> Yes.. I agree that ultimately we would want to let the user decide as
> many parameters of the scheduling algorithm as possible. (More about
> this later.)
>
>
> > > If yes, what kind of things should be measured as "load". The astute
> > > reader would shoot back immediately with stuff like CPU, memory and
> > > network load. Let's treat them one by one:
> >
> > >
> > > (All of the following is assuming that the realservers are running linux.
> > > At least I'm going to deal with only such cases for now)
> > >
> > > 1) CPU: How good is polling /proc/loadavg? My problem with that is the
> > > load introduced by the measurement itself if polling is done too often.
> >
> > 5 seconds is not fatal.
>
> I think a once every 5 second sampling of load is probably
> not enough. We'd need more frequent sampling. This is where people
> with real experience in running large LVS clusters come in. I would
> really LOVE to hear from people who think that they are not entirely
> satisfied with the current schedulers and would like a
> load-informed scheduler. What kind of applications demand such a
> scheme ? Once we have this information, we could decide what
> would be the best sampling period. (And of course, ideally we should
> make it a runtime configurable parameter as I said before)
Yes, it is a complex task. But you can select weights using
averaged parameters. It is difficult to react to the client
requests, and don't forget about the load generated by local
processes that are not part of the service. I don't think
reducing the interval will help. My experience shows that
selecting an expression built from averaged values works with
the current WRR without stressing some of the real hosts.
And yes, selecting big weights is not good for WRR: some of
the real servers can be flooded with requests. We must follow
this rule when using WRR:
weight < request_rate * update_interval
So, for 50 req/sec and a 5 second interval, don't set a
weight greater than 250. In fact, the problem is in the
different weights: big differences in the weights compared to
the request rate lead to imbalance. Of course, this is with
the current WRR, and I don't see any better method. And there
is a backlog for any TCP service, provided by the kernel.
WRR may schedule more requests to one real server for a short
period if its weight is big, but if the sum of all weights is
less than the number of requests per update interval, the
requests are scheduled according to the weights. You will
need another scheduling method if you want to break the above
rule and use small update intervals that react to the client
requests. Reducing the update interval in the above formula
leads to imbalance when you use real servers with very
different weights.
The result: we need benchmarks from production. The
user can tune the expressions and the update interval
according to the load.
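Just to make the arithmetic of that rule explicit, here is a tiny
sketch (the function name and the numbers are only the example from
above, nothing more):

    # Sketch of the WRR rule above: keep each weight below
    # request_rate * update_interval, otherwise one real server
    # can receive a whole interval's worth of requests.
    def max_wrr_weight(request_rate, update_interval):
        return request_rate * update_interval

    # 50 req/sec and a 5 second interval -> keep weights under 250
    print(max_wrr_weight(50, 5))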
>
> > > Besides, how good is the info in loadavg? Doesn't it just
> > > measure the number of processes in the queue every few milliseconds or
> > > so to calculate the load. One could argue (and many people do argue)
> > > that this is not a good metric of CPU load. Any ideas ??
> >
> > Yes, loadavg is not good for web and the other well-known
> > services. But the user can still run some real services that
> > eat CPU. If the loadavg can be high, the user can select it
> > as a load parameter.
>
> So what would be a better way to get CPU load info. I would like
> to use the /proc interface as much as possible and avoid special
> kernel patches. But that's not an absolute requirement. I'm looking for
> suggestions as to the best metric for CPU activity (averaged, say,
> over the past one second). I'm thinking some combination of
/proc/stat is good; it has the cpu counters:
user/nice/system/idle. You have to compute the averaged
values over the update interval, or maybe over another
interval chosen so that the load peaks are smoothed out.
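A minimal sketch of that averaging, assuming the classic
user/nice/system/idle layout of the first "cpu" line (newer kernels
append more fields, which this simply folds into the total):

    import time

    def cpu_counters():
        # first line of /proc/stat: "cpu user nice system idle ..."
        with open("/proc/stat") as f:
            return [int(v) for v in f.readline().split()[1:]]

    def cpu_idle_percent(interval=5):
        before = cpu_counters()
        time.sleep(interval)
        after = cpu_counters()
        deltas = [a - b for a, b in zip(after, before)]
        total = sum(deltas)
        return 100.0 * deltas[3] / total if total else 0.0  # 4th field = idle

    print("cpu idle over the last 5 seconds: %.1f%%" % cpu_idle_percent(5))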
>
> a) num. of passes through schedule()
> b) num. of interrupts
> c) num. of processes in run queues
> d) some count of how many processes used up their allotted time quantum
> without sleeping on I/O (might indicate CPU intensive work as
> against I/O intensive work)
>
> Comments please...
Hm, very exotic parameters, but they may be needed.
For web I use:
- a 5 second interval
- cpu idle (over all CPUs) in percent, averaged over the last
5 seconds
- free memory in megabytes
- WRR with weights < 200
- I can build a kernel on the host while it is used as a real
server for the web. Does this sound good? :) This is not
possible with WLC.
I haven't tested ftp, but maybe an expression built from these
would work:
- the number of processes
- network packets/bytes (per interface?)
Any people in production?
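To make the web expression above a bit more concrete, here is a rough
sketch; the formula, the coefficients and the low-memory threshold are
only illustrative guesses, not the exact expression I use:

    def free_mb():
        # "MemFree:" line of /proc/meminfo is reported in kB
        with open("/proc/meminfo") as f:
            for line in f:
                if line.startswith("MemFree:"):
                    return int(line.split()[1]) // 1024
        return 0

    def wrr_weight(cpu_idle_pct, base_weight=200, min_free_mb=32):
        # scale the configured weight by averaged cpu idle and
        # penalize the host hard when free memory runs low
        weight = int(base_weight * cpu_idle_pct / 100.0)
        if free_mb() < min_free_mb:
            weight = max(weight // 4, 1)
        return max(min(weight, base_weight), 1)

    # cpu_idle_pct would come from the /proc/stat sampling shown earlier
    print(wrr_weight(cpu_idle_pct=85.0))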
>
> > > 2) Memory: We could just do what free command does (which is
> > > just reading /proc/meminfo). Is that good enough. Anybody see any
> > > pitfalls in that approach? Of course, polling /proc too often is again
> > > a problem here. Besides that ??
> >
> > Yes, you can create many load parameters from
> > /proc/meminfo. Even "Cached" and "Buffers" are useful. And
> > sometimes it is faster to open and read from /proc than to
> > read the parameters one by one using other interfaces.
>
> Seems like we have all the info we need for memory from /proc/meminfo
Yes
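A small sketch of pulling several of those values in one read (the
kernel reports them in kilobytes; the field list is only an example):

    def meminfo(wanted=("MemFree", "Buffers", "Cached")):
        values = {}
        with open("/proc/meminfo") as f:
            for line in f:
                name = line.split(":")[0]
                if name in wanted:
                    values[name] = int(line.split()[1])  # kB
        return values

    print(meminfo())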
>
> > > 3) Network: This is the hardest one. What would be a good metric of
> > > network load... number of alive TCP connections?? Is that good
> > > enough... I'm not deeply familiar with the kernel networking code. Could
> > > somebody who is more familiar throw some more light on this....
> >
> > You can try with /proc/net/dev. There are bytes and
> > packets for each interface but the drawback is that they are
> > sometimes zeroed and the interfaces sometimes disappear :)
>
> Well, yes, num. of dropped packets might give us an indication that
> the networking load is heavy (which is a good thing to know if it's
> happening) but the numbers in /proc/net/dev are cumulative numbers since
> the interface was brought up and it may not have any relation to the
> current situation ( but yes we could do some simple math to extract the
> numbers for last second or whatever)
Yes, and this is not difficult. Just create a parameter
such as cpuidle5 or numpackets5 and do some calculations to
support it. Some of the values are not useful in raw form.
And the world is not perfect: there are different clients,
and some of the server processes can be delayed or can add a
different load. Cpuidle and freemem below 20% are not very
good, and after this point we have to add more real servers
to the cluster.
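For a numpackets5 style parameter the calculation is just a difference
of the cumulative counters, for example (interface name and interval
are only examples, and the 2.2+ /proc/net/dev column layout is
assumed):

    import time

    def packet_count(iface="eth0"):
        # /proc/net/dev: "iface: rx_bytes rx_packets ... tx_bytes tx_packets ..."
        with open("/proc/net/dev") as f:
            for line in f:
                if ":" not in line:
                    continue
                name, data = line.split(":", 1)
                if name.strip() == iface:
                    fields = data.split()
                    return int(fields[1]) + int(fields[9])  # rx + tx packets
        return 0

    def numpackets(iface="eth0", interval=5):
        before = packet_count(iface)
        time.sleep(interval)
        return packet_count(iface) - before

    print("packets in the last 5 seconds: %d" % numpackets())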
> I was thinking more in terms of TCP connection overhead and
> all the costs associated with that... Since most likely
> use of LVS is as web servers, proxies etc., TCP load is probably
> the most important issue here. At least that's what I've come up
> with so far. I am really looking for comments from the "experts" on
> this one...
Hm, measuring the kernel load is very difficult,
and I don't think it is needed. Which TCP parameters do you
need? Maybe some from /proc/net/snmp? To be honest, it seems
I don't need to react to the client's requests; keeping the
load below 80% makes that unnecessary. The problem is the
load generated by processes that are not part of the service.
Including the cpuidle and freeram in the expression is
mandatory in this case and solves all the problems. Maybe the
net packets too.
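If a TCP parameter does turn out to be useful, something like the
number of currently established connections can be read from
/proc/net/snmp; only a sketch, the field name is looked up in the
header line so the column position does not matter:

    def tcp_snmp(field="CurrEstab"):
        # /proc/net/snmp comes as header/value line pairs per protocol
        with open("/proc/net/snmp") as f:
            lines = f.readlines()
        for header, values in zip(lines[0::2], lines[1::2]):
            if header.startswith("Tcp:"):
                names = header.split()[1:]
                nums = values.split()[1:]
                return int(dict(zip(names, nums))[field])
        return 0

    print("established TCP connections: %d" % tcp_snmp())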
>
>
> > The list is a good place for such discussion :)
> >
> > More ideas:
> >
> > - use ioctls to add/delete LVS services/destinations
> >
> > - use all kinds of virtual services, forwarding methods and
> > scheduling methods (configured from the user). IOW, all LVS
> > features.
> >
> > - user space tool to manage the config file and the network
> > interfaces/routes/settings. For example:
> >
> > <tool> start <domain> send gratuitous ARP, set ifaces, etc
> > <tool> stop <domain> stop ifaces, etc
> > <tool> secondary <domain> role: director -> real server
> > <tool> primary <domain> role: real server -> director
> >
> > - call scripts to play with policy routing and other kernel
> > settings, etc.
> >
> > - support for backup directors working as real servers
>
>
> These are all nice TODO items but I'm afraid I'll have just
> enough time to focus only on the load-informed scheduling for now.
OK
Yes, the list can grow :) It is preferable for the load
monitoring to work together with the virtual/real service
management, but maybe it is better to keep these modules
separated, who knows. We need other opinions, maybe from
someone using similar expressions and with other golden
rules :)
>
> Thanks for the considered reply...
> Chetan
Regards
--
Julian Anastasov <uli@xxxxxxxxxxxxxxxxxxxxxx>