ilclaudio@xxxxxxxxxxx wrote:
>
> Yes, during our tests we've seen that there are some problems on real
> servers.
> When the number of requests redirected to each server exceeds a certain
> threshold, the sustained rate of the system suddenly collapses.
> When this happens the CPU utilization of the system is almost 100%, the
> real server is overloaded and it seems to be waiting for system resources
> to be freed.
Are these Linux realservers? Which kernel?
> We've increased the number of per-process file descriptors, the total
> number of file descriptors, the ip port range, and so on... but the problem
> still remains.
> We've found an old Internet-draft: "Avoiding the TCP_WAIT state at busy
> servers" (T. Faber, J. Touch, W. Yue).
Found it. They're modifying the kernel, specifically the client kernel,
which isn't going to help you.
> It deals with the problem of the accumulation of connections in the
> "TCP_WAIT" state and it explains that "servers that have many TCP
> connections in TIME_WAIT state experience performance degradation, and
> can collapse".
> In this document they present the results of their tests showing a 50%
> improvement of HTTP connection rate.
> This is why we are trying to reduce the number of connections in that
> state, not only for aesthetics :-)
> I'm not sure this is the problem, but it could be.
People have been operating highly loaded realservers with LVS for years
and we haven't heard of this problem, so I'm just trying to get a handle
on the matter.
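
One way to check whether TIME_WAIT really is piling up when the collapse
happens is to count sockets by state on the realserver. A minimal sketch,
assuming the realservers are Linux and expose /proc/net/tcp (the function
name count_tcp_states is made up for illustration; the state codes are the
ones Linux prints in the "st" column):

```python
from collections import Counter

# Hex codes from the "st" column of /proc/net/tcp on Linux.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}

def count_tcp_states(text):
    """Count sockets per TCP state, given the text of /proc/net/tcp."""
    counts = Counter()
    for line in text.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore blank or truncated lines
        code = fields[3].upper()
        # Unknown codes are kept as-is rather than dropped.
        counts[TCP_STATES.get(code, code)] += 1
    return counts
```

On a realserver you would feed it open("/proc/net/tcp").read() and watch
how the TIME_WAIT count moves as the request rate approaches the threshold.
If it stays flat while the CPU goes to 100%, the problem is probably
somewhere else.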
Modifying tcpip parameters is beyond me. Possibly you could just halve the
timeout value in the kernel (I don't know where it is, but you should be
able to find it).

I have an unsolved problem with a newscache program that sits in TIME_WAIT
rather than exiting.
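
On halving the timeout: a rough way to see what it would buy is that, in
steady state, the number of sockets in TIME_WAIT is about the rate at which
the machine closes connections times the length of the TIME_WAIT period.
The numbers below are assumptions for illustration only (1000 closes per
second, a 60 second timeout), not measurements from your setup, and only
the end that does the active close accumulates TIME_WAIT sockets.

```python
def steady_state_time_wait(closes_per_sec, time_wait_secs):
    """Little's law: sockets in TIME_WAIT = close rate * time spent there."""
    return closes_per_sec * time_wait_secs

# Hypothetical load: the realserver actively closes 1000 connections/s.
print(steady_state_time_wait(1000, 60))  # 60000
print(steady_state_time_wait(1000, 30))  # 30000
```

So halving the timeout would halve the backlog, but it does not change the
rate at which new TIME_WAIT entries are created.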
Joe
--
Joseph Mack PhD, High Performance Computing & Scientific Visualization
SAIC, Supporting the EPA Research Triangle Park, NC 919-541-0007
Federal Contact - John B. Smith 919-541-1087 - smith.johnb@xxxxxxx