Hello,
On Fri, 5 Jan 2001, Thornton Prime wrote:
>
> I have been having some problems restarting apache on servers that are
> using LVS-NAT and was hoping someone had some insight or a workaround.
>
> Basically, when I make a configuration change to my webservers and I try
> to restart them (either with a complete shutdown or even just a graceful
> restart), Apache tries to close all the current connections and re-bind
> to the port. The problem is that invariably it takes several minutes for
> all the current connections to clear even if I kill apache, and the
> server won't start as long as any socket is open on port 80, even if it
> is in a 'CLOSING' state.
Hm, I don't have such problems with Apache. I use the default
configuration-time settings, maybe with only a higher process limit.
Are you sure you are running the latest 2.2 kernels on the real servers?
>
> I'm guessing that my problem is that I am using LVS persistent
> connections, and combined with apache's lingering close this makes it
> difficult for apache to know the difference between a slow connection
> and a dead connection when it tries to close down, so the time it takes
> to clear some of the sockets approaches my LVS persistence time.
>
> I haven't tried turning off persistence, and I haven't tried
> re-compiling apache without lingering-close. This is a production
> cluster with rather heavy traffic and I don't have a test cluster to
> play with. In the end rebooting the machine has been faster than waiting
> for the ports to clear so I can restart apache, but this seems really
> dumb, and doesn't work well because then my cluster machines have
> different configuration states.
One reason for your servers to block can be a very low limit on
the number of clients. You can build apache this way:
CFLAGS=-DHARD_SERVER_LIMIT=2048 ./configure ...
and then increase MaxClients (up to the above limit). Try
different values. And don't play too much with MinSpareServers and
MaxSpareServers. Values near the defaults are preferred. Is your kernel
compiled with a higher value for the number of processes? See:
/usr/src/linux/include/linux/tasks.h
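A quick way to look at both limits, as a rough sketch only. It assumes
the kernel source sits at the path above and that the 2.2 tree names the
limits NR_TASKS and MAX_TASKS_PER_USER (from memory, please verify in
your own tree). The function names are made up for this example, and the
httpd.conf location depends on how your apache was built:

```shell
# Print the process-limit defines from a kernel tasks.h file.
# $1: path to tasks.h (defaults to the path mentioned above).
show_task_limits() {
    grep -E '^#define[[:space:]]+(NR_TASKS|MAX_TASKS_PER_USER)' \
        "${1:-/usr/src/linux/include/linux/tasks.h}"
}

# Print the active (uncommented) MaxClients line from an Apache config.
# $1: path to httpd.conf (no default; it depends on your build).
show_max_clients() {
    grep -iE '^[[:space:]]*MaxClients[[:space:]]' "$1"
}
```

Run show_task_limits with no argument on the real server, and
show_max_clients with the path to your httpd.conf. If MaxClients is
close to the per-user process limit, raising only MaxClients will
probably not help.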
>
> Is there any way anyone knows of to kill the sockets on the webserver
> other than simply wait for them to clear out or rebooting the machine?
> (I tried also taking the interface down and bringing it up again ...
> that didn't work either.)
>
> Is there any way to 'reset' the MASQ table on the LVS machine to force a
> reset?
No way! The masq follows the TCP protocol and it is transparent
to both ends. The expiration timeouts in the LVS/MASQ box are high
enough to allow the connection termination to complete. Do you remove
the real servers from the LVS configuration before stopping the apaches?
This can block the traffic and can delay the shutdown. It seems the
fastest way to restart apache is apachectl graceful, provided, of course,
that you don't change anything in apachectl (in the httpd args).
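To see what is really holding port 80 before and after a restart, a
small helper like the one below may be useful. It is only a sketch: the
function name is invented here, and it assumes the usual Linux
`netstat -tan` layout where the local address is the 4th column and the
state is the 6th.

```shell
# Count TCP sockets on local port 80 by state.
# Reads `netstat -tan` style output on stdin, prints "STATE COUNT" lines.
count_port80_states() {
    awk '$1 ~ /^tcp/ && $4 ~ /:80$/ { n[$6]++ }
         END { for (s in n) print s, n[s] }' | sort
}

# Typical use on a real server (not executed here):
#   netstat -tan | count_port80_states
#   apachectl configtest && apachectl graceful
#   netstat -tan | count_port80_states
```

If the CLOSING or FIN_WAIT counts stay high for minutes after the
restart, that points at the connection termination not completing
rather than at apache itself.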
> thanks in advance?
>
> thornton
Regards
--
Julian Anastasov <ja@xxxxxx>