I have been having problems restarting Apache on servers that are
using LVS-NAT, and was hoping someone had some insight or a workaround.

Basically, when I make a configuration change to my webservers and try
to restart them (either with a complete shutdown or just a graceful
restart), Apache tries to close all the current connections and re-bind
to the port. The problem is that it invariably takes several minutes for
all the current connections to clear, even if I kill Apache, and the
server won't start as long as any socket is open on port 80, even one
that is in the 'CLOSING' state.
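
To see which sockets are still holding port 80, one option is to tally
them by TCP state. The following is only a minimal sketch, assuming a
Linux realserver that exposes /proc/net/tcp and has Python available;
the function name count_states is made up for illustration and is not
part of Apache or LVS.

```python
from collections import Counter

# TCP state codes as they appear in the "st" column of /proc/net/tcp on Linux.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_states(proc_net_tcp_text, port=80):
    """Tally sockets whose local port equals `port`, grouped by TCP state.

    `proc_net_tcp_text` is the raw text of /proc/net/tcp (hypothetical
    helper; the caller is responsible for reading the file).
    """
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore blank or truncated lines
        # fields[1] is "HEXADDR:HEXPORT" for the local end of the socket.
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        if local_port == port:
            code = fields[3].upper()
            counts[TCP_STATES.get(code, code)] += 1
    return counts
```

Calling count_states(open("/proc/net/tcp").read()) before and after
stopping Apache should show whether the leftover sockets are sitting in
CLOSING, FIN_WAIT or some other state, which may help narrow down where
the delay comes from.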

My guess is that the problem comes from using LVS persistent
connections: combined with Apache's lingering close, this makes it
difficult for Apache to tell a slow connection from a dead one when it
tries to shut down, so the time it takes to clear some of the sockets
approaches my LVS persistence timeout.

I haven't tried turning off persistence, and I haven't tried
recompiling Apache without lingering close. This is a production
cluster with rather heavy traffic and I don't have a test cluster to
play with. In the end, rebooting the machine has been faster than waiting
for the ports to clear so I can restart Apache, but this seems really
dumb, and it doesn't work well because my cluster machines then end up
in different configuration states.

Does anyone know of a way to kill the sockets on the webserver other
than simply waiting for them to clear out or rebooting the machine?
(I also tried taking the interface down and bringing it up again ...
that didn't work either.)
Is there any way to 'reset' the MASQ table on the LVS machine to force a
reset?
Thanks in advance,
thornton