Hello,
On Fri, 8 Jun 2001, Rief, Jacob wrote:
> I agree that there is nothing the kernel can do, such as the second check.
> But the first one (socket-alive) can be avoided. I'll try to explain how:
>
>
> typical TCP/IP connection
>
> Time Client Server Comment
> | | |
> v |----- SYN isn1 --------->| initiate connection
> |<- SYN isn2 ACK isn1+1 --| this may not pass through the LB
> | |
> ** CHECK HERE: |----- ACK isn2+1 ------->|
> | |
> |<----- ACK, DATA ------->|
> |<----- ACK, DATA ------->|
> .........
> |<----- ACK, DATA ------->|
> | |
> |------ FIN isn3 -------->|
> |<---- ACK isn3+1 ------->|
> |<---- FIN isn4 --------->|
> |------ ACK isn4 -------->|
>
>
> What the LB could do is to check the time difference between "SYN isn1" and
> "SYN isn2 ACK isn1+1". But this packet may not return through the LB when it
> is configured as tunnel or gate. LVS could instead check for "ACK isn2+1".
> If that timer expires, you can bet that the real server
I assume you want LVS-NAT to remember the last SYN+ACK from
each real server and to start a timer that expires if no client packet
with the proper ACK arrives, or if no further SYN+ACK comes from the
RS? And any ACK from the RS is not enough, because we need to be sure
that the RS accepts new connections, and established traffic alone
does not let us assume that this RS is working? So, it seems we only
need to detect SYN+ACKs from each RS? Or do we have to rely on the
client to ACK? If we wait for a specific ACK, do we then remember some
SYN+ACK from the RS? What will happen if this SYN+ACK is dropped
before reaching the client? Or when the client is an attacker?
What will trigger us to add this server again? There is no
new traffic to this server? If the clients stop accessing the service
and/or there is only a small number of connections, this server can
stay idle long enough to trigger this failover. Or can we make an
additional check before setting the weight to 0? What if we remove the
last running real server (all others already have weight 0 because
some guy from the uplink ISP just restarted the router/line)? We can
see that client requests start to come after a long pause, but the
service is down (all real servers are stopped). How will user space
detect that there are client requests before the next application
check 30 seconds later?
> is dead. Then you may set its weight to, let's say, -weight and give the
> responsibility to the monitoring software, which may re-add the real
> server after it comes back.
>
> I think it should not be too difficult to implement this, or am I wrong?
Maybe there is a working solution :) Let's check all the details :)
In the current implementation, we are not even sure what the
real service port is :))) For example, for fwmark-based services. Yes,
this solution will be partial, only for NAT-ed TCP services?
I think you would even prefer not to add any code in user space. For
example, we can put the RS in a mode where, once per second, we try to
schedule one request to such a stopped service, and if we detect an RST
or nothing (host died) within 1 second (we wait for the SYN+ACK), we
can resend this packet to another running RS. This way we can
automatically start and stop NAT-ed real servers without breaking
connections. But these probes will delay some of the requests that we
use for probing. We can ignore the ICMPs from the RS related to our
probes, so maybe we will not need a kernel thread to make these probes.
The checks can be performed in the packet handlers.
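To make the probing idea above concrete, here is a rough user-space
simulation in Python. It is only a sketch of the logic, not kernel code:
the names RealServer, pick_for_syn, on_reply and on_packet_check are
made up for illustration, packets are plain strings, and the 1-second
interval and timeout are taken from the text above.

```python
from dataclasses import dataclass
from typing import Optional

PROBE_INTERVAL = 1.0  # seconds between probes to a stopped RS
PROBE_TIMEOUT = 1.0   # seconds we wait for the SYN+ACK


@dataclass
class RealServer:
    name: str
    stopped: bool = False
    probe_syn: Optional[str] = None      # delayed client SYN used as probe
    probe_sent_at: float = 0.0
    last_probe_at: float = float("-inf")


def pick_for_syn(syn, now, stopped_rs, running_rs):
    """Route a new client SYN; borrow it as a probe if one is due."""
    if (stopped_rs.stopped and stopped_rs.probe_syn is None
            and now - stopped_rs.last_probe_at >= PROBE_INTERVAL):
        stopped_rs.probe_syn = syn
        stopped_rs.probe_sent_at = now
        stopped_rs.last_probe_at = now
        return stopped_rs
    return running_rs


def on_reply(rs, kind):
    """Handle 'SYNACK' or 'RST' for a pending probe.

    Returns the client SYN that must be resent to another RS, or None.
    """
    syn, rs.probe_syn = rs.probe_syn, None
    if syn is None:
        return None
    if kind == "SYNACK":
        rs.stopped = False   # the RS is back, the connection proceeds
        return None
    return syn               # RST: give the request to another RS


def on_packet_check(rs, now):
    """Timeout check run from the packet path (no separate thread)."""
    if rs.probe_syn is not None and now - rs.probe_sent_at >= PROBE_TIMEOUT:
        syn, rs.probe_syn = rs.probe_syn, None
        return syn           # host died: resend the SYN elsewhere
    return None
```

The borrowed SYN is the request that gets delayed by up to one second
when the probe fails, which is the cost mentioned above.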
So, maybe we will need:
- a per-virtual-service flag to activate this game
- one flag per RS telling whether the RS is stopped after missing SYN+ACKs
in a specified period of time (even when the RS is idle?)
- one slot in the RS structure for the delayed client SYN packet used
for the probe
- an RS field for the time when the last packet was sent to the RS
- when there is no specific reply from the RS (SYN+ACK) or we see some
RSTs (not all), we can put the RS in a special mode where we exclude it
from the scheduling (weight=-old_weight) and start to make probes. This
can be triggered even for idle services, and there is no problem in
stopping idle real servers. Maybe we can temporarily put such disabled
RSs back as normally running if the scheduler can't select an active RS.
In such cases, when there is no active RS, we can take our last chance
with such a disabled RS.
- don't rely on the client to send ACKs
- maybe the probes can be packet driven if there is no pending probe
already?
- maybe we have to touch too much of the code :(
- user space still has the right to run application-specific
checks and to alter the weight. User space will not see the
weight changes; the kernel will always print the original weight,
i.e. the probes will run in transparent mode. Or the mode can be
visible (maybe not as a negative weight)?
- something else maybe; this is still only an idea and I have to think
more :)
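The bookkeeping in the list above could look roughly like this. Again
this is a Python sketch and not the IPVS structures: the class and field
names are hypothetical, and the "highest weight wins" choice in
schedule() is only a placeholder for a real scheduler.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RealServer:
    name: str
    weight: int                        # negative: disabled by the probes
    last_sent: float = 0.0             # time the last packet went to the RS
    pending_syn: Optional[bytes] = None  # slot for the delayed client SYN

    @property
    def disabled(self) -> bool:
        return self.weight < 0

    def disable(self) -> None:
        """Exclude from scheduling: weight = -old_weight."""
        if self.weight > 0:
            self.weight = -self.weight

    def enable(self) -> None:
        """Restore the original weight after a successful probe."""
        if self.weight < 0:
            self.weight = -self.weight

    def reported_weight(self) -> int:
        """What user space would see in the transparent mode."""
        return abs(self.weight)


@dataclass
class VirtualService:
    probing: bool = False              # per-service flag for this game
    servers: List[RealServer] = field(default_factory=list)

    def schedule(self) -> Optional[RealServer]:
        """Pick an active RS; fall back to disabled ones as a last chance."""
        active = [rs for rs in self.servers if rs.weight > 0]
        pool = active or [rs for rs in self.servers if rs.disabled]
        if not pool:
            return None
        return max(pool, key=lambda rs: abs(rs.weight))
```

A weight of exactly 0 stays reserved for user space: such a server is
never selected here, not even as the last chance.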
> Jacob
Regards
--
Julian Anastasov <ja@xxxxxx>