On Wed, Jul 01, 2020 at 06:17:19PM +0300, Julian Anastasov wrote:
> YangYuxi is reporting that connection reuse
> is causing a one-second delay when a SYN hits
> an existing connection in TIME_WAIT state.
> Such a delay was added to give time to expire
> both the IPVS connection and the corresponding
> conntrack. This was considered a rare case
> at that time, but it is causing problems for
> some environments such as Kubernetes.
>
> As nf_conntrack_tcp_packet() can decide to
> release the conntrack in TIME_WAIT state and
> to replace it with a fresh NEW conntrack, we
> can use this to allow rescheduling just by
> tuning our check: if the conntrack is
> confirmed, we cannot schedule it to a different
> real server and the one-second delay still
> applies, but if a new conntrack was created,
> we are free to select a new real server without
> any delays.
>
> YangYuxi lists some of the problem reports:
>
> - One second connection delay in masquerading mode:
> https://marc.info/?t=151683118100004&r=1&w=2
>
> - IPVS low throughput #70747
> https://github.com/kubernetes/kubernetes/issues/70747
>
> - Apache Bench can fill up ipvs service proxy in seconds #544
> https://github.com/cloudnativelabs/kube-router/issues/544
>
> - Additional 1s latency in `host -> service IP -> pod`
> https://github.com/kubernetes/kubernetes/issues/90854
>
> Fixes: f719e3754ee2 ("ipvs: drop first packet to redirect conntrack")
> Co-developed-by: YangYuxi <yx.atom1@xxxxxxxxx>
> Signed-off-by: YangYuxi <yx.atom1@xxxxxxxxx>
> Signed-off-by: Julian Anastasov <ja@xxxxxx>

Thanks, this looks good to me.

Reviewed-by: Simon Horman <horms@xxxxxxxxxxxx>

Pablo, could you consider applying this to nf-next?
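
For readers following along, the check described in the quoted changelog can be modelled roughly as below. This is only a standalone sketch of the decision as I read it from the description, not the code in the patch: the names decide_reuse, enum reuse_action and the boolean parameters are hypothetical, and the real code works on the skb and the IPVS connection rather than on flags.

```c
#include <stdbool.h>

/* Hypothetical outcome of handling a packet that matches an existing
 * IPVS connection. None of these names come from the kernel sources. */
enum reuse_action {
	REUSE_KEEP,            /* keep using the existing connection */
	REUSE_RESCHEDULE_NOW,  /* select a new real server without delay */
	REUSE_DROP_AND_DELAY   /* old behaviour: the one-second delay applies */
};

/* Model of the tuned check from the changelog:
 * - only a SYN hitting a connection in TIME_WAIT is a reuse candidate;
 * - if the packet's conntrack is confirmed (still the old entry), it
 *   cannot be scheduled to a different real server, so the delay stays;
 * - if conntrack replaced the old entry with a fresh NEW (unconfirmed)
 *   one, a new real server can be selected immediately. */
enum reuse_action decide_reuse(bool is_syn, bool conn_in_time_wait,
			       bool ct_confirmed)
{
	if (!is_syn || !conn_in_time_wait)
		return REUSE_KEEP;

	if (ct_confirmed)
		return REUSE_DROP_AND_DELAY;

	return REUSE_RESCHEDULE_NOW;
}
```

With this model, decide_reuse(true, true, false) gives REUSE_RESCHEDULE_NOW, which is the case the changelog says no longer needs to wait.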