
Re: PMTU-D: remember, your load balancer is broken (fwd)

To: Wensong Zhang <wensong@xxxxxxxxxxxx>
Subject: Re: PMTU-D: remember, your load balancer is broken (fwd)
Cc: Kyle Sparger <ksparger@xxxxxxxxxxxxxxxxxxxx>, lvs-users@xxxxxxxxxxxxxxxxxxxxxx
From: Julian Anastasov <uli@xxxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 16 Jun 2000 13:27:28 +0300 (EEST)
        Hello,

On Fri, 16 Jun 2000, Wensong Zhang wrote:

> >     For  VS/NAT demasq  is broken.  We  must restore the
> > original   packet  in  ip_fw_unmasq_icmp  when  called  from
> > icmp_send  (called from ip_forward after the packet mangling
> > and  after 2nd ip_route_input). (I hope nobody complains the
> 
> I am sorry that I forgot to restore the ip header of the mangled packet
> before icmp_send, we have to hook ip_fw_unmasq_icmp to restore packet. You
> are right.

        Yes, in 2.2 the packets must be restored, which is
not good practice, but this can hurt only masq applications
that change data (which may not be fatal, I'm not sure).
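
        For illustration, the 2.2-style header restore can be
sketched in user-space C. This is not the kernel code: struct
conn_entry, struct hdr and restore_header() are hypothetical
names. Only the destination address and port are handled, which
is the easy header case; payload changes made by masq apps are
not undone here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical connection entry: the virtual service the client
 * addressed and the real server the packet was rewritten to. */
struct conn_entry {
    uint32_t vaddr;  /* virtual (original) destination address */
    uint16_t vport;  /* virtual (original) destination port */
    uint32_t raddr;  /* real server (mangled) destination address */
    uint16_t rport;  /* real server (mangled) destination port */
};

/* Hypothetical, reduced header: destination address and port only. */
struct hdr {
    uint32_t daddr;
    uint16_t dport;
};

/* Undo the destination rewrite so that an ICMP error quoting this
 * header shows what the client originally sent.
 * Returns 1 if the header was restored, 0 if it does not belong
 * to this connection entry (the header is then left untouched). */
int restore_header(struct hdr *h, const struct conn_entry *c)
{
    assert(h != NULL && c != NULL);
    if (h->daddr != c->raddr || h->dport != c->rport)
        return 0;
    h->daddr = c->vaddr;
    h->dport = c->vport;
    return 1;
}
```

        A caller would look up the entry for the mangled packet and
call restore_header() before the ICMP error is built. Restoring
rewritten payload would need a saved copy of the data, which is
why the header case is the only simple one.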

> 
> > 576 data bytes are not restored).  There is no such thing in
> > 2.3,  the packet restoring is a  big pain and it was removed
> > but  that  doesn't  means  2.3  looks  correct  when sending
> > ICMP_FRAG_NEEDED  from ip_forward(). We are  not sure if the
> > packet  is  not already  changed  from the  PRE_ROUTING? The
> > result:  rewritten iph->daddr (internal address) is returned
> > in  the ICMP reply.  Is my interpretation  correct? I didn't
> > tested it yet.
> > 
> 
> Yeah, I agree with you. The netfilter probably have problems in calling
> icmp_send for already-mangled packets. I think it is a need to restore ip
> header of the packet before calling icmp_send.

        I think that with Netfilter many things look well
structured, but I'm not sure about the ICMP_FRAG_NEEDED
generation. Packet restoring must be avoided, if possible,
when it involves changes in the protocol data, i.e. it is
not very good for masq apps. I assume this is not planned
in the 2.3 world without masq apps.

> >     I think we have to make two changes:
> > 
> > - ip_vs_dr_xmit(): check mtu and reply
> > 
> > - ip_fw_unmasq_icmp(): we must put only our out_get() check,
> > there  is  no  reason  for the  in_get()  check  (nothing is
> > changed before ip_forward in the masq direction).
> > 
> 
> Yup, we must do the two changes. However, for external_MTU < internal_MTU
> in VS/NAT, the ICMP_FRAG_NEEDED will be sent before mangling the packets,
> so there is no need to do in_get() check, but I am not sure whether other
> ICMP messages will be sent to the real server after the packet is mangled.
> 
> Anyway, I don't think that it hurts if we add in_get checking in
> ip_fw_unmasq_masq for VS/NAT.

        OK, add in_get(). I'm not sure either, because maybe
the data can be extended one day, which would bring us back
to the problem. But right now this is not handled by the
kernel after ip_fw_masquerade, and I think in_get is not
needed. I can't see how the packet can be mangled before
ip_forward: the mtu check is before ip_fw_masquerade. If the
packet is changed, it can't be by LVS (we raise
IPSKB_MASQUERADED). I have to check it because there is a
possible speed problem here. I'm not sure what the rate of
these generations will be. I'm not sure that some MASQ
patches will not break LVS, so it is better to add in_get;
we will see the results. At least, this setup is not common.

        Maybe this problem doesn't look big to the net
folks, I'm not sure.

        I think we have to answer these questions for 2.3:

- should we use header restoring or not? It must be planned
together with the masq apps support, if any, i.e. whether the
data will be changed too. It is very difficult to restore data.
For the header it is easy.

- how can we call each hook from PRE_ROUTING to revert its
header or data changes if each of these hooks returns NF_ACCEPT
instead of NF_STOLEN? It is not possible.

        The result:

- don't try to restore the header from icmp_send

- if something is changed, the hook must return NF_STOLEN
and process the packet itself: routing, mtu check, mangling
and forwarding

- return ICMP_FRAG_NEEDED before mangling. Here is the
problem (not for LVS): we must know the output device mtu
before mangling. But the kernels call ip_forward() when
the packets are ready to send, i.e. after mangling, without a
way to restore them.
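
        The ordering in the last two points can be sketched in
user-space C. This is only a sketch under assumptions, not kernel
or LVS code: struct pkt, struct icmp_report, forward_nat() and
the SKETCH_* verdicts are hypothetical names, and the route
lookup is reduced to an out_mtu argument. The point is that the
mtu check runs while the packet is still unmodified, so the
error quotes the address the client used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced packet: only the fields this sketch needs. */
struct pkt {
    uint32_t saddr;    /* client address */
    uint32_t daddr;    /* destination as the client sent it */
    uint16_t tot_len;  /* total IP length in bytes */
    int      df;       /* non-zero if Don't Fragment is set */
};

/* What the ICMP "fragmentation needed" error would carry. */
struct icmp_report {
    uint32_t quoted_daddr;  /* destination quoted back to the client */
    uint16_t next_hop_mtu;  /* mtu of the output device */
};

/* Hypothetical verdicts for this sketch only. */
enum sketch_verdict { SKETCH_FORWARDED, SKETCH_FRAG_NEEDED };

/* Handle the packet in one place: mtu check first, mangling second.
 * out_mtu stands in for the mtu found by the route lookup. */
enum sketch_verdict forward_nat(struct pkt *p, uint32_t real_server,
                                uint16_t out_mtu, struct icmp_report *rep)
{
    assert(p != NULL && rep != NULL);

    /* The packet is still unmodified here, so nothing must be restored. */
    if (p->df && p->tot_len > out_mtu) {
        rep->quoted_daddr = p->daddr;
        rep->next_hop_mtu = out_mtu;
        return SKETCH_FRAG_NEEDED;
    }

    /* Only now rewrite the destination to the real server. */
    p->daddr = real_server;
    return SKETCH_FORWARDED;
}
```

        With the check placed after the rewrite instead, the quoted
destination would be the internal address, which is the problem
described above for ICMP_FRAG_NEEDED sent from ip_forward().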

        At least, these thoughts don't match the current
packet filter hooks and the packet forwarding.

        But maybe I'm missing something. If the above is
correct, the "ext_mtu > int_mtu" problem can break any
design. LVS has to do these steps itself, ignoring the current
kernel structure. This will improve the speed, though. The
other way is just not to solve this problem. This can be bad
for some guys with external gigabits and many internal
megabits. Is that true?


Regards

--
Julian Anastasov <uli@xxxxxxxxxxxxxxxxxxxxxx>


