Hello,
On Fri, 16 Jun 2000, Wensong Zhang wrote:
> > For VS/NAT demasq is broken. We must restore the
> > original packet in ip_fw_unmasq_icmp when called from
> > icmp_send (called from ip_forward after the packet mangling
> > and after 2nd ip_route_input). (I hope nobody complains that the
>
> I am sorry that I forgot to restore the ip header of the mangled packet
> before icmp_send; we have to hook ip_fw_unmasq_icmp to restore the packet.
> You are right.
Yes, in 2.2 the packets must be restored, which is
not good practice, but this can hurt only masq applications
that change data (which may not be fatal, I'm not sure).
>
> > 576 data bytes are not restored). There is no such thing in
> > 2.3, the packet restoring is a big pain and it was removed
> > but that doesn't mean 2.3 looks correct when sending
> > ICMP_FRAG_NEEDED from ip_forward(). Can we be sure the
> > packet is not already changed from PRE_ROUTING? The
> > result: the rewritten iph->daddr (internal address) is returned
> > in the ICMP reply. Is my interpretation correct? I haven't
> > tested it yet.
> >
>
> Yeah, I agree with you. Netfilter probably has problems in calling
> icmp_send for already-mangled packets. I think there is a need to restore
> the ip header of the packet before calling icmp_send.
I think that with Netfilter many things look well
structured, but I'm not sure about the ICMP_FRAG_NEEDED
generation. Packet restoring must be avoided, if possible,
when it involves changes in the protocol data, i.e. it is
not very good for masq apps. I assume this is not planned
in the 2.3 world without masq apps.
> > I think we have to make two changes:
> >
> > - ip_vs_dr_xmit(): check mtu and reply
> >
> > - ip_fw_unmasq_icmp(): we must put only our out_get() check,
> > there is no reason for the in_get() check (nothing is
> > changed before ip_forward in the masq direction).
> >
>
> Yup, we must make the two changes. However, for external_MTU < internal_MTU
> in VS/NAT, the ICMP_FRAG_NEEDED will be sent before mangling the packets,
> so there is no need to do the in_get() check, but I am not sure whether other
> ICMP messages will be sent to the real server after the packet is mangled.
>
> Anyway, I don't think that it hurts if we add in_get checking in
> ip_fw_unmasq_icmp for VS/NAT.
OK, let's add in_get(). I'm not sure either, because maybe
the data can be extended one day, which would return us
to the problem. But for now this is not handled by the kernel
after ip_fw_masquerade, and I think in_get is not needed. I
can't see how the packet can be mangled before ip_forward;
the mtu check is before ip_fw_masquerade. If the packet is
changed, it can't be by LVS (we raise IPSKB_MASQUERADED). I
have to check it, because there is a possible speed problem
here. I'm not sure what the rate of these ICMP generations
will be. I'm not sure whether some MASQ patches will
break LVS, so better to add in_get and we will see the
results. At least, this setup is not common.
Maybe this problem doesn't look big to the net
folks, I'm not sure.
I think we have to answer these questions for 2.3:
- should we use header restoring or not? It must be planned
together with the masq apps support, if any, i.e. whether
the data will be changed too. It is very difficult to
restore data; for the header it is easy.
- how can we call each hook from PRE_ROUTING to revert its
header or data changes if each such hook returns NF_ACCEPT
instead of NF_STOLEN? It is not possible.
The result:
- don't try to restore the header from icmp_send
- if something is changed, the hook must return NF_STOLEN
and process the packet itself: routing, mtu check, mangling
and forwarding
- return ICMP_FRAG_NEEDED before mangling. Here is the
problem (not for LVS): we must know the output device mtu
before mangling. But the kernels call ip_forward() when
the packets are already prepared for sending, i.e. after
mangling, without a way to restore them.
At least, these thoughts don't fit the
current packet filter hooks and the packet forwarding.
But maybe I'm missing something. If the above is
correct, the "ext_mtu > int_mtu" problem can break any
design. LVS would have to do these steps itself, ignoring
the current kernel structure. This will improve the speed,
though. The other way is simply not to solve this problem,
which can be bad for some people with external gigabit links
and many internal megabit links. Is that true?
Regards
--
Julian Anastasov <uli@xxxxxxxxxxxxxxxxxxxxxx>