On 2010-03-29 06:18, Simon Horman wrote:
> On Sun, Mar 28, 2010 at 12:31:20PM +0200, svensven wrote:
>> 1. Should the ip_vs_conn_in_get() function also take fwmark into
>> consideration when matching incoming packets to its list of
>> established ipvs connections?
>
> I suspect not, as the connection table doesn't include fwmark
> information. And I think that there ought to be a simpler resolution
> to your problem than refactoring connection table entries.
That is fair.
>> 2. Is this the right way of setting up a two-node LVS setup with
>> localnodes and connection synchronization on a modern kernel?
>> (Assuming the conn sync would not break)
>
> I think that you could get around this problem by only activating
> the LVS rules on the master-node. Or is that already the case?
Hm, that is not the case. The LVS rules are active on both nodes all
the time. Do you mean it would make sense to add ipvsadm -A|E|D rules
to the notify_{master,backup} scripts, so that the config is altered
upon state change, instead of keeping it statically in keepalived.conf?
Before I try this, see my comment on the next point too.
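A minimal sketch of what such a notify script could look like, assuming the fwmark service 10 with the rr scheduler and the real server 10.0.0.6:9999 shown in this thread. The script name, the DRY_RUN variable and the function names are hypothetical, and the direct-routing flag (-g) is an assumption based on the ip_vs_dr_xmit() call in the log. With DRY_RUN=1 the commands are only printed, so the script can be inspected without touching the kernel table. Whether connection synchronization still behaves as intended with the rules absent on the backup is not verified here.

```shell
#!/bin/sh
# Hypothetical notify helper (lvs-notify.sh); all names are illustrative only.
# Possible use in keepalived.conf (inside the vrrp_instance block):
#   notify_master "/usr/local/sbin/lvs-notify.sh master"
#   notify_backup "/usr/local/sbin/lvs-notify.sh backup"

FWMARK=10                 # fwmark of the virtual service (from the thread)
SCHED=rr                  # scheduler (from the thread)
REALSERVER=10.0.0.6:9999  # real server (from the thread)

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Becoming master: create the fwmark service and add the real server.
# -g selects direct routing (an assumption, see the lead-in).
lvs_up() {
    run ipvsadm -A -f "$FWMARK" -s "$SCHED"
    run ipvsadm -a -f "$FWMARK" -r "$REALSERVER" -g
}

# Becoming backup: delete the virtual service together with its real servers.
lvs_down() {
    run ipvsadm -D -f "$FWMARK"
}

case "${1:-}" in
    master) lvs_up ;;
    backup) lvs_down ;;
esac
```

Running it as "DRY_RUN=1 sh lvs-notify.sh master" prints the two ipvsadm commands instead of executing them.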
>> 28 [61.019] IPVS: lookup/in TCP 10.0.0.3:54590->10.0.0.10:9999 hit
>> 29 [61.019] IPVS: Enter: ip_vs_dr_xmit, net/netfilter/ipvs/ip_vs_xmit.c
>
> I think that this is critical to the problem. That is,
> ip_vs_dr_xmit() is being called, which causes a loop. I suspect that
> ip_vs_null_xmit() should be called instead, and if so the loop
> wouldn't occur.
>
> Could you post the output of "ipvsadm -Ln" ?
Aha. I see that for ip_vs_null_xmit() to be called, the destination
must be flagged with IP_VS_CONN_F_LOCALNODE, which is set if
inet_addr_type() returns RTN_LOCAL. In my case, it obviously doesn't.
Output of "ipvsadm -Ln" (LVS A / realserver A has its lighttpd
stopped, so it's not in the list):
LVS A (master, own IP address 10.0.0.5):

  FWM  10 rr
    -> 10.0.0.6:9999    Route   1   0   0

LVS B (backup, own IP address 10.0.0.6):

  FWM  10 rr
    -> 10.0.0.6:9999    Local   1   0   0

Note that the IP is considered local, which is fine (it's the node's
own eth0 IP address). Now, for some reason it does not consider the
virtual IP 10.0.0.10 local. I thought this could be due to a missing
route entry. However, I added it and the problem still persists:

LVS B:# ip route show
  10.0.0.10 dev lo scope link
  10.0.0.0/24 dev eth0 proto kernel scope link src 10.0.0.6
  default via 10.0.0.1 dev eth0

LVS B:# ip route get 10.0.0.10
  local 10.0.0.10 dev lo src 10.0.0.10
      cache <local> mtu 16436 advmss 16396 hoplimit 64

The amount of code behind inet_addr_type() is too much for me to
trace, although route.c:ip_route_output_slow() (pretty much the only
relevant place where RTN_LOCAL is assigned to a variable) seems to
indicate that if the output interface is lo, the type should be set to
RTN_LOCAL. I don't know why this is not the case here, and it seems to
be the root of the problem.
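
To narrow down where the lookup diverges, the routing decision can be checked from userspace. The helper below is a sketch that classifies the first line of "ip route get" output; the function name is hypothetical and not part of iproute2. It rests on the assumption, not confirmed in this thread, that inet_addr_type() on kernels of this era consults the local routing table, which is what "ip route show table local" lists.

```shell
#!/bin/sh
# is_local_route: succeed if the given `ip route get ADDR` output line
# describes a route of type "local". Hypothetical helper for illustration.
is_local_route() {
    case "$1" in
        local\ *) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example use on a live system:
#   is_local_route "$(ip route get 10.0.0.10 | head -n 1)" && echo "VIP is local"
# The kernel's local table itself can be listed with:
#   ip route show table local
```

Running the same check against the address that ipvsadm lists as the destination on each node would show whether userspace and IPVS agree on which addresses are local.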
best regards,
S.