On Tue, Nov 28, 2006 at 08:37:28AM -0500, RU Admin wrote:
[snip]
> >>When running "ipvsadm -lcn", I can
> >>see connections with the CLOSE state going from 00:59 to 00:01, and
> >>then magically going back to 00:59 again for no reason. The same
> >>holds true for ESTABLISHED connections, I see them go from 29:59 to
> >>00:01 and then back to 29:59, and I know for a fact that the
> >>connection from the client has ended.
> >
> >I seem to recall a bug relating to connection entries having
> >the behaviour you describe above due to a race in reference counting.
> >Which version of the kernel do you have? Is there any chance of updating
> >it to something like 2.6.18?
>
> I'm using a stock Debian Sarge kernel (2.6.8-2-686-smp). I can
> definitely build the latest kernel, and if you feel that it will help
> then I'll do that. It's always risky making a major kernel change on
> a production machine, which is why I wanted to hold off on making
> that change until someone else familiar with IPVS felt that it might
> help.
I think that it would be worth trying. Can you reproduce the problem
on a non-production machine?
[snip]
> >I am wondering if the problem is that for some reason the
> >linux-directors are not seeing the part of the close sequence
> >that is sent by the end-user (it won't see the portion sent by
> >the real-servers). Supposing for a minute that this is the case,
> >it would explain the strange numbers, and those strange numbers
> >will be affecting how wlc allocates connections.
>
> But shouldn't IPVS time out? I thought that was the purpose of the
> timeouts... So that when the director doesn't see a close event after
> a specified period of time, it simply times out.
I actually think my close theory is wrong and that, as you point out, the
problem is timeouts. I think that you are correct in thinking that they
should time out. So that seems to leave us with two main possibilities:
1) there is a bug (which may have already been fixed) or 2) we are
reading the data wrong.
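
One way to test the "reading the data wrong" possibility would be to
compare two captures of "ipvsadm -lcn" taken a few seconds apart and list
the entries whose expire timer went up while the state stayed the same.
The following is only a rough sketch: it assumes the usual six-column
layout (pro, expire, state, source, virtual, destination) with an MM:SS
timer, and the function names are made up for illustration.

```python
from typing import Dict, List, Tuple

# (protocol, source, virtual, destination) identifies one connection entry.
ConnKey = Tuple[str, str, str, str]


def parse_expire(text: str) -> int:
    """Convert an MM:SS expire field into seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


def parse_snapshot(output: str) -> Dict[ConnKey, Tuple[str, int]]:
    """Map each connection to (state, seconds left) for one capture."""
    entries: Dict[ConnKey, Tuple[str, int]] = {}
    for line in output.splitlines():
        fields = line.split()
        # Data rows have six fields and a numeric MM:SS timer in the
        # second column; this skips the two header lines.
        if len(fields) != 6:
            continue
        if ":" not in fields[1] or not fields[1].replace(":", "").isdigit():
            continue
        proto, expire, state, source, virtual, dest = fields
        entries[(proto, source, virtual, dest)] = (state, parse_expire(expire))
    return entries


def find_timer_resets(
    before: Dict[ConnKey, Tuple[str, int]],
    after: Dict[ConnKey, Tuple[str, int]],
) -> List[ConnKey]:
    """Return connections whose timer increased with an unchanged state."""
    resets = []
    for key, (state, left) in after.items():
        if key in before:
            old_state, old_left = before[key]
            if state == old_state and left > old_left:
                resets.append(key)
    return resets
```

A timer that goes up is not a bug by itself, since any packet on a live
connection refreshes it. It only looks suspicious for entries where you
know the client has gone away, so the output would need to be checked
against a packet capture or the real server's own connection list.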
[snip]
> >How exactly did you deal with ARP? There are several methods.
>
> On the real servers, I'm first bringing up the dummy0 interface with the VIP,
> then I use "sysctl" and set the following:
> net.ipv4.conf.dummy0.rp_filter=0
> net.ipv4.conf.dummy0.arp_ignore=1
> net.ipv4.conf.dummy0.arp_announce=2
> Then I bring up eth0 with the real server's regular IP address, and with
> "sysctl", I set the following (includes a repeat of the above options):
> net.ipv4.conf.default.rp_filter=0
> net.ipv4.conf.all.rp_filter=0
> net.ipv4.conf.lo.rp_filter=0
> net.ipv4.conf.dummy0.rp_filter=0
> net.ipv4.conf.eth0.rp_filter=0
>
> net.ipv4.conf.default.arp_ignore=1
> net.ipv4.conf.all.arp_ignore=1
> net.ipv4.conf.lo.arp_ignore=1
> net.ipv4.conf.dummy0.arp_ignore=1
> net.ipv4.conf.eth0.arp_ignore=1
>
> net.ipv4.conf.default.arp_announce=2
> net.ipv4.conf.all.arp_announce=2
> net.ipv4.conf.lo.arp_announce=2
> net.ipv4.conf.dummy0.arp_announce=2
> net.ipv4.conf.eth0.arp_announce=2
>
> The ARP problem was the one thing that kept me from moving to LVS-DR
> for a long time. I finally started playing with all of the
> net.ipv4.conf options and bringing up the interfaces in a specific
> order, and finally stumbled across a method that actually worked. I'm
> sure some of the above options don't need to be set, but it finally
> works, and I'm a little afraid to touch it.
What you have above is the preferred method these days.
You shouldn't need to bother with lo and dummy0 as these are non-arping
interfaces (right?). Though setting them is harmless.
In any case, I agree with your analysis that ARP does not seem to be
a problem in your setup, as the connections are being forwarded by
the linux-director.
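
If you ever want to double-check what the real servers actually ended up
with, something along these lines could read the values back from /proc.
It is a sketch only: the function names and the expected values
(arp_ignore=1, arp_announce=2, taken from your list above) are
assumptions, and it relies on the kernel using the larger of the "all"
and per-interface values for these two settings.

```python
from pathlib import Path
from typing import List

# Values taken from the sysctl list quoted above.
EXPECTED = {"arp_ignore": 1, "arp_announce": 2}


def effective_value(conf_dir: Path, iface: str, name: str) -> int:
    """Return max(conf/all, conf/<iface>), which the kernel applies
    for arp_ignore and arp_announce."""
    per_iface = int((conf_dir / iface / name).read_text())
    for_all = int((conf_dir / "all" / name).read_text())
    return max(per_iface, for_all)


def check_arp_settings(
    iface: str = "eth0",
    conf_dir: Path = Path("/proc/sys/net/ipv4/conf"),
) -> List[str]:
    """Return one message per setting that differs from EXPECTED."""
    problems = []
    for name, wanted in EXPECTED.items():
        got = effective_value(conf_dir, iface, name)
        if got != wanted:
            problems.append(f"{iface}: {name}={got}, expected {wanted}")
    return problems
```

Calling check_arp_settings("eth0") on a real server should return an
empty list if the settings took effect. The conf_dir argument is only
there so the check can be pointed at a copy of the tree for testing.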
> I'm going to try and build the latest 2.6.18 now, and hopefully
> sometime later this week I can install the new kernel and reboot our
> director. Unfortunately I've never been able to get keepalived to
> handle a MASTER/SLAVE director properly, so I only have one director
> in front of the real servers, so if I make a mistake, our main
> university email server will be down.
ew. Good luck :)
--
Horms
H: http://www.vergenet.net/~horms/
W: http://www.valinux.co.jp/en/