> I'm coming in on this thread a bit late and I seem to be
> missing some information here.
>
In a nutshell, odd things were happening. In more
detail: over the course of 4 hours (almost to the
minute) the VIPs would become unreachable. It would
start with machines on the same segment not being able
to reach the VIPs, and eventually nobody could get to
them. The temporary fix was to force a failover to the
secondary director, wait for it to happen again, force
a failover back to the primary, rinse, repeat (not the
best of situations in a production environment).
It took me almost 20 hours of research (B.O.S.S said I
wasn't allowed to leave until I had a solution or we
were scrapping LVS for some other solution), getting
help from the LAN/WAN team, experimenting with dev
servers, etc, to find out exactly what was happening
and fix it.
I started seeing weird things, like "arp -a" showing
(incomplete) for the MAC address of a VIP on any
machine that could no longer contact it. And it didn't
happen to all 4 VIPs at once; it was one at a time,
each almost an hour apart, before the rest of the
network suddenly couldn't get to any of them.
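For anyone chasing the same symptom, here is a rough sketch of a filter that pulls the affected addresses out of "arp -a" style output. The function name is made up for illustration, and it assumes each entry carries the IP in parentheses and the word "incomplete" somewhere on the same line, which is how the output looked to me but may differ between arp versions.

```shell
#!/bin/sh
# Hypothetical helper: read "arp -a" style output on stdin and print
# the IP of every entry whose hardware address is marked incomplete.
# Assumes the IP appears in parentheses, e.g. "? (10.0.0.5) at ...".
incomplete_arp_ips() {
    grep -i 'incomplete' | sed -n 's/.*(\([0-9][0-9.]*\)).*/\1/p'
}
```

Typical use would be "arp -a | incomplete_arp_ips", run from a machine on the same segment as the VIPs.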
> > I tried several methods to get to the bottom of the
> > problem, and in the end, noticed I was setting the
> > hidden flag on the dummy interface AFTER the 4 ip's
> > were assigned...
>
> you have 4VIPs on each realserver?
Yes, the 4 IPs are on the dummy0 interface, using the
hidden patch to stop ARP. I can't dial into work
right now to get the exact lines, but this is
something like what I had before:
modprobe dummy
echo 1 > /proc/???/all/hidden
ifconfig dummy0:0 xxx.xxx.xxx.xxx netmask 255.255.255.255
ifconfig dummy0:1 xxx.xxx.xxx.xxx netmask 255.255.255.255
ifconfig dummy0:2 xxx.xxx.xxx.xxx netmask 255.255.255.255
ifconfig dummy0:3 xxx.xxx.xxx.xxx netmask 255.255.255.255
echo 1 > /proc/???/dummy/hidden
I saw in some mailing list archives that people were
adding a line "ifconfig dummy0 0.0.0.0 up" and setting
the second hidden flag (the one for the dummy
interface) before assigning any IPs. That seemed to do
the trick, but it left the primary director unable to
act as a real server, because heartbeat won't start up
the IPs (that's minor at this point though).
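Roughly, the reordered sequence looks like the sketch below. It only prints the commands so the order can be checked before anything touches a live box. The function name is invented for illustration, the directory holding the hidden entries is passed in as an argument because I don't have the real path in front of me, and where exactly the "all" flag sits relative to bringing dummy0 up is my assumption. The part that mattered is that both hidden flags are set before any VIP is assigned.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the setup commands in the
# order that worked: interface up, hidden flags set, VIPs assigned last.
#   $1       directory holding the "hidden" entries (the /proc/???
#            part elided above; supply the real path yourself)
#   $2...    the VIP addresses to put on dummy0:0, dummy0:1, ...
plan_hidden_vips() {
    conf_dir=$1
    shift
    echo "modprobe dummy"
    echo "echo 1 > $conf_dir/all/hidden"
    echo "ifconfig dummy0 0.0.0.0 up"
    # Hide the dummy interface BEFORE it carries any VIP.
    echo "echo 1 > $conf_dir/dummy/hidden"
    i=0
    for vip in "$@"; do
        echo "ifconfig dummy0:$i $vip netmask 255.255.255.255"
        i=$((i + 1))
    done
}
```

Calling it with the conf directory and the four VIPs prints the four setup lines followed by one ifconfig line per VIP, which can be pasted into an init script once reviewed.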
> your directors are realservers too? Your backup
> director is a realserver
> until it has to become the master director?
Yes. Until yesterday, for the last 3 months on these
servers, the master and backup directors were real
servers as well. We ran the same arrangement for a
year on the older servers these replaced. The
difference is that on the old real servers we used
forwarding with ipchains, and that worked like a charm.
With the new servers Apache just didn't like that
setup, so I was forced to put the IPs on dummy0:X.
> > because I couldn't get heartbeat to bring up the
> > VIP interfaces after I moved where in the process
> > the hidden flag is set.
>
> what went wrong?
See my first answer :)
> > How well would it work to allow the secondary to
> > have the dummy interfaces, and serve pages until
> > it has to take over load balance responsibilities,
> > and hack heartbeat to shut down apache and bring
> > down the dummy interfaces before bringing up the
> > VIPs?
>
> sounds OK to me.
>
I'll start playing with that then. In our old setup,
I had the IPaddr script hacked to add/remove the
ipchains entries when acquiring and releasing the IPs,
so the backup director could serve pages while it
wasn't acting as director.
Hopefully this wasn't too long-winded of an email, I
get wordy when I try to explain things :)
Dan.