On Tue, Aug 13, 2002 at 11:51:31AM -0600, Greg Woods wrote:
> > So what's stopping you from modifying /etc/rc.d/init.d/network to suit your
> > needs? it's only a shell script... :)
>
> I ended up having to do this, but I don't like it. The reason I don't like
> it is that the next RPM update, which may be six months from now when I've
> long since forgotten about that mod I made, will wipe it out and suddenly
> things break.
That is true. But IMHO the Red Hat network scripts are continuously
broken (though always in new and interesting ways), so you are bound
to have to modify them in the end. If you are concerned about
upgrading over the top of your changes - which you should be - then I
would suggest that you build your own package and install that. That
should make it easier to keep your changes through rpm.
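For example, rebuilding the stock initscripts package with your change
carried as a patch goes roughly like this (untested sketch; the patch
name and versions below are just placeholders, and older rpm uses
"rpm -ba" where newer releases use rpmbuild):

    # Install the source RPM for the package that owns
    # /etc/rc.d/init.d/network (initscripts on Red Hat).
    rpm -ivh initscripts-<version>.src.rpm

    # Copy your change in as a patch, add a matching Patch:/%patch
    # entry to the spec file and bump the Release: tag.
    cp network-hidden.patch /usr/src/redhat/SOURCES/
    vi /usr/src/redhat/SPECS/initscripts.spec

    # Rebuild and install the new binary package.
    rpm -ba /usr/src/redhat/SPECS/initscripts.spec
    rpm -Uvh /usr/src/redhat/RPMS/i386/initscripts-<version>.i386.rpm

That way the change is at least captured as a patch you can carry
forward when the next vendor update comes along.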
> In fact, I had to do this because of a race condition. I'm setting up
> my LVS mail system using an alias on lo:0 as the service address on
> the real servers, so that I can set the hidden flag only for the lo
> device and still make connections directly to individual servers when
> needed. The race condition occurs because /etc/rc.d/init.d/network
> runs before /etc/rc.d/rc.local during the boot process. This brings up
> all of the net interfaces before the hidden flag gets set, and if the
> router ARP cache entry happens to time out while a real server is
> booting, the real server might answer the ARP request and this screws
> up LVS until the router ARP cache can be cleared. In our
> organization, different people manage the routers than the mail
> servers, so getting the router ARP cache cleared is a pain since I
> cannot do it myself. The only way I could find to remove the race
> condition was to have the hidden flag on lo set in
> /etc/rc.d/init.d/network, after the lo interface is brought up but
> before any of the eth interfaces come up.
>
> Anybody else seen this race condition before?
That is definitely a problem. I had thought of it myself but hadn't
actually run into it. Again, modifying the init scripts is really the
best answer.
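For the record, the change only needs to be a couple of lines dropped
into /etc/rc.d/init.d/network after lo is configured and before the
other interfaces are brought up. Something like this (untested; the
paths are the ones the 2.4 hidden patch adds under
/proc/sys/net/ipv4/conf/):

    # Hide lo, where the VIP lives, from ARP before any eth interface
    # comes up.  With the hidden patch an interface is only hidden
    # when both the "all" flag and its own per-device flag are set,
    # so eth0 still answers ARP for its own address.
    if [ -e /proc/sys/net/ipv4/conf/all/hidden ]; then
        echo 1 > /proc/sys/net/ipv4/conf/all/hidden
        echo 1 > /proc/sys/net/ipv4/conf/lo/hidden
    fi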
One thing you may want to look into is the <blah>/default/hidden
flag. You could toggle this so that all interfaces come up hidden,
and then un-hide eth0. This should also stop the race condition,
though not having ARP on eth0 for more than a moment probably isn't
a good idea.
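In /proc terms that would be something along these lines (untested;
I am assuming the hidden flag is replicated under conf/default/ in
the same way as the other per-interface sysctls):

    # Make interfaces come up hidden by default ...
    echo 1 > /proc/sys/net/ipv4/conf/all/hidden
    echo 1 > /proc/sys/net/ipv4/conf/default/hidden
    # ... bring the interfaces up, then un-hide the public interface
    # once it is configured.
    echo 0 > /proc/sys/net/ipv4/conf/eth0/hidden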
> The next problem we have, which I've mentioned here before, is
> dynamic ARP caching, which triggers another race condition. It really
> does look like, when a router's ARP cache entry for the VIP times out
> or is cleared, and there happens to be an outbound packet going through
> the router that is the first one it sees, it will cache the MAC
> address associated with that packet without ever issuing an explicit
> ARP request. I have verified this by monitoring ARP requests while this is
> going on. I have also verified that only the director will answer an
> explicit ARP, as it should be, yet somehow, once in a while, the MAC
> address of one of the real servers gets stuck in the router's ARP
> cache associated with the VIP address. This blows the load balancing
> out of the water because now EVERY incoming connection goes directly
> from the router to the MAC address of this real server.
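(That kind of monitoring is easy enough for others to reproduce; on
the real server segment something like

    # -e prints the link-level (MAC) addresses, -n skips name lookups
    tcpdump -n -e -i eth0 arp

makes a reply for the VIP from anything other than the director's MAC
stand out straight away.)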
>
> I know I must somehow be doing something wrong, because Cisco routers
> are quite common and I would have expected someone else to have run
> into this by now, yet nobody seems to have. But I do know this:
> 1) No real server is answering the router's ARP request; and 2)
> Somehow the MAC address of a real server can sneak into the router's
> ARP cache anyway.
>
> Oh, yes, and since these machines are all running 2.4.18 kernels and
> the real servers all use the 2.4 hidden patch, this is at least sort
> of relevant to the original thread :-)
I am intrigued to know whether issuing gratuitous ARP packets from the
Linux Director is a workaround for this.
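If anyone wants to try that, arping from iputils can do it from the
director; something like the following (interface and flags from
memory, and <VIP> is of course a placeholder):

    # Send a few gratuitous ARP replies for the VIP out the public
    # interface so the router re-learns the director's MAC.
    arping -A -c 3 -I eth0 <VIP>

The send_arp program shipped with Heartbeat does much the same thing.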
--
Horms