On Wednesday 21 December 2005 16:40 Joseph Mack NA3T wrote:
> > db2 ~ # arptables -L -n
> > Chain INPUT (policy ACCEPT)
> > Chain OUTPUT (policy ACCEPT)
> > -j DROP -s 10.0.4.1
> > -j DROP -s 10.0.4.2
>
> why do you drop these (just curious, not related to your
> problem)?
It's my way of solving the good old ARP problem: simply drop all ARP replies
coming from a specific VIP on the realserver.
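
For reference, rules like the two in the listing could be generated roughly as follows. This is only a sketch: the helper name vip_drop_rules is made up for illustration, the two addresses are the ones from the listing, and the function merely prints the arptables commands (it does not apply them), so it can be checked before being run as root.

```shell
#!/bin/sh
# Hypothetical helper: print one arptables command per VIP.
# Each rule drops outgoing ARP packets whose source IP is the VIP,
# so the realserver never answers ARP for that address.
vip_drop_rules() {
    for vip in "$@"; do
        echo "arptables -A OUTPUT -s $vip -j DROP"
    done
}

# VIPs taken from the 'arptables -L -n' output quoted above.
vip_drop_rules 10.0.4.1 10.0.4.2
```

Piping the output to sh as root would install the rules; afterwards `arptables -L -n` should list them in the OUTPUT chain.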
> there's a lot of detail here. Are you using a different VIP
> for the database than for the web front end (I assume yes)?
Yes, of course. The web servers are balanced by a different director and use
different VIPs/RIPs.
> Since the packets arrive at the realserver, I expect it's
> not an LVS problem, I would then look for crazy things.
Those were my thoughts as well, but it works perfectly when using
application-based load balancing that accesses the database servers directly.
> Trying to debug something that works 99.9% of the time is
> not going to be fun, unless you can work out a way of
> screening out the 99.9%, Is it only one realserver, both?
The setup I sent was stripped down. In fact, there are 17 realservers. The
phenomenon can be observed on every single one.
> Can you replace the realserver hardware, software (different
> kernel say)?
The realservers cover a wide spectrum of different CPUs, boards, NICs and
kernels, so I can rule out a specific combination as the cause. I wish it
were that easy... :/