On Wed, 21 Dec 2005, Jan Abraham wrote:
On Wednesday 21 December 2005 16:40 Joseph Mack NA3T wrote:
db2 ~ # arptables -L -n
Chain INPUT (policy ACCEPT)
Chain OUTPUT (policy ACCEPT)
-j DROP -s 10.0.4.1
-j DROP -s 10.0.4.2
why do you drop these (just curious, not related to your
problem)?
It's my way to solve the good old arp problem - simply
drop all arp replies coming from a specific VIP on the
realserver...
Really? How do packets from the VIP (i.e. 10.0.4.[12]) get
back to the client? Wouldn't they be dropped too?
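For reference, DROP rules like the ones in the listing above could be recreated roughly as follows. This is only a sketch: it assumes the rules sit in the OUTPUT chain, as the listing order suggests, and the helper name vip_drop_rules is made up for illustration. It prints the commands rather than running them, so nothing changes until the output is reviewed.

```shell
#!/bin/sh
# Print (do not run) arptables commands that would add one DROP rule per VIP.
# Assumption: the rules belong in the OUTPUT chain, matching on source IP.
vip_drop_rules() {
    for vip in "$@"; do
        echo "arptables -A OUTPUT -s $vip -j DROP"
    done
}

# Dry run with the two VIPs from the listing.
vip_drop_rules 10.0.4.1 10.0.4.2
```

To apply the rules, the printed lines would be run as root on the realserver.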
there's a lot of detail here. Are you using a different VIP
for the database than for the web front end (I assume yes)?
Yes, of course.
Just checking that I understood what you said. We do have
code that allows a realserver to be a client of the LVS to a
VIP that is also on the realserver (see the HOWTO) but
no-one's tested it yet.
The web servers are balanced by a different director and
using different VIP/RIPs.
Since the packets arrive at the realserver, I expect it's
not an LVS problem. I would then look for crazy things.
Those were my thoughts as well, but it works perfectly when
using application-based load balancing that accesses the
database servers directly.
Oh right, you said that earlier. I forgot.
The setup I sent was stripped down. In fact, there are 17
realservers. The phenomenon can be detected on every
single one.
I see.
Can you replace the realserver hardware, software (different
kernel say)?
The realservers have a wide spectrum of different CPUs,
boards, NICs and kernels, so I can rule out a specific
combination as the cause. I wish it were that easy...
:/
hmm. You've already been there. Well I guess then the next
suspect is LVS.
Summary:
The SYN packet arrives from the webserver realserver (in the
webserver LVS). This realserver is a client for the database
LVS and the packet goes through the database director to the
database realserver. The database realserver doesn't appear
to see the SYN packet, but the src/dest IP and ports, and
the MAC address are OK. You only see this with 0.1% of SYN
packets, not with other (non-SYN) packets. Do you know if
non-SYN packets
aren't recognised too and you don't see any problem because
the packets are resent, or is it that it's only a problem
with SYN packets? The problem doesn't occur if the database
client contacts the database server directly.
Is the director (VIP) for the database on the same box as
the director (VIP) for the webserver? If so, does splitting
the VIPs onto two different machines stop the problem?
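One way to narrow this down might be to capture only the initial SYNs on both the database director and a database realserver and compare the counts. The sketch below builds a tcpdump filter for that. The helper name syn_filter, the interface eth0 and port 3306 are assumptions for illustration (the thread doesn't say which port the database uses), and it assumes the packet still carries the VIP as its destination when it reaches the realserver.

```shell
#!/bin/sh
# Build a tcpdump filter matching only initial SYNs (SYN set, ACK clear)
# addressed to a given VIP and port. Both arguments are placeholders.
syn_filter() {
    vip=$1
    port=$2
    echo "tcp[tcpflags] & (tcp-syn|tcp-ack) == tcp-syn and dst host $vip and dst port $port"
}

# Hypothetical usage (eth0 and port 3306 are assumptions):
#   tcpdump -n -i eth0 "$(syn_filter 10.0.4.1 3306)"
syn_filter 10.0.4.1 3306
```

If the director sees a SYN that never shows up in the realserver's capture, the packet was lost before the realserver's stack; if it shows up in both, the realserver is receiving it but not answering.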
Joe
--
Joseph Mack NA3T EME(B,D), FM05lw North Carolina
jmack (at) wm7d (dot) net - azimuthal equidistant map
generator at http://www.wm7d.net/azproj.shtml
Homepage http://www.austintek.com/ It's GNU/Linux!