On Mon, 27 Nov 2006, Simon Pearce wrote:
I have a total of about 250 IP addresses to migrate and
here's where the problems start. Every time the DNS
cluster exceeds a certain limit, some of the IP addresses
stop working properly.
From Wayne's posting it's possible that this may not work
with our setup, but since I don't know why, I'll just
forge ahead anyhow.
Ted Pavlic, back in the early days, had a director with 1024
IPs, so it's not the large number of IPs, at least for TCP.
There was a posting (in the last month I'd guess) where
someone's UDP balancing was not working properly and the
suggested solution was Julian's UDP single packet scheduler
patch. I forget their symptoms, and they weren't your
symptoms, but there may be problems with UDP we haven't
found because no-one is stressing UDP balancing very hard.
It affects the system in such a way
that for certain domains you get a timeout when querying the cluster.
Some of the transferred IPs
transferred IPs? These are just the VIPs that you have
running on the LVS cluster, nothing special, just VIPs?
seem to stop working or slow down to such an
extent that other DNS servers stop querying us.
Do you know which IPs these are? Anything strange in the
output of ipvsadm, netstat on the realservers for these IPs?
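One way to find out which VIPs are timing out is to send the same DNS query to each VIP over UDP and note which ones fail to answer. The following is only a rough sketch under stated assumptions: the function names, the 2 second timeout, and the query name are all made up for illustration, and the packet is the minimal A-record query described in RFC 1035.

```python
import socket
import struct


def build_query(name, qid=0x1234):
    """Build a minimal DNS query packet for an A record (RFC 1035)."""
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN)
    return header + qname + struct.pack("!HH", 1, 1)


def probe(vip, name, timeout=2.0):
    """Return True if vip answers a UDP DNS query for name within timeout."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (vip, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and unreachable hosts
            return False
    # Only count replies that carry the transaction id we sent.
    return reply[:2] == query[:2]
```

Calling `probe(vip, "example.com")` in a loop over the VIPs (the name here is a placeholder; use a domain the cluster actually serves) gives a list of the addresses that fail, which can then be compared against the ipvsadm and netstat output.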
I am also using iptables on the two load balancers with a
conntrack table because the real servers have private IP
addresses and I can't update them otherwise.
I don't know the connection between conntrack and private
IPs. Want to enlighten me?
I checked the
logs but I can't find any indication that the conntrack table
is full. However, I read on the LVS list that the conntrack
table is not needed for LVS NAT and can slow the system
down; I am not sure about this.
Can you do a test with conntrack off?
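Before turning conntrack off, it may also help to compare the number of tracked connections with the table limit while the cluster is under load, rather than relying on the logs alone. The sketch below assumes the counters are exposed under /proc at one of the two prefixes shown (the first on newer kernels, the second on older 2.6 kernels); the paths on your directors may differ, and the function names are only illustrative.

```python
import os

# Assumed locations of the conntrack counters; check your own kernel.
CANDIDATES = [
    "/proc/sys/net/netfilter/nf_conntrack",
    "/proc/sys/net/ipv4/netfilter/ip_conntrack",
]


def read_int(path):
    """Read a single integer from a /proc style file."""
    with open(path) as handle:
        return int(handle.read().strip())


def conntrack_usage(prefixes=CANDIDATES):
    """Return (count, max) from the first prefix whose files exist, else None."""
    for prefix in prefixes:
        count_path = prefix + "_count"
        max_path = prefix + "_max"
        if os.path.exists(count_path) and os.path.exists(max_path):
            return read_int(count_path), read_int(max_path)
    return None
```

If `conntrack_usage()` returns a count close to the maximum at the moment the timeouts start, the table is a likely suspect; if it returns `None`, the counters are not where this sketch expects them.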
Is there anything
else someone could think of that I might have done wrong?
The unusual thing is that the cluster seems to work fine
until the load exceeds the limit I mentioned
earlier, which I can't really define in words.
Is the problem load or the number of IPs (if you can tell)?
There is another problem with failover of large numbers of
IPs, just in case you want to read more on the topic (it may
not be related to your problem).
Can you set up ipvsadm with a single fwmark instead of all
the IPs? That would shift the responsibility for handling
all the IPs to iptables, rather than ipvsadm.
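To make the fwmark idea concrete, here is a small sketch that only prints the commands one might run; it does not execute anything. It assumes DNS on port 53 over both UDP and TCP, LVS-NAT forwarding (`-m`), round-robin scheduling, and mark value 1. The function name and the example addresses are hypothetical, so adjust everything to your own setup before using any of the output.

```python
def fwmark_commands(vips, realservers, mark=1, port=53, scheduler="rr"):
    """Return iptables/ipvsadm command strings for one fwmark virtual service."""
    cmds = []
    # One mangle rule per VIP and protocol marks the incoming packets.
    for vip in vips:
        for proto in ("udp", "tcp"):
            cmds.append(
                f"iptables -t mangle -A PREROUTING -d {vip} -p {proto} "
                f"--dport {port} -j MARK --set-mark {mark}"
            )
    # A single virtual service keyed on the mark replaces the per-VIP services.
    cmds.append(f"ipvsadm -A -f {mark} -s {scheduler}")
    # Each realserver is added once, forwarded by masquerading (LVS-NAT).
    for rip in realservers:
        cmds.append(f"ipvsadm -a -f {mark} -r {rip} -m")
    return cmds
```

For example, `print("\n".join(fwmark_commands(["192.0.2.1"], ["10.0.0.1", "10.0.0.2"])))` lists two iptables rules followed by three ipvsadm commands. With 250 VIPs the ipvsadm table stays at one virtual service, while the iptables mangle table grows to 500 rules, so this moves the per-IP work rather than removing it.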
Do you have a large iptables rule set that might be slowing
things down? iptables scales with O(n^2); still, 250 IPs
doesn't seem like a lot of IPs.
Are we having UDP problems here?
Joseph Mack NA3T EME(B,D), FM05lw North Carolina
jmack (at) wm7d (dot) net - azimuthal equidistant map
generator at http://www.wm7d.net/azproj.shtml
Homepage http://www.austintek.com/ It's GNU/Linux!