I'm running two systems with ipvs and keepalived. They are a localnode
configuration, that is, the director is also a realserver.
I've found that when I have traffic up and running I can shut down the
realserver on the director with only a brief burst of failures. My problem is
when I shut down the realserver on the non-director system.
I have a high traffic load (HTTP) up and running from one single-threaded
client. Life is good.
Then I shut down the server on the non-director and I get a burst of connection
failures (Connection refused). That clears up quickly and connections start
succeeding again.
The problem is that I then see about 10 to 20 seconds of successful
transactions, followed by a period of about a minute where every other
connection times out (I'm using rr). Then, for the next fifteen minutes, there
are several timeouts about every 20 seconds but otherwise normal traffic.
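
For anyone trying to reproduce the timing, a minimal sketch of a
single-threaded probe client might look like the following. It is not the
client used above: it only opens a TCP connection rather than sending an HTTP
request, and the names probe and run, the 3 second timeout, and the address in
the usage note are all assumptions made for illustration. It timestamps each
attempt so the "refused" burst and the later "timeout" periods can be lined up
against the keepalived logs.

```python
import socket
import time


def probe(host, port, timeout=3.0):
    """Attempt one TCP connection and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except ConnectionRefusedError:
        # The peer answered with RST: nothing is listening on that port.
        return "refused"
    except socket.timeout:
        # No answer to the SYN within the timeout.
        return "timeout"
    except OSError as exc:
        return "error: %s" % exc


def run(host, port, count, interval=0.1, timeout=3.0):
    """Probe sequentially (single-threaded) and log elapsed time per attempt."""
    start = time.monotonic()
    results = []
    for _ in range(count):
        outcome = probe(host, port, timeout)
        elapsed = time.monotonic() - start
        results.append((elapsed, outcome))
        print("%8.2f %s" % (elapsed, outcome))
        time.sleep(interval)
    return results
```

Calling run("192.0.2.10", 80, 1000) would probe a placeholder VIP 1000 times;
substitute the real VIP and port.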
The initial "Connection refused" failures happen till keepalived turns off the
downed realserver. The part I don't understand is why after seeing traffic come
back, I start seeing the timeouts. I've hooked up tcpdump on the director and
it shows me that every other connection is not getting a response. I looked at
tcpdump on the "downed" realserver and there are no odd packets arriving for
the loadbalanced VIP and port and no evidence of "connection refused" back at
keepalived logs don't give any indication that its healthcheckers are bouncing
around. ipvsadm -l --stats only shows the functioning realserver.
Does anybody have an idea what's going on here? This is completely
reproducible and the timing of the connection errors is also consistent.