Hi all
I'm running what you'd probably term a fairly classical LVS-NAT setup:
          Clients
             |
   Director1   Director2
             |
        Realservers
The directors are running FC2 with keepalived-1.1.7 managing both the IPVS
framework and the VRRP side of things. This is (currently) an active/passive
system where one director is backup for the other one.
The realservers are a mix of FC1, FC2 and W2K3 servers (web/email/FTP farm).
All realservers use the active director for their default gw.
Traffic sourced from processes on the realservers (DNS etc) is masqueraded as
though it all comes from one address (the real IP on the director). Traffic
involved in LVS transactions should obviously be managed by the ip_vs modules,
and source-natted on the return journey to come from the VIP they arrived on.
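To make the intended behaviour concrete, here is a minimal Python sketch of the decision I expect the director to make for each packet leaving the farm. It is only a model, not how ip_vs or iptables are implemented: the addresses are invented documentation addresses, and `LVS_CONNECTIONS` is a hypothetical stand-in for the ip_vs connection table.

```python
from typing import NamedTuple

# Invented example address, not the real IP of my director.
DIRECTOR_REAL_IP = "192.0.2.1"


class Packet(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


# Hypothetical stand-in for the ip_vs connection table:
# (client ip, client port, realserver ip, realserver port) -> VIP the client used
LVS_CONNECTIONS = {
    ("198.51.100.7", 40000, "10.0.0.11", 80): "192.0.2.10",
}


def expected_source(pkt: Packet) -> str:
    """Return the source address a realserver packet should leave the director with."""
    # A reply travels realserver -> client, so the client is the destination.
    key = (pkt.dst_ip, pkt.dst_port, pkt.src_ip, pkt.src_port)
    vip = LVS_CONNECTIONS.get(key)
    if vip is not None:
        return vip               # part of an LVS connection: rewrite to the VIP
    return DIRECTOR_REAL_IP      # realserver-originated traffic: masquerade


reply = Packet("10.0.0.11", 80, "198.51.100.7", 40000)  # answer to a web client
dns = Packet("10.0.0.11", 53000, "203.0.113.53", 53)    # lookup started by the realserver

print(expected_source(reply))
print(expected_source(dns))
```

In this model the web reply should leave from the VIP and the DNS lookup from the director's real IP.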
I am, however, intermittently seeing my remote NMS system complain that all
the VSes have gone down - a few seconds (45 to 90 or so) later, they come back
up. Because the NMS is remote I initially put this down to being network
problems, but it persisted.
Digging deeper, what I'm seeing is *all* outbound traffic, whether LVS NAT or
locally sourced, being source NATted to the real IP on the director. This only
lasts a few seconds at a time but it affects everything which is leaving the
farm, and clearly this then breaks established sessions as the return traffic
to the clients isn't coming from the right place.
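One way to measure these bursts would be to post-process a packet capture from the outside interface. The Python sketch below is a heuristic under stated assumptions, not a tested tool: it assumes the capture has already been reduced to `(timestamp, source IP, source port)` tuples, that the director itself serves nothing on the virtual service ports, and that a gap of more than `gap` seconds separates two episodes. All names and addresses are hypothetical.

```python
def suspect_episodes(packets, director_ip, service_ports, gap=5.0):
    """Group suspect packets into (start, end) episodes.

    A packet is suspect when it leaves with the director's real IP as source
    but from a port that belongs to a virtual service, where the source
    should have been a VIP.
    """
    episodes = []
    for ts, src_ip, src_port in sorted(packets):
        if src_ip != director_ip or src_port not in service_ports:
            continue  # correctly NATted, or ordinary masqueraded traffic
        if episodes and ts - episodes[-1][1] <= gap:
            episodes[-1][1] = ts       # extend the current episode
        else:
            episodes.append([ts, ts])  # start a new episode
    return [(start, end) for start, end in episodes]


# Invented sample data: one healthy reply, a short burst, an unrelated
# masqueraded packet, and a later isolated bad packet.
sample = [
    (0.0, "192.0.2.10", 80),
    (10.0, "192.0.2.1", 80),
    (12.0, "192.0.2.1", 80),
    (13.0, "192.0.2.1", 53000),
    (60.0, "192.0.2.1", 25),
]

print(suspect_episodes(sample, "192.0.2.1", {25, 80}))
```

Comparing the episode times with the NMS alerts would show whether the two line up.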
I've followed up on John Reuning's thread from last year:
http://marc.theaimsgroup.com/?l=linux-virtual-server&m=105577688603838&w=2
and he supplied a workaround: by adding SNAT rules to the 'nat' table I could
lock outbound traffic from realservers to a VIP. I had thought of this, but it
will only
partially work as I have a fairly complex setup of VIPs and realservers where
protocols, VIPs and RSes are "shared" - I may be forwarding port 80 on one VIP
to (some of) the same realservers as another VIP.
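The Python sketch below illustrates why static SNAT rules only partially cover a setup like this. It is an illustration with invented addresses, not my real configuration and not necessarily the exact rules from John's thread. It assumes TCP services and that each realserver listens on the same port as the virtual service. Wherever a (realserver, port) pair sits behind exactly one VIP it emits an iptables SNAT rule; wherever it sits behind several, no static rule can pick the right VIP.

```python
from collections import defaultdict

# Hypothetical virtual services: (VIP, port) -> realservers. Addresses invented.
VIRTUAL_SERVICES = {
    ("192.0.2.10", 80): ["10.0.0.11", "10.0.0.12"],
    ("192.0.2.20", 80): ["10.0.0.12", "10.0.0.13"],
    ("192.0.2.10", 25): ["10.0.0.14"],
}


def snat_plan(services):
    """Return (rules, ambiguous) for a static SNAT workaround."""
    vips_by_source = defaultdict(set)
    for (vip, port), realservers in services.items():
        for rs in realservers:
            vips_by_source[(rs, port)].add(vip)

    rules, ambiguous = [], []
    for (rs, port), vips in sorted(vips_by_source.items()):
        if len(vips) == 1:
            (vip,) = vips
            rules.append(
                f"iptables -t nat -A POSTROUTING -s {rs} -p tcp "
                f"--sport {port} -j SNAT --to-source {vip}"
            )
        else:
            # Same realserver and port behind several VIPs: the source
            # address alone cannot tell which VIP the client connected to.
            ambiguous.append((rs, port, sorted(vips)))
    return rules, ambiguous


rules, ambiguous = snat_plan(VIRTUAL_SERVICES)
for rule in rules:
    print(rule)
print("ambiguous:", ambiguous)
```

With this sample data, 10.0.0.12 on port 80 is shared by both VIPs and ends up in the ambiguous list, which is exactly the case the workaround cannot cover.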
When the problem occurs, it affects *all* VIPs simultaneously (and is
happening as I write this).
It looks to me like there's some sort of faulty interaction between the
connection tracking in the ip_vs module, and that in the iptable_nat or
ip_conntrack module - on occasion the iptables modules are handling ip_vs
traffic when they shouldn't.
As I've seen this mentioned once, and seen Joe mention that he'd seen it
mentioned twice, I thought I'd mention it a third (or fourth?) time :)
Any assistance will be greatly appreciated. The up/down flaps of my VIPs are
starting to affect my sanity!
Graeme