LVS-NAT: intermittent problem with responses from realservers

To: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Subject: LVS-NAT: intermittent problem with responses from realservers
From: Graeme Fowler <graeme@xxxxxxxxxxx>
Date: Tue, 5 Oct 2004 10:53:32 +0100 (BST)

Hi all

I'm running what you'd probably term a fairly classical LVS-NAT setup:

      Clients

Director1  Director2

    Realservers

The directors are running FC2 with keepalived-1.1.7 managing both the IPVS 
framework and the VRRP side of things. This is (currently) an active/passive 
system where one director is backup for the other one.
The realservers are a mix of FC1, FC2 and W2K3 servers (web/email/FTP farm).
All realservers use the active director for their default gw.
Traffic sourced from processes on the realservers (DNS etc) is masqueraded as 
though it all comes from one address (the real IP on the director). Traffic 
involved in LVS transactions should obviously be managed by the ip_vs modules, 
and source-natted on the return journey to come from the VIP they arrived on.

I am, however, intermittently seeing my remote NMS system complain that all
the VSes have gone down - a few seconds (45 to 90 or so) later, they come back
up.  Because the NMS is remote I initially put this down to being network
problems, but it persisted.

Digging deeper, what I'm seeing is *all* outbound traffic, whether LVS NAT or
locally sourced, being source NATted to the real IP on the director. This only
lasts a few seconds at a time but it affects everything which is leaving the
farm, and clearly this then breaks established sessions as the return traffic
to the clients isn't coming from the right place.
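
One way to spot the symptom in a capture taken on the client side could be
sketched roughly as below. This is only an illustration: the addresses, the
port set and the helper name `misnatted` are hypothetical, and it assumes that
masqueraded realserver-originated traffic leaves with an ephemeral source
port, so a service port paired with the director's real IP suggests an LVS
reply that was NATted to the wrong address.

```python
# Hypothetical values for illustration only (documentation address range).
DIRECTOR_IP = "192.0.2.1"      # assumed real IP on the director
SERVICE_PORTS = {25, 80}       # assumed virtual service ports


def misnatted(packets, director_ip=DIRECTOR_IP, service_ports=SERVICE_PORTS):
    """Return packets that look like LVS replies sent from the director's real IP.

    Each packet is a (src_ip, src_port) tuple as observed outside the farm.
    A reply handled correctly by ip_vs should carry the VIP as its source;
    the director's real IP combined with a service port is treated as suspect.
    """
    return [
        (src_ip, src_port)
        for src_ip, src_port in packets
        if src_ip == director_ip and src_port in service_ports
    ]


if __name__ == "__main__":
    observed = [
        ("192.0.2.10", 80),     # reply from a VIP: expected
        ("192.0.2.1", 80),      # reply from the director's real IP: suspect
        ("192.0.2.1", 40123),   # masqueraded outbound traffic: expected
    ]
    for src_ip, src_port in misnatted(observed):
        print(f"suspect reply from {src_ip}:{src_port}")
```

Feeding it tuples extracted from a packet capture during a flap would show
whether service replies are leaving with the wrong source address.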

I've followed up on John Reuning's thread from last year:

http://marc.theaimsgroup.com/?l=linux-virtual-server&m=105577688603838&w=2

and he supplied a workaround: by adding SNAT rules to the 'nat' table I could
lock outbound traffic from realservers to a VIP. I had thought of this, but it
will only partially work, as I have a fairly complex setup of VIPs and
realservers where protocols, VIPs and RSes are "shared" - I may be forwarding
port 80 on one VIP to (some of) the same realservers as another VIP.
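
The ambiguity can be sketched in a few lines of Python. Everything here is
hypothetical (the addresses, the `virtual_services` table and the helper
name), and it assumes each realserver listens on the same port as the VIP. A
static SNAT rule can only match on the realserver address and port, so any
pair reachable through more than one VIP has no single correct rewrite target.

```python
from collections import defaultdict

# Hypothetical virtual services: (VIP, port) -> realservers behind it.
# Made-up addresses for illustration, not the actual farm configuration.
virtual_services = {
    ("192.0.2.10", 80): ["10.0.0.1", "10.0.0.2"],
    ("192.0.2.11", 80): ["10.0.0.2", "10.0.0.3"],
    ("192.0.2.11", 25): ["10.0.0.3"],
}


def ambiguous_snat_targets(services):
    """Return {(realserver, port): sorted VIPs} for pairs served by several VIPs."""
    vips_by_source = defaultdict(set)
    for (vip, port), realservers in services.items():
        for realserver in realservers:
            vips_by_source[(realserver, port)].add(vip)
    # Keep only the pairs where a static rule could not pick one VIP.
    return {
        source: sorted(vips)
        for source, vips in vips_by_source.items()
        if len(vips) > 1
    }


if __name__ == "__main__":
    for (realserver, port), vips in ambiguous_snat_targets(virtual_services).items():
        print(f"{realserver}:{port} could belong to any of {vips}")
```

In this made-up table only 10.0.0.2 on port 80 is shared, but each such pair
is a case where a fixed SNAT rule would send some replies from the wrong VIP.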

When the problem occurs, it affects *all* VIPs simultaneously (and is 
happening as I write this).

It looks to me like there's some sort of faulty interaction between the
connection tracking in the ip_vs module and that in the iptable_nat or
ip_conntrack module - on occasion the iptables modules are handling ip_vs
traffic when they shouldn't.

As I've seen this mentioned once, and seen Joe mention that he'd seen it 
mentioned twice, I thought I'd mention it a third (or fourth?) time :)

Any assistance will be greatly appreciated. The up/down flaps of my VIPs are 
starting to affect my sanity!

Graeme
