
To: " users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: [lvs-users] Connecting directly to realservers in a one-network LVS-NAT
From: "Ben Hollingsworth" <ben.hollingsworth@xxxxxxxxxxxx>
Date: Mon, 26 Nov 2007 16:35:33 -0600
Joseph Mack NA3T wrote:
>> However, I did setup a one-network LVS-NAT just last week 
>> that works fine.  Our private network is a subset of our 
>> public network, with the real servers using the gateway 
>> VIP on the directors.  The directors know nothing of SSH, 
>> yet if a client tries to SSH directly to the private IP of 
>> the real server, it succeeds, even though the packets take 
>> a circuitous return trip through the directors.
> hmm. so with redirects etc off and the ipvsadm table still 
> setup for one-network NAT (and no iptables or conntrack), 
> then a packet RIP->CIP sent to default gw=VIP on the 
> director, is not NAT'ed on the director, by the rules setup 
> by ipvsadm, which would make the packet come out with 
> src_addr=VIP and hence be refused by the client?
> I'm trying to figure out what the director would think it's 
> supposed to do with such a packet; forward it or NAT it? I 
> guess it depends on who gets first dibs on the packet, the 
> forwarding rules or the NAT rules. This must be easy enough 
> to look up.

Apparently, the forwarding rules get first dibs.  In my environment,
when the director sees a packet come back from the private side that
doesn't belong to a connection that first came in addressed to the VIP,
the director just acts as a router and dutifully forwards the packet
wherever it thinks it should go, without NATting it.  No iptables or
conntrack is used.

BTW, in the default setup, the director merely sends an ICMP redirect
back to the real server, which causes problems under some
circumstances.  I had to set "net.ipv4.conf.default.send_redirects = 0"
to get it to work consistently.
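For anyone wanting to reproduce the redirect setting, here is a minimal
sketch that just prints sysctl.conf-style lines.  The function name and
the "eth0" interface name on the director are my own assumptions.  I've
included the "all" and per-interface keys as well as "default" because,
as I understand the kernel documentation, redirects stay enabled on an
interface if either its own key or the "all" key is set, and "default"
only seeds interfaces created afterwards.

```shell
#!/bin/sh
# Print sysctl.conf-style lines that disable sending ICMP redirects
# for each scope given ("all", "default", or an interface name).
# Nothing is changed on the running system; this only prints text.
redirects_off_conf() {
    for scope in "$@"; do
        printf 'net.ipv4.conf.%s.send_redirects = 0\n' "$scope"
    done
}

# "eth0" is an assumed name for the director's interface.
redirects_off_conf all default eth0
```

The output could be saved to a file and loaded as root with
"sysctl -p <file>", or pasted into /etc/sysctl.conf.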

Our setup actually ended up a little different from what's recommended
in the literature, but I think it works quite well.  Before LVS was
ready, we had already installed software on the real servers (using six
consecutive IPs -- poor planning due to never having done this before),
and it would take a solid week of work to change these IPs within the
software (Oracle Collab Suite).  The servers sit within a /18 (64 class
C's as a single subnet) that's assigned to that row in our server
room.  We started out by carving out a private /27 subnet that contained
all our pre-set RIPs, but ran into problems because our RS's are set up
in clustered pairs, and each pair needed to be able to address the other
pair by its VIP.  The directors refused to NAT the return packets on a
connection that both originated and terminated within the private
subnet, so this wouldn't work.
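
For scale, the prefix sizes mentioned above can be checked with a bit
of shell arithmetic.  This is only a sketch; the function name is made
up for illustration.

```shell
#!/bin/sh
# Number of IPv4 addresses covered by a prefix of the given length.
prefix_size() {
    echo $(( 1 << (32 - $1) ))
}

prefix_size 18   # prints 16384, i.e. 64 blocks of 256 (64 class C's)
prefix_size 27   # prints 32, plenty of room for six consecutive RIPs
```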

What we ended up doing was dissolving the private subnet entirely.  Each
RS thinks that it's on a /32 (1-host) subnet that contains only itself. 
We forced a routing rule that tells it the default route is to the
virtual gateway on eth0, even though it doesn't have a subnet route for
that gateway.  The RS routing table looks like this:

# netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
                                                UH        0 0          0 eth0
                                                UG        0 0          0 eth0

The gateway of the default (UG) route is the virtual gateway on the
director.  The down side
here is that any communication amongst the RS's gets bounced off the
director.  In our low-volume environment, that's not a problem.  We're
balancing for availability, not throughput.
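
A rough sketch of how a real server could be given this kind of
routing, for anyone who wants to try it.  The addresses are
hypothetical documentation addresses (RFC 5737), not ours, and the
function name is invented.  I'm assuming here that the UH entry is a
host route to the gateway, and I've written it with iproute2 commands
rather than whatever was actually typed at the time.  The function
only prints the commands so they can be reviewed before being run as
root.

```shell
#!/bin/sh
# Print (not run) iproute2 commands for a real server that holds a /32
# address and reaches everything through a single on-link gateway.
# Arguments: RIP, gateway address, interface name.
rs_route_cmds() {
    rip=$1
    gw=$2
    dev=$3
    echo "ip addr add $rip/32 dev $dev"
    # Host route: makes the gateway reachable without a subnet route.
    echo "ip route add $gw/32 dev $dev"
    # Default route through that gateway.
    echo "ip route add default via $gw dev $dev"
}

# Hypothetical example values, not taken from the original setup.
rs_route_cmds 192.0.2.10 192.0.2.1 eth0
```

With only these routes, traffic between two real servers has no direct
path and goes via the gateway on the director, which matches the
bounce described above.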

Does this all make sense?  Are you all cringing yet?  We didn't exactly
plan this layout; it's just where we ended up after we'd fixed all the
problems we encountered along the way.

