To: lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: Reason for using LVS
From: "Matthew S. Crocker" <matthew@xxxxxxxxxxx>
Date: Wed, 30 Oct 2002 07:55:23 -0500 (EST)
> 
> My objective for a realserver outage is:
> 1) client not having to refresh their browser
> 2) client need not reauthenticate (for https connection) 
> 
> What I mean by a realserver outage is (in steps):
> 1) healthcheck polls that realserver1 is up
> 2) then immediately after that realserver1 is down
> 3) director is forwarding new connection to realserver1
> 
> What will happen to that connection?  Will it be lost (in which case my 
> objective 1 is not met)?  Will it be forwarded to another realserver?
> 

The connection will be lost.  The client will send a SYN to the VIP, which
will be forwarded to realserver1.  If that server is completely down and
not responding to any network traffic, the SYN will hit the switch and
disappear, and the client will hang until its connect times out.  If the
server is up but the service (e.g. Apache) is down, the server will get
the SYN and respond with a TCP RST, so the client fails immediately with
"connection refused".
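
From the client side the two failure modes look different.  Here's a
minimal Python sketch (the realserver address is hypothetical): a dead
host times out on the SYN, while a live host with a dead service refuses
outright:

  import socket

  s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
  s.settimeout(3)                       # give up if no SYN/ACK within 3s
  try:
      s.connect(("192.0.2.10", 80))     # hypothetical realserver address
      print("connected: server and service are up")
  except socket.timeout:
      print("timeout: host down, the SYN just disappeared")
  except ConnectionRefusedError:
      print("refused (RST): host up, service down")
  finally:
      s.close()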

> For objective (1), I can't find any other way but to decrease the healthcheck 
> polling time.

Decrease your polling time and increase the number of real servers.

Example:

  100 new connections/sec
  1s  healthcheck
  10  real servers

In the worst case (the server dies just after passing a check) you'll have
10 new connections dropped by the dead real server before the next
healthcheck catches it.  Depending on your real server reliability that
may be an acceptable number.
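
Spelled out as a quick Python sanity check (numbers from the example
above; the worst case assumes the server dies immediately after passing
a check):

  new_conns_per_sec = 100
  check_interval_s  = 1
  real_servers      = 10

  # Each live server gets an equal share of new connections.
  per_server_rate = new_conns_per_sec / real_servers         # 10/s

  # The dead server keeps receiving its share for up to one full
  # healthcheck interval before it is removed from the table.
  worst_case_dropped = per_server_rate * check_interval_s    # 10
  print(worst_case_dropped)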

Personally I wouldn't run the healthcheck at 1s.  I keep mine at 5s.
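
Where you set that interval depends on your healthcheck tool.  A minimal
sketch, assuming keepalived (with ldirectord the equivalent knob is
checkinterval in ldirectord.cf); all addresses are hypothetical:

  virtual_server 192.0.2.10 80 {
      delay_loop 5              # healthcheck interval, in seconds
      lb_algo rr
      lb_kind NAT
      protocol TCP

      real_server 192.168.1.1 80 {
          TCP_CHECK {
              connect_timeout 3
          }
      }
  }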

> For objective (2), would putting PHP applications in a distributed filesystem 
> help?

LVS does NOT move connections from one real server to another.  If a
client has any connection to a real server and that server dies, the
connection is dropped.  The client will need to reconnect to a new real
server.  The reconnection will re-establish SSL.  The authentication part
is up to you.  If you distribute your sessions via a database, the
client should be able to authenticate transparently.  A distributed file
system is not the correct way of sharing the session info between the real
servers.
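
A minimal sketch of the database approach in Python (sqlite3 stands in
for whatever shared database all the real servers can reach; the names
are made up):

  import sqlite3, uuid

  db = sqlite3.connect("sessions.db")   # in practice: a DB all realservers reach
  db.execute("CREATE TABLE IF NOT EXISTS sessions"
             " (token TEXT PRIMARY KEY, user TEXT)")

  def login(user):
      token = uuid.uuid4().hex          # handed to the client as a cookie
      db.execute("INSERT INTO sessions VALUES (?, ?)", (token, user))
      db.commit()
      return token

  def authenticate(token):
      row = db.execute("SELECT user FROM sessions WHERE token = ?",
                       (token,)).fetchone()
      return row[0] if row else None

  # After a realserver dies the client reconnects and lands on another
  # server; the cookie still maps to the session because the DB is shared.
  token = login("alice")
  print(authenticate(token))            # -> alice, from any realserver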

LVS is the best solution for your objectives.  You can build in so much
redundancy that it gets a bit silly: set your healthcheck to 10ms and the
checks themselves will either crash the LVS servers from the load or take
out the real servers like a DoS attack.

My setup, which handles about 1 million e-mail messages/day, is:

2 Cisco routers handling BGP and upstream connections, with HSRP failover
between them.  The routers have a 100Mb interconnect.

2 Cisco 3548 switches connected to the routers (Router A to Switch A,
Router B to Switch B).  The switches have 2x100Mb interconnects.

2 Linux LVS boxes running LVS-NAT (LVS-A on Switch A, LVS-B on Switch B).

4 Linux real servers running qmail, POP3, and IMAP (2 groups of 2 servers:
Group A on Switch A, Group B on Switch B).

Core VLAN has routers + LVS public side + VIPs.
DMZ VLAN has LVS private side + real servers.

LVS A is primary LVS,  LVS B is backup
Router A is primary HSRP,  Router B is backup

Data is stored on a NetApp F720 filer (*ack*, my only single point of
failure).

VLANS are split across both switches.
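
For reference, the LVS-NAT piece of a setup like this reduces to a few
ipvsadm rules.  A sketch with hypothetical addresses and a single SMTP
VIP (-m selects masquerading, i.e. NAT):

  ipvsadm -A -t 203.0.113.10:25 -s rr                  # VIP, round robin
  ipvsadm -a -t 203.0.113.10:25 -r 192.168.1.1:25 -m   # realserver 1
  ipvsadm -a -t 203.0.113.10:25 -r 192.168.1.2:25 -m   # realserver 2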

normal inbound traffic flow is:

Internet -> Router A -> Switch A -> LVS A -> Switch A -> RealServer 1 or 2
                             ^                  +-> Switch B -> RS 3 or 4
                             |                 
*or*                         +---+            
                                 |            
Internet -> Router B -> Switch B +

normal outbound traffic flow:

RS 1 or 2 -> Switch A -> LVS A -> Switch A -> Router A -> Internet
*or*              ^                              |
                  |                              +> Router B -> Internet
                  |                            
RS 3 or 4 -> SW B +
                             
I can currently sustain the failure of any one router, switch or LVS box.
Normal failover is under 5 seconds.  The LVS boxes run lvs_sync so they
share a connection map.  If I lose a switch I end up losing that whole
side of the cluster.
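
On kernels where the connection-state sync daemon is driven by ipvsadm,
starting it looks roughly like this (the interface name is hypothetical):

  # on the primary director
  ipvsadm --start-daemon master --mcast-interface eth0

  # on the backup director
  ipvsadm --start-daemon backup --mcast-interface eth0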

The next level of redundancy would be to install 2 extra NICs in every
machine and cross-connect them to each switch.  The new NICs would be
offline until a switch fails.

I'm happy with this level of redundancy.  If I were to do it again I
would probably go with a PICMG 2.16 chassis with the dual switch matrix
cards.  It would be basically the same schematic but the wiring would be a
lot nicer.

My next big project is to set up two Linux NFS servers with failover,
sharing a Fibre Channel RAID array.  I really LOVE the NetApp filer but I
can't afford to upgrade to two of them with cluster failover.

-Matt

-- 
----------------------------------------------------------------------
Matthew S. Crocker 
Vice President / Internet Division         Email: matthew@xxxxxxxxxxx
Crocker Communications                     Phone: (413) 746-2760
PO BOX 710                                 Fax:   (413) 746-3704
Greenfield, MA 01302-0710                  http://www.crocker.com
----------------------------------------------------------------------


