On average you'll have 10 new connections dropped by the dead real server.
Depending on how reliable your real servers are, that may be an acceptable
number.
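To put a rough number on that: the loss is roughly your new-connection rate
times the time it takes the check to notice the dead server. A back-of-the-envelope
sketch (the rate and interval here are assumptions, not figures from this setup):

  RATE=10        # assumed new connections/sec aimed at that real server
  INTERVAL=1     # healthcheck interval in seconds
  echo "$RATE * $INTERVAL" | bc        # -> ~10 connections lost per failure
  # in practice the check timeout and the director's reload time add to this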
Personally I wouldn't run the healthcheck at 1s. I keep mine at 5s.
This is almost a research topic sometimes; the timings can get extremely
complicated. Sometimes I have to choose even higher intervals, because when
too many VIPs share the same RIPs, the real servers get bombarded with
healthchecks from every VIP in every interval.
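To make that fan-out concrete (numbers invented for illustration): every VIP
runs its own checks, so real servers that sit behind many VIPs see the check
rate multiplied.

  VIPS=40          # assumed number of virtual services
  RS_PER_VIP=4     # assumed real servers behind each VIP (same physical boxes)
  INTERVAL=5       # healthcheck interval in seconds
  echo "$VIPS * $RS_PER_VIP / $INTERVAL" | bc   # -> 32 checks/sec on the pool
  # at a 1s interval that would be 160 checks/sec, hence the longer intervals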
LVS is the best solution for your objectives. You can build in so much
redundancy that it gets a bit silly: set your healthcheck to 10ms and the
LVS servers will either crash from the load or take out the real servers
with what amounts to a DoS attack.
Exactly. Also as I mentioned before, timing is very complex. You have to
consider failover/failback time, upgrade time, SLA-related service windows,
and ...
My setup, which handles about 1 million e-mail messages/day, is:
Impressive.
2 Cisco routers handling BGP and upstream connections, with HSRP failover
between them. Routers have a 100Mb interconnect.
2 Cisco 3548 switches connect to the routers (Router A to Switch A, Router
B to Switch B). Switches have 2x100Mb interconnects.
2 Linux LVS boxes running LVS-NAT (LVS-A on Switch A, LVS-B on Switch B).
4 Linux real servers running qmail, POP3 and IMAP (2 groups of 2 servers;
Group A on Switch A, Group B on Switch B).
Core VLAN has routers + LVS public side + VIPs.
DMZ VLAN has LVS private side + real servers.
LVS A is the primary LVS, LVS B is the backup.
Router A is primary HSRP, Router B is backup.
Data is stored on a Netfiler F720 *ack* my only single point of failure.
VLANs are split across both switches.
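For anyone reading along, the forwarding part of an LVS-NAT setup like this
boils down to a handful of ipvsadm rules. This is only a sketch with placeholder
addresses and a single SMTP service, not the actual config of the setup above:

  VIP=192.0.2.25                                  # placeholder public VIP
  echo 1 > /proc/sys/net/ipv4/ip_forward          # the director must forward for LVS-NAT
  ipvsadm -A -t $VIP:25 -s wlc                    # SMTP virtual service, weighted least-connection
  ipvsadm -a -t $VIP:25 -r 10.0.0.11:25 -m -w 1   # RS 1 on the DMZ VLAN, -m = NAT/masquerading
  ipvsadm -a -t $VIP:25 -r 10.0.0.12:25 -m -w 1   # RS 2
  ipvsadm -a -t $VIP:25 -r 10.0.0.13:25 -m -w 1   # RS 3
  ipvsadm -a -t $VIP:25 -r 10.0.0.14:25 -m -w 1   # RS 4
  # with LVS-NAT the real servers use the director's DMZ address as default gateway

POP3 and IMAP get their own virtual services the same way; the healthchecking
and the LVS A / LVS B failover come from whatever you run on top of this
(ldirectord, keepalived, mon, ...).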
Would you mind (only if you find the time) writing up a document on how to
achieve this setup?
normal inbound traffic flow is:
Internet -> Router A -> Switch A -> LVS A -> Switch A -> RealServer 1 or 2
                                                    +--> Switch B -> RS 3 or 4
   *or*
Internet -> Router B -> Switch B -> Switch A -> LVS A -> (then as above)
normal outbound traffic flow:
RS 1 or 2 -> Switch A -> LVS A -> Switch A -> Router A -> Internet
                                        +---> Router B -> Internet
   *or*
RS 3 or 4 -> Switch B -> Switch A -> LVS A -> (then as above)
I've got almost the same setup in one of our datacenters.
The next level of redundancy would be to install 2 extra NICs in every
machine and cross-connect them to each switch. The new NICs would be offline
until a switch fails.
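One way to get that "offline until a switch fails" behaviour without
hand-rolled scripts is the kernel bonding driver in active-backup mode; a
rough sketch (interface names and the address are placeholders):

  modprobe bonding mode=1 miimon=100      # mode 1 = active-backup, poll link every 100 ms
  ifconfig bond0 192.0.2.10 netmask 255.255.255.0 up
  ifenslave bond0 eth0 eth1               # eth0 cabled to switch A, eth1 to switch B
  # only the active slave carries traffic; on link loss the backup slave takes over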
I have done some testing with this and I didn't get it to work really well. But
maybe I should spend some more time on it.
I'm happy with this level of redundancy. If I was going to do it again I
would probably go with a PICMG 2.16 chassis with the dual switch matrix
cards. It would be basically the same schematic but the wiring would be a
lot nicer.
Any pointers?
My next big project is to set up two Linux NFS servers with failover, sharing
a Fibre Channel RAID array. I really LOVE the netfiler but I can't afford
to upgrade to two of them with cluster failover.
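In case it helps with planning: the usual way to do that is heartbeat moving
a service IP, the filesystem mount and the NFS daemons between the two boxes.
A sketch only, with placeholder device, mount point and init script name (the
NFS script name differs between distributions):

  # /etc/ha.d/haresources, identical on both nodes
  nfs-a  IPaddr::192.0.2.50  Filesystem::/dev/sdb1::/export::ext3  nfs

With shared Fibre Channel storage you also want some kind of fencing/STONITH
so both nodes can never mount the array at the same time.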
Thank you for this insight,
Roberto Nibali, ratz
--
echo '[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq' | dc