I know my California example is a bit extreme, but I
wanted to be sure we were talking about a complete
datacenter outage. If I had a dedicated cabinet with
one CAT5 cable running to it, and some 3rd-shift
network engineer with too much coffee in their blood
knocks my feed from the datacenter switch, I consider
that a "data center outage" for our discussion
purposes. Or what if a core router starts spewing out
faulty route broadcasts that quickly spread and
corrupt the routes of member routers ... effectively
crippling a network (or internet backbone) for hours?
Human error is very real. I'm sure hundreds of
*realistic* scenarios could be thought of to justify
off-site redundancy, so let's just move on.
I mentioned VRRP to stress the desire for
near-instantaneous failover ... minimizing the amount
of downtime a client accessing your site may
experience.
It should all be transparent as far as they're
concerned. Obviously, without considerable expense,
this is not achievable. It looks like the best
solution for this scenario is one using DNS with low
(30 sec?) TTL values. It won't immediately fail over
your services, but it may reduce your downtime from
hours to [several] minutes should some major outage
occur at your primary datacenter. Nick, I agree with
you that DNS solutions are less than ideal,
considering there are so many factors out of your
control like caching DNS servers that ignore your TTL
values, but it seems to be the only solution for
cost-conscious companies forced to provide three or
more 9's of service to their clients.
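
To put rough numbers on the low-TTL idea, here is a minimal sketch in Python. Everything in it is my own assumption for illustration, not something from a real product: the function names, the 10-second check interval, and the "3 failed checks before switching" rule. It only estimates how long a well-behaved client might keep hitting the dead datacenter, and it says nothing about caching servers that ignore the TTL.

```python
import socket


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def choose_address(primary_ip, backup_ip, consecutive_failures, failures_needed=3):
    """Pick the address the DNS record should point at (hypothetical policy)."""
    if consecutive_failures >= failures_needed:
        return backup_ip
    return primary_ip


def worst_case_failover_seconds(ttl, check_interval, failures_needed=3, update_delay=0):
    """Detection time + time to publish the new record + one full TTL.

    Assumes every resolver honors the TTL, which is not always the case.
    """
    return check_interval * failures_needed + update_delay + ttl


def downtime_budget_minutes_per_year(nines):
    """Allowed downtime per year for a given number of 9's (3 means 99.9%)."""
    return 365 * 24 * 60 / 10 ** nines
```

With a 30-second TTL, a check every 10 seconds, and 3 failures required, worst_case_failover_seconds(30, 10) comes to 60 seconds, while three 9's leaves a budget of about 525.6 minutes per year. So even a few DNS-based cutovers a year would fit, as long as the misbehaving caches stay a minority.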
Those of you out there who have ever supplied HA
services (SOAP Web Services in our case) to
Fortune 500, Fortune 100, etc. level companies know the
importance of redundant facilities in your service
offering or RFP replies. You won't make it through
their due-diligence process without it.
I want to thank all who have contributed to this
thread (on and off list) and acted as sounding boards
for my discussion. I felt a "Global" or "Datacenter-
level" failover solution hadn't been discussed in
enough detail in any online forum I'd found, and the
LVS group seemed to be the perfect one.
Thanks!
-Ken
--- nick garratt <nick-lvs@xxxxxxxxxxxxxx> wrote:
> Well a State falling off the map is hardly a failure situation that
> makes sense building in 60s minimum latency cutover for. What if the
> United States fell off the map ? What if the map ceased to exist ?
>
> Keeping your DNS TTLs really low can help you somewhat in this
> situation, although they certainly cannot be set to not cache at all.
> Also I have also encountered DNS servers that do not correctly
> observe these settings. You have no control over all the intermediate
> name servers that might be caching your DNS records and thus is not
> suited to low latency failover.
>
> VRRP is basically IP failover through election and is not relevant to
> this discussion.
>
> It seems to me the only satisfactory solution is for you to apply for
> your own autonomous system (at considerable cost) which will allow
> you full control of your BGP data. It will be possible with
> cooperation for other AS admins to ensure substantial route
> redundancy and rapid cutovers should you lose a
> datacentre/state/continent :)
_______________________________________________
LinuxVirtualServer.org mailing list - lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Send requests to lvs-users-request@xxxxxxxxxxxxxxxxxxxxxx
or go to http://www.in-addr.de/mailman/listinfo/lvs-users