
Re: Geographically separated load balancers?

To: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: Geographically separated load balancers?
From: Josh Marshall <josh@xxxxxxxxxxxxxxxx>
Date: Wed, 22 Nov 2006 09:17:24 +1000

I don't have the co-operation of our uplinks so
I fake the BGP and with a few scripts it also handles failover to one site. My employer's site www.worldhosting.org is handled this way.


Explain in more detail how you got this working and the scripts used.

First you have to run a patched version of bind9 (I have debian packages for anyone who needs them) - get the source from http://www.supersparrow.org/

Or add the following to your /etc/apt/sources.list for my supersparrow and patched bind9 packages

deb http://debian.worldhosting.org/supersparrow sarge main

(woody packages also available, replace sarge with woody)

Create in your bind config something like:

zone "www.worldhosting.org" {
       type master;
database "ss --host 127.0.0.1 --route_server ssrs --password XXXX --debug --peer 64600=210.18.215.100,64601=193.173.27.8 --self 193.173.27.8 --port 7777 --result_count 1 --soa_host ns.worldhosting.org. --soa_email hostmaster.worldhosting.org. --ns ns.worldhosting.org. --ns ns.au.worldhosting.org. --ttl 7 --ns_ttl 60";
};

This snippet makes www resolve to 210.18.215.100 when the peer is 64600 and to 193.173.27.8 when the peer is 64601. The TTL for the A record is 7 seconds (--ttl) and the TTL for the NS records is 60 seconds (--ns_ttl). The --self address is the default response for this nameserver; on the secondary nameserver, set it to the other address. Set the password to the same value as in /etc/supersparrow.conf.

Create three files to describe the routes in normal and failed modes. In our setup:

$ cat ssrs.routes.AUonly
0.0.0.0/0       64600
$ cat ssrs.routes.NLonly
0.0.0.0/0       64601
$ head ssrs.routes.normal
128.184.0.0/16  64600
128.250.0.0/16  64600
129.78.0.0/16   64600
129.94.0.0/16   64600
129.96.0.0/16   64600
129.127.0.0/16  64600
129.180.0.0/16  64600
130.56.0.0/16   64600
130.95.0.0/16   64600
130.102.0.0/16  64600

The ssrs.routes.normal file contains all the subnets you wish to force to use the respective peer.
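
Since a typo in one of these files would only show up once supersparrow reloads it, a quick format check can help. The following is a minimal sketch, assuming each non-empty line has the form "network/prefix peer" as in the listings above; check_routes is a made-up helper name, not part of supersparrow.

```shell
#!/bin/sh
# Hypothetical helper (not part of supersparrow): report lines of a routes
# file that do not look like "<a.b.c.d/len> <peer number>".
# Returns non-zero if any malformed line is found.
check_routes() {
    awk '
        NF == 0 { next }                       # skip blank lines
        NF != 2 ||
        $1 !~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+\/[0-9]+$/ ||
        $2 !~ /^[0-9]+$/ {
            printf "line %d looks malformed: %s\n", NR, $0
            bad = 1
        }
        END { exit bad }
    ' "$1"
}
```

For example, "check_routes /var/named/supersparrow/ssrs.routes.normal" prints nothing and returns 0 when every line parses. It only checks the shape of each line, not whether the prefix or peer number makes sense.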

Create a script that runs an HTTP test periodically (we run it every 5 minutes, as the web servers don't go down frequently). If both sites work, symlink ssrs.routes.normal to /etc/ssrs.routes. If only one works, symlink the file for the working site (i.e. AUonly or NLonly) to /etc/ssrs.routes. Then check whether the config has changed and, if so, reload supersparrow. I use the check_http script from the nagios package to do the test. See below for my script:

----------------

#!/bin/sh

PATH=/sbin:$PATH
# Supersparrow results
SSNORMAL=0
SSAUONLY=1
SSNLONLY=2

AUIP=210.18.215.100
NLIP=193.173.27.8

AUW=0
NLW=0

#ping -c 2 $AUIP >/dev/null && AUP=1
#ping -c 2 $NLIP >/dev/null && NLP=1

/sbin/check_http -H $NLIP -u /index.html -p 80 -t 20 >/dev/null && NLW=1
/sbin/check_http -H $AUIP -u /index.html -p 80 -t 20 >/dev/null && AUW=1

# Do the tests again in case there was a hiccup;
# a site counts as up if either attempt succeeds

/sbin/check_http -H $NLIP -u /index.html -p 80 -t 20 >/dev/null && NLW=1
/sbin/check_http -H $AUIP -u /index.html -p 80 -t 20 >/dev/null && AUW=1

if [ $NLW -eq 1 ]
then
       if [ $AUW -eq 1 ]
       then
               OPMODE="Normal Operation"
               SPARROW=$SSNORMAL
       else
               OPMODE="NL running but AU down"
               SPARROW=$SSNLONLY
       fi
else
       if [ $AUW -eq 1 ]
       then
               OPMODE="AU running but NL down"
               SPARROW=$SSAUONLY
       else
               OPMODE="AU and NL down"
               SPARROW=$SSNORMAL
       fi
fi

if [ $SPARROW -eq $SSNORMAL ]
then
       ln -sf /var/named/supersparrow/ssrs.routes.normal /etc/ssrs.routes
fi

if [ $SPARROW -eq $SSAUONLY ]
then
       ln -sf /var/named/supersparrow/ssrs.routes.AUonly /etc/ssrs.routes
fi
if [ $SPARROW -eq $SSNLONLY ]
then
       ln -sf /var/named/supersparrow/ssrs.routes.NLonly /etc/ssrs.routes
fi

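# Nothing to do if the routes file still matches the recorded checksum
md5sum -c /etc/ssrs.routes.md5sum >/dev/null 2>&1 && exit
# Routes changed (or no checksum recorded yet): reload and record the new state
/etc/init.d/supersparrow reload
md5sum /etc/ssrs.routes > /etc/ssrs.routes.md5sum
echo "Supersparrow: $OPMODE"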

-------------
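
The change detection at the end of the script can be tried out in isolation. This is a minimal sketch of the same md5sum idea, assuming GNU md5sum; the function names routes_changed and record_routes are made up for illustration and do not appear in the script above.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the checksum-based change detection.

# Succeeds (returns 0) when the file named in the checksum file no longer
# matches its recorded checksum, or when no checksum was recorded yet.
routes_changed() {
    if md5sum -c "$1" >/dev/null 2>&1; then
        return 1    # checksum matches: routes unchanged
    fi
    return 0
}

# Record the current checksum of the routes file ($1) in the checksum file ($2).
# md5sum follows the symlink, so the hash is of whichever file it points to.
record_routes() {
    md5sum "$1" > "$2"
}
```

Because md5sum hashes the file the symlink points to, swapping /etc/ssrs.routes from one routes file to another makes the check fail, and that is what triggers the reload.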

With a DNS server at each location, if there is an international routing problem that prevents them from communicating with each other, then each server will set all responses to point the www at its local hosting location. Any sites on the net that can reach that DNS server will then use the www that is there (and therefore have a high chance of working).
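
To make the partition case concrete, the mode selection can be written as a small function. This is a sketch that mirrors the if/else logic of the script, assuming the check script runs on the nameserver at each site; choose_routes is a made-up name.

```shell
#!/bin/sh
# Hypothetical restatement of the script's mode selection.
# Arguments: NL check result, AU check result (1 = up, 0 = down).
# Prints the suffix of the routes file to symlink to /etc/ssrs.routes.
choose_routes() {
    case "$1$2" in
        10) echo NLonly ;;   # NL reachable, AU not
        01) echo AUonly ;;   # AU reachable, NL not
        *)  echo normal ;;   # both up, or both down
    esac
}
```

During a partition, the NL nameserver sees its own site up and AU down, so "choose_routes 1 0" selects NLonly there, while the AU nameserver sees the reverse and selects AUonly.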

Please let me know if there's anything I've missed. This might be worth going into some kind of HOWTO somewhere.

Regards,
Josh.
