To: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: Choosing distributed filesystem (just how far off topic are we now ??)
From: "John Barrett" <jbarrett@xxxxxx>
Date: Tue, 4 Nov 2003 14:51:32 -0500
I just did a little research -- if you use replicated mysql servers, this
script may provide a starting point for a much simplified LVS-DR
implementation: http://dlabs.4t2.com/pro/dexter/dexter-info.html -- the
script is designed to layer additional functionality on top of a single
mysql server, but from the quick glance I took it could be extended to
route queries to multiple servers based on the type of query, i.e. read
requests to the local mysql instance and write requests to a mysql
instance on another server.

Set up one mysql master server. It will not be part of the mysql cluster;
its sole task is to handle insert/update traffic and replicate those
requests to the slave servers.
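
The master's my.cnf needs binary logging and a unique server id --
something like this (values are only illustrative):

    # my.cnf on the master
    [mysqld]
    server-id = 1           # must be unique across master and slaves
    log-bin   = mysql-bin   # the binary log feeds the slaves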

Set up any number of replicated mysql slaves -- they should bind the VIP
on the "lo" interface, and the kernel should have the hidden ip patch
(ultramonkey kernels, for instance).
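
On each slave that means something along these lines (the VIP
192.168.1.100 is made up, and the "hidden" sysctl only exists on kernels
carrying the hidden ip patch):

    # bind the VIP to lo with a host netmask
    ifconfig lo:0 192.168.1.100 netmask 255.255.255.255 up
    # hide lo from arp so the slave never answers arp for the VIP
    echo 1 > /proc/sys/net/ipv4/conf/all/hidden
    echo 1 > /proc/sys/net/ipv4/conf/lo/hidden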

Modify dexter to intercept insert/update requests and redirect them to the
master server (which means keeping two mysql sessions open -- one to the
master, one to the local slave instance). If the master isn't there, fail
the update. Install the modified script on all the slave servers.
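
I haven't written that modification, but to make the idea concrete, here
is a rough sketch of the routing logic in python (the hostnames,
credentials, and use of the MySQLdb module are my assumptions, not
anything dexter provides):

    # sketch: send writes to the master, everything else to the local slave
    import MySQLdb

    WRITE_VERBS = ('insert', 'update', 'delete', 'replace')

    try:
        master = MySQLdb.connect(host='mysql-master', user='app',
                                 passwd='secret', db='mydb')
    except MySQLdb.OperationalError:
        master = None   # master down -- writes must fail

    slave = MySQLdb.connect(host='localhost', user='app',
                            passwd='secret', db='mydb')

    def route(query):
        verb = query.lstrip().split(None, 1)[0].lower()
        if verb in WRITE_VERBS:
            if master is None:
                raise RuntimeError('master unavailable, refusing the write')
            conn = master
        else:
            conn = slave
        cur = conn.cursor()
        cur.execute(query)
        return cur.fetchall()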

Set up the VIP on an ldirectord box and add all the slaves as targets --
since mysql connections can be long-running, I suggest least-connection
scheduling.
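
The ldirectord.cf fragment for that might look like this (addresses
invented; "gate" selects LVS-DR forwarding and "lc" the least-connection
scheduler):

    virtual=192.168.1.100:3306
            real=192.168.1.11:3306 gate
            real=192.168.1.12:3306 gate
            scheduler=lc
            protocol=tcp
            checktype=connect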

Now clients can connect to the VIP, the slaves handle all read accesses, and
the master server handles all writes

The only possible issue with this setup is allowing for propagation delays
after an insert/update -- i.e. you won't be able to read the new data back
instantly at the slaves; it may take a second or two before the slaves have
the data. Your code can loop, querying the database to see if the update
has completed, if absolutely necessary.
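
e.g. a tiny helper like this (pure illustration -- "cur" is a cursor on
the slave connection, and the query is whatever identifies your new row):

    import time

    def wait_for_row(cur, query, retries=10, delay=0.5):
        # poll the slave until the replicated row shows up (or give up)
        for _ in range(retries):
            cur.execute(query)
            if cur.fetchone() is not None:
                return True
            time.sleep(delay)
        return False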

You still have a single point of failure for database updates, but your
database is always backed up on the slaves, and because there is only one
point of update, it is very difficult for the slaves to get out of sync and
read access is very unlikely to fail. You also avoid the problems with
database locking over NFS, which is still not recommended judging by the
lists I scanned to get up to date on the issues.

Lastly -- the master CAN be a read server as well if you wish (my setup
above assumes it is not), provided the update load is not so heavy that the
master gets overloaded. If you have a high update frequency, let the slaves
handle all the reads and the master handle only the updates.


----- Original Message ----- 
From: "John Barrett" <jbarrett@xxxxxx>
To: "LinuxVirtualServer.org users mailing list."
<lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Sent: Tuesday, November 04, 2003 11:11 AM
Subject: Re: Choosing distributed filesystem (just how far off topic are
we now ??)


>
> >
> > Thanks for your information John! It's very detailed.
> >
> > I feel rather unsatisfied with the result now. It's said (in many
> > articles) that Coda is better than NFS, especially in replication and
> > directory naming consistency. Yeah, what a world.
> >
> > Maybe I should start setting up NFS now.
> >
> > BTW John, may I ask you some questions? Do you use your NFS for
> > database applications (along with LVS)? I'm planning to set up a
> > distributed filesystem (whatever it will be) as the MySQL back-end for
> > LVS-ed Apache servers. AFAIK, NFS has difficulty with directory naming
> > consistency. Is it an obstacle in your case?
> >
>
> nfs directory naming has not been an issue for me in the least -- I
> always mount nfs volumes as /nfs[n] with subdirs in each volume for
> specific data -- then symlink from where the data should be to the nfs
> volume -- the same thing you will have to do with coda -- in either case
> the key is planning your nfs/coda setup in advance so that you don't have
> issues with directory layouts, by keeping the mountpoints and symlinks
> consistent across all the machines.
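
To make that layout concrete, something like this (server name and paths
invented):

    mount nfsserver:/export/data /nfs1        # fixed mountpoint
    ln -s /nfs1/htdocs /var/www/html          # app keeps its usual path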
>
> I'm not currently doing a replicated database -- I'm relying on
> raid5+hotswap and frequent database backups to another server using
> mysqldump+bacula. Based on my reading, mysql+nfs is not a very good idea
> in any case -- mysql has file locking support, but it really slows things
> down because locking granularity is at the file level (or was the last
> time I looked into it -- over a year ago -- please check if there have
> been any improvements).
>
> based on my current understanding of the state of the art with mysql,
> your best bet is to use mysql replication and have a failover server used
> primarily for read-only access until the read/write server fails (if you
> need additional query capacity) (the ldirectord solution), or to do
> strict failover (the heartbeat solution), with only one server active at
> a time, both writing replication logs, and the inactive one reading the
> logs from the active whenever both machines are up (some jumping through
> hoops is needed to get mysql to start up both ways -- possibly different
> my.cnf files based on which server is active)
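
Roughly, each box would carry two my.cnf variants and the failover
machinery would pick one -- all values here are only illustrative:

    # my.cnf.active -- this box is currently the master
    [mysqld]
    server-id = 1
    log-bin   = mysql-bin

    # my.cnf.standby -- this box replicates from the peer
    [mysqld]
    server-id   = 2
    log-bin     = mysql-bin   # keep logging so the roles can swap
    master-host = peer-db     # 3.23/4.0-era replication setting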
>
> with either setup the worst case scenario is: one machine goes down,
> then the other goes down some period of time later after getting many
> updates, then the out-of-sync server comes up first without all those
> updates
>
> (interesting thought just now -- use coda to store the replication logs,
> replicating the coda volume on both the database servers and a 3rd server
> for additional protection; the 3rd server may or may not run mysql,
> depending on whether you want a "tell me 3 times" setup -- then you just
> have to keep a file flag recording which server was most recently active,
> and any server becoming active can easily check if it needs to merge the
> replication logs -- but we are going way beyond anything I have ever
> actually done here, pure theory based on old research)
>
> in either case you are going to have to very carefully test that your
> replication config recovers gracefully from failover/failback scenarios
>
> my current cluster design concept, which I'm planning to build next
> week, might give you some ideas on where to go (everything running
> ultramonkey kernels, in my case on RH9, with software raid1 ide boot
> drives and 3ware raid5 for main data storage):
>
> M 1 -- midrange system with a large ide raid on a 3ware controller,
> raid5+hotswap, coda SCM, and the bacula network backup software,
> configured to back up directly to disk for the first stage and then, as a
> 2nd stage, back up the bacula volumes on disk to tape (this allows fast
> restores from disk and protects the raid array in case of catastrophic
> failure)
>
> M 2&3 -- heartbeat+ldirectord load balancers in LVS-DR mode -- anything
> that needs load balancing, web, mysql, etc, goes through here (if you are
> doing mysql with one read/write + multiple read-only servers, the
> read-only access goes through here, write access goes direct to the mysql
> server, and you will need heartbeat on those servers to control
> possession of the mysql write ip -- and of course your scripts will need
> to open separate mysql connections for reading and writing)
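
With plain heartbeat, possession of the write ip can be a single
haresources line, e.g. (node name, address, and the mysql resource script
are placeholders):

    # /etc/ha.d/haresources -- db1 preferentially holds the mysql write ip
    db1 IPaddr::192.168.1.50/24/eth0 mysql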
>
> M 4 -- mysql server, ide raid + hotswap -- I'm not doing replication,
> but we already discussed that one :)
>
> then as many web/application servers as you need to do the job :) each
> one is also a replica coda server for the webspace volume, giving
> replicated protection of the webspace and accessibility even if the SCM
> and all the other webservers are down -- you may want multiple dedicated
> coda servers if your webspace volume is large enough that keeping a
> replica on each webserver would be prohibitively expensive
>
> _______________________________________________
> LinuxVirtualServer.org mailing list - lvs-users@xxxxxxxxxxxxxxxxxxxxxx
> Send requests to lvs-users-request@xxxxxxxxxxxxxxxxxxxxxx
> or go to http://www.in-addr.de/mailman/listinfo/lvs-users
