To: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: Choosing distributed filesystem (just how far off topic are we now ??)
From: "John Barrett" <jbarrett@xxxxxx>
Date: Tue, 4 Nov 2003 11:11:08 -0500
>
> Thanks for your information, John! It's very detailed.
>
> I feel rather unsatisfied with the result now. It's said (in many
> articles) that Coda is better than NFS, especially in replication and
> directory naming consistency. Yeah, what a world.
>
> Maybe I should start setting up NFS now.
>
> BTW John, may I ask you some questions? Do you use your NFS for database
> application (along with LVS)? I'm planning to set up a distributed
> filesystem (whatever it will be) as MySQL back-end for LVS-ed Apache
> servers. AFAIK, NFS has difficulty with directory naming consistency.
> Is that an obstacle in your case?
>

NFS directory naming has not been an issue for me in the least -- I always
mount NFS volumes as /nfs[n], with subdirs in each volume for specific
data, then symlink from where the data should be to the NFS volume -- the
same thing you would have to do with Coda. In either case the key is
planning your NFS/Coda setup in advance, so that you don't have issues with
directory layouts, by keeping the mountpoints and symlinks consistent across
all the machines.
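
To make that concrete, the convention looks something like this on every
machine (server name and paths are made up for illustration):

    # same mountpoint on every machine -- the volume number, not the
    # purpose, names the mount
    mount -t nfs nfsserver:/export/vol1 /nfs1

    # symlink from where each application expects its data to live
    ln -s /nfs1/htdocs /var/www/html
    ln -s /nfs1/maildirs /var/spool/mail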

I'm not currently doing a replicated database -- I'm relying on
raid5+hotswap and frequent database backups to another server using
mysqldump+bacula. Based on my reading, mysql+nfs is not a very good idea in
any case -- mysql has file locking support, but it really slows things down
because locking granularity is at the file level (or it was the last time I
looked into it, over a year ago -- please check whether there have been any
improvements).
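
For what it's worth, the first-stage dump is nothing fancy -- roughly the
following (path and flags illustrative), with bacula then picking up the
dump file as an ordinary file:

    # nightly cron job on the database server
    mysqldump --opt --all-databases > /var/backups/mysql/all-`date +%Y%m%d`.sql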

Based on my current understanding of the art with mysql, your best bet is to
use mysql replication, in one of two forms: if you need additional query
capacity, have a failover server used primarily for read-only queries until
the read/write server fails (the ldirectord solution); or do strict failover
(the heartbeat solution), with only one server active at a time, both
writing replication logs, and the inactive one reading the logs from the
active whenever both machines are up (some jumping through hoops is needed
to get mysql to start up both ways -- possibly different my.cnf files
depending on which server is active).
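
As a sketch of the different-my.cnf idea (server-ids, address, and
credentials are all hypothetical), the failover scripts would put the
fragment matching the server's current role in place before starting mysqld:

    # /etc/my.cnf.active -- this server takes writes and logs them
    [mysqld]
    server-id = 1
    log-bin

    # /etc/my.cnf.standby -- this server pulls the active server's logs
    [mysqld]
    server-id = 2
    master-host     = 10.0.0.201
    master-user     = repl
    master-password = secret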

With either setup, the worst-case scenario is that one machine goes down,
the other goes down some period of time later after taking many updates, and
then the out-of-sync server comes back up first, without all those updates.

(Interesting thought just now: use coda to store the replication logs,
replicating the coda volume on both database servers and a 3rd server for
additional protection -- the 3rd server may or may not run mysql, your
choice if you want a "tell me 3 times" setup. Then you just keep a file flag
that records which server was most recently active, and any server becoming
active can easily check whether it needs to merge the replication logs. But
we are going way beyond anything I have ever actually done here -- pure
theory based on old research.)
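
In the same pure-theory spirit, the startup check could be as simple as this
sketch (flag path and merge script are hypothetical):

    #!/bin/sh
    # runs before mysqld starts; the flag file lives on the replicated
    # coda volume shared by all the database servers
    FLAG=/coda/mysql-logs/last-active
    ME=`hostname`

    if [ -f $FLAG ] && [ "`cat $FLAG`" != "$ME" ]; then
        # the peer was active more recently -- merge its replication
        # logs (site-specific procedure) before taking over
        /usr/local/sbin/merge-replication-logs
    fi
    echo $ME > $FLAG
    # ...then start mysqld...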

In either case, you are going to have to test very carefully that your
replication config recovers gracefully from failover/failback scenarios.

My current cluster design concept, which I'm planning to build next week,
might give you some ideas on where to go (everything running ultramonkey
kernels, in my case on RH9, with software raid1 ide boot drives and 3ware
raid5 for main data storage):

M 1 -- midrange system with large ide raid on a 3ware controller,
raid5+hotswap, coda SCM, and bacula network backup software, configured to
back up directly to disk for the first stage, then as a second stage back up
the bacula volumes on disk to tape (this allows fast restores from disk, and
protects against catastrophic failure of the raid array).

M 2&3 -- heartbeat+ldirectord load balancers in LVS-DR mode -- anything that
needs load balancing (web, mysql, etc.) goes through here. If you are doing
mysql with one read/write server plus multiple read-only servers, the
read-only access goes through here, while write access goes directly to the
mysql server; you will need heartbeat on those servers to control possession
of the mysql write ip, and of course your scripts will need to open separate
mysql connections for reading and writing.
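
For the read-only side, the ldirectord.cf entry would look roughly like this
(addresses and credentials made up) -- ldirectord's mysql check logs in to
each real server and runs the request query to verify it is alive:

    virtual=10.0.0.100:3306
            real=10.0.0.11:3306 gate
            real=10.0.0.12:3306 gate
            service=mysql
            checktype=negotiate
            login="monitor"
            passwd="secret"
            database="test"
            request="SELECT 1"
            scheduler=wlc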

M 4 -- mysql server, ide raid + hotswap -- I'm not doing replication, but we
already discussed that one :)

Then as many web/application servers as you need to do the job :) Each one
is also a replica coda server for the webspace volume, giving replicated
protection of the webspace and accessibility even if the SCM and all the
other webservers are down. You may want multiple dedicated coda servers if
your webspace volume is large enough that keeping a replica on each
webserver would be prohibitively expensive.
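
One sanity check once that is running: coda's client-side cfs tool will show
you which servers actually hold replicas of a given path (mountpoint
illustrative):

    cfs whereis /coda/webspace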
