Lorn Kay <lorn_kay@xxxxxxxxxxx> said:
>
> Other list members may have a comment for you about DRBD or GFS, or a
> shared SCSI disk. I wonder though if you couldn't use rsync to do a
> reasonable job of syncing the two NAS servers (much like a NAS vendor's
> snapshot software), and perhaps run heartbeat between them so that only
> one of them is active at a time (only one of them owns the NFS server
> resource), if your data isn't changing too rapidly.
I'm currently fixing a cluster for a customer. The original guy (who seems to
have faded away after getting paid, but never got the system working
properly) set up DRBD 0.4.9 and never did get it operating. I took a good hard
look at it, and it's just way too early in its development cycle for me to
use in a production system.
As for GFS, these entries in the latest Changelog stopped me from using it:
"o GFS will currently panic if it encounters I/O errors. Code will be added
in the future to allow the GFS node to remain functional and have the
filesystem cleanly shutdown on errors rather than panicking the entire
system.
o Programs which scan the entire filesystem are prohibitively slow.
The most common example of this used to be the "df" command.
"df" performance has been dramatically improved via the use of LVBs so
that is no longer true. Unfortunately, other commands such as "du"
still are sub-optimal and work is underway to determine what can be
done to improve their performance.
The basic problem here is that local filesystems (Ext2FS) or server based
filesystems (NFS) can easily cache changes to the filesystem and maintain a
single coherent snapshot of the state of the filesystem. GFS on the other
hand must either build a coherent snapshot by querying every node in the
GFS cluster and summarizing this information or walk the entire filesystem
to scan the meta-data. Due to this current limitation it is recommended
that GFS partitions be excluded from scans which build the "locate"
database. Solutions to these problems are being investigated (i.e.
``fuzzy'' du/df -- will be much faster but not 100% accurate -- within 5%).
o GFS does not support quotas currently. This is a hard problem and is
similar in nature to the ``du''/``df'' issue discussed earlier.
Methods of distributed and scalable cluster-wide quota support are being
investigated (i.e. ``fuzzy'' quotas)."
I haven't tested it to see if rsync has the same problems as du/df, but I
wouldn't bet against it. Admittedly these problems really relate to using GFS
as a distributed filesystem, with different data on different nodes, not
specifically to using it as a simple "network RAID 1".
For the record, I've also taken a look at Coda, OpenAFS and a few others that
slip my mind at the moment, and for various reasons have found them all
lacking in some significant way or another. I've essentially concluded that
there is no "commercial quality" distributed filesystem for Linux. At this
point, I'm running a setup with two servers using SGI's XFS and rsync. I'm
still in the lab playing with various methods of monitoring/failover.
Even monitoring NFS for availability in order to execute a failover has been
a problem (hence my earlier question on the list). The Linux FailSafe project
looks somewhat promising in this area, but I haven't had a chance to try it
yet.
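The sort of check I've been experimenting with is along these lines (the host
name is a placeholder, and a real setup would need retries and some kind of
fencing before taking over):

```shell
#!/bin/sh
# Sketch of an NFS liveness probe. Unlike ping, asking the portmapper
# whether the NFS program answers catches a hung nfsd, not just a dead
# host. SERVER and the takeover action are placeholders.

SERVER=nas1

nfs_alive() {
    # rpcinfo -u <host> nfs makes a null-procedure call to the NFS
    # service (RPC program 100003) over UDP and fails if it gets no
    # answer.
    rpcinfo -u "$1" nfs >/dev/null 2>&1
}

if ! nfs_alive "$SERVER"; then
    echo "NFS on $SERVER not responding, failing over" >&2
    # run the site-specific takeover action here
fi
```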
-=dwh=-
________________________________________________________________
http://www.OpenRecording.com For musicians by musicians.
Now with free Web-Based email too!