Lorn Kay <lorn_kay@xxxxxxxxxxx> said:
>
> Other list members may have a comment for you about DRBD or GFS, or a
> shared SCSI disk. I wonder though if you couldn't use rsync to do a
> reasonable job of syncing the two NAS servers (much like a NAS vendor's
> snapshot software), and perhaps run heartbeat between them so that only
> one of them is active at a time (only one of them owns the NFS server
> resource), if your data isn't changing too rapidly.
I'm currently fixing a cluster for a customer. The original guy (who seems to
have faded away after getting paid, but never got the system working
properly) set up DRBD 0.4.9 and never did get it operating. I took a good hard
look at it, and it's just way too early in its development cycle for me to
use in a production system.
As for GFS, these entries in the latest Changelog stopped me from using it:
"o GFS will currently panic if it encounters I/O errors. Code will be added
in the future to allow the GFS node to remain functional and have the
filesystem cleanly shutdown on errors rather than panicking the entire
system.
o Programs which scan the entire filesystem are prohibitively slow.
The most common example of this used to be the "df" command.
"df" performance has been dramatically improved via the use of LVBs so
that is no longer true. Unfortunately, other commands such as "du"
still are sub-optimal and work is underway to determine what can be
done to improve their performance.
The basic problem here is that local filesystems (Ext2FS) or server based
filesystems (NFS) can easily cache changes to the filesystem and maintain a
single coherent snapshot of the state of the filesystem. GFS on the other
hand must either build a coherent snapshot by querying every node in the
GFS cluster and summarizing this information or walk the entire filesystem
to scan the meta-data. Due to this current limitation it is recommended
that GFS partitions be excluded from scans which build the "locate"
database. Solutions to these problems are being investigated (i.e.
``fuzzy'' du/df -- will be much faster but not 100% accurate -- within 5%).
o GFS does not support quotas currently. This is a hard problem and is
similar in nature to the ``du''/``df'' issue discussed earlier.
Methods of distributed and scalable cluster-wide quota support are being
investigated (i.e. ``fuzzy'' quotas)."
I haven't tested it to see if rsync has the same problems as du/df, but I
wouldn't bet against it. Admittedly these problems really relate to using GFS
as a distributed filesystem, with different data on different nodes, not
specifically to using it as a simple "network RAID 1".
For the record, I've also taken a look at Coda, OpenAFS and a few others that
slip my mind at the moment, and for various reasons have found them all
lacking in some significant way or another. I've essentially concluded that
there is no "commercial quality" distributed filesystem for Linux. At this
point, I'm running a setup with two servers using SGI's XFS and rsync. I'm
still in the lab playing with various methods of monitoring/failover.
Even monitoring NFS for availability in order to execute a failover has been
a problem (hence my earlier question on the list). The Linux FailSafe project
looks somewhat promising in this area, but I haven't had a chance to try it
yet.
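The sort of check I've been experimenting with is along these lines (the host
name is a placeholder, and a real setup would need retries and some kind of
fencing before taking over):

```shell
#!/bin/sh
# Sketch of an NFS liveness probe. Unlike ping, asking the portmapper
# whether the NFS program answers catches a hung nfsd, not just a dead
# host. SERVER and the takeover action are placeholders.

SERVER=nas1

nfs_alive() {
    # rpcinfo -u <host> nfs makes a null-procedure call to the NFS
    # service (RPC program 100003) over UDP and fails if it gets no
    # answer.
    rpcinfo -u "$1" nfs >/dev/null 2>&1
}

if ! nfs_alive "$SERVER"; then
    echo "NFS on $SERVER not responding, failing over" >&2
    # run the site-specific takeover action here
fi
```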
-=dwh=-
________________________________________________________________
http://www.OpenRecording.com For musicians by musicians.
Now with free Web-Based email too!