
RE: Load balanced mail system

To: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: RE: Load balanced mail system
From: Joseph Mack NA3T <jmack@xxxxxxxx>
Date: Fri, 16 Dec 2005 10:53:34 -0800 (PST)
On Thu, 15 Dec 2005, Graeme Fowler wrote:

> On Thu, 2005-12-15 at 12:39 -0800, Joseph Mack NA3T wrote:
>> Whenever I've had NFS mounts and machines crash, I wind up
>> with stale file handles, which I can only fix by rebooting
>> both the server and the client.

I have talked to the local expert about this and have a better understanding now.

You get a stale file handle when the client has a file or directory open and the server stops serving that file or directory. This error is part of the protocol.
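
For anyone who wants to check for this from a script on the client: the "stale file handle" message corresponds to errno ESTALE. Here is a minimal sketch, not anything from the original setup. The function name is_stale is made up for illustration.

```python
import errno
import os


def is_stale(path):
    """Return True if touching `path` fails with ESTALE (stale file handle).

    Hypothetical helper for illustration only.
    """
    try:
        os.stat(path)
    except OSError as exc:
        # ESTALE is the errno behind the "stale file handle" message.
        return exc.errno == errno.ESTALE
    return False
```

Note that on a hard-mounted filesystem whose server is unreachable, the stat call itself may block rather than return an error, so this only helps once the client has actually been told the handle is stale.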

client                     server

                           export /home/user/
                           ls /home/user
                                 foo

mount /home/user
cd /home/user/

ls ./
   foo

cd foo
ls
   ..listing of files in foo

                           unexport /home/user/

ls

   stale file handle

df and mount will hang (or possibly return
after a long timeout). The error goes away
when the server comes back.

                           export /home/user

ls
   .. listing of files in foo



The stale file handle will mess up the client until
the server comes back. Since foo is on the same
piece of disk real-estate, it comes back with the same
file handle when the server reappears.
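
Since this first case clears up by itself once the server is exporting again, a client-side script can simply wait it out. A rough sketch under that assumption follows. The name wait_until_fresh and its parameters are invented here; probe and sleep are only injectable so the loop can be exercised without a real NFS mount.

```python
import errno
import os
import time


def wait_until_fresh(path, attempts=5, delay=1.0, probe=os.stat, sleep=time.sleep):
    """Retry `probe(path)` while it fails with ESTALE.

    Returns True as soon as the probe succeeds, False if every attempt
    was stale. Any other OSError is re-raised unchanged.
    """
    for attempt in range(attempts):
        try:
            probe(path)
            return True
        except OSError as exc:
            if exc.errno != errno.ESTALE:
                raise
        # Still stale: pause before the next attempt (but not after the last).
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

If this returns False, you are probably in the second case described below, where waiting does not help.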


An irrecoverable problem:
                           export /home/user

mount /home/user
cd /home/user
ls ./
   foo

(i.e. the same as before, so far)

Now do something different: an irreversible change on the server.

                           rmdir /home/user/foo

ls ./

   stale file handle

                           mkdir /home/user/foo

ls ./
   stale file handle


Now when /home/user/foo is recreated, it's on a new piece of disk real-estate and will have a different file handle. The client is hung and you can't umount /home/user (maybe you can with umount -f). If you can't umount /home/user, you will have to reboot the client (in this case the realserver).
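
The "same name, different object" idea can be illustrated locally with the (device, inode) pair that stat reports. This is only an analogy and an assumption on my part: an NFS file handle is opaque to the client, and servers typically put more than the inode number into it (for example a generation number), precisely because a local filesystem may reuse an inode number after a delete. The helper names identity and is_same_object are made up.

```python
import os


def identity(path):
    """Return (device, inode) for `path`, a local stand-in for a file handle."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def is_same_object(remembered, path):
    """True if `path` still refers to the object whose identity was remembered."""
    try:
        return identity(path) == remembered
    except FileNotFoundError:
        # The name no longer exists at all, so it cannot be the same object.
        return False
```

A client holding the remembered identity of the old foo is in the position of the NFS client above: the name exists again, but what it remembered no longer matches anything on the server.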


> In my experience so far, I very occasionally get stale file handles
> reported in the logs (Courier IMAP server) but they're nicely shut off
> by the NFS client when detected and forgotten about.

Hmm, that depends on what happened to cause the stale file handle. I guess you can be careful about deleting files/directories that clients have open.

> The servers (the
> NetApp filer appliances) don't ever, ever complain about them.

The error doesn't occur at that end.

>> You say your setup is reliable, so maybe you don't have to
>> deal with the problem, but would your setup survive pulling
>> a few power cables, waiting 30 mins and plugging them back
>> in?

> Unfortunately we've suffered two total power outages (only short, but
> still total) in the last twelve months - long, long story - and
> everything survived. Even the fact that the realservers came up before
> the filers didn't cause a problem, once the filers reappeared the
> realservers remounted the filesystems and the show continued.

OK, what if the servers went down and the clients (realservers) stayed up, all with open file handles? Presumably they just wait until the servers come up again.

> Maybe the Linux NFS client is better now than it used to be? Correction
> - it definitely is better than it used to be.

The problem I'm describing is part of the protocol.

Joe

--
Joseph Mack NA3T EME(B,D), FM05lw North Carolina
jmack (at) wm7d (dot) net - azimuthal equidistant map
generator at http://www.wm7d.net/azproj.shtml
Homepage http://www.austintek.com/ It's GNU/Linux!
