[Ocfs2-users] NFS Failover

Mon Dec 8 17:47:55 PST 2008

While the NFS protocol is stateless and thus should handle a failover,
the procedures themselves are synchronous. Meaning, I am not sure
how an NFS client will handle getting an OK for some metadata change
(mkdir, etc.) just before the server dies and is recovered by another node.
If the op did not make it to the journal, it would be a null op. But the
NFS client would not know that, as the server has failed over.
This is a question for NFS.
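
To make the scenario concrete, here is a toy Python sketch of an
acknowledged op that never reaches the journal. It is only a model of the
case described above, not OCFS2 or NFS server code: ToyServer, recover and
the commit flag are made-up names, and whether a real server can reply
before the journal commit is exactly the open question.

```python
class ToyServer:
    """Toy model: ops sit in memory until committed to a shared journal."""

    def __init__(self, journal):
        self.journal = journal   # shared storage, survives a node crash
        self.pending = []        # in-memory only, lost when the node dies

    def mkdir(self, name, commit=True):
        self.pending.append(name)
        if commit:
            # Flush everything pending to the journal.
            self.journal.extend(self.pending)
            self.pending.clear()
        # Hypothetical behaviour: the client gets an OK either way.
        return "OK"


def recover(journal):
    """The surviving node replays the journal and sees only committed ops."""
    return set(journal)


journal = []
node_a = ToyServer(journal)
acks = [node_a.mkdir("dir1"), node_a.mkdir("dir2", commit=False)]

# Node A dies here; its pending list is gone. Node B takes over.
visible = recover(journal)

client_view = {"dir1", "dir2"}   # everything the client was told succeeded
lost = client_view - visible     # ops the client believes in but that are gone
```

In this model the client holds an OK for dir2, yet the recovered
filesystem has no such directory.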

What is the stack of the NFS clients? As in, what are they waiting on?

Luis Freitas wrote:
> Hi list,
>
>    I need to implement a highly available NFS server. Since we already have OCFS2 here for RAC, and already have a virtual IP on the RAC server that fails over automatically to the other node, it seems a natural choice to use it for our NFS needs too. We are using OCFS2 1.2. (An upgrade to 1.4 is not in our current plans.)
>
>    We did a preliminary failover test, and the client that mounts the filesystem (actually a Solaris box) doesn't like the failover. We expect some errors and minor data loss and can tolerate them as a transient condition, but the problem is that the mounted filesystem on the client becomes useless until we unmount and remount it.
>
>    I suspect that NFS uses inode numbers of the underlying filesystem to create "handles" that it passes on to clients, but I am not sure how this is done.
>
>    Does anyone know if we can achieve a failover without needing to remount the NFS share on the clients? Are any special options needed for mounting the OCFS2 filesystem, for exporting it over NFS, or on the client?
>
> Best Regards,
> Luis
>
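
On the file-handle suspicion in the quoted message: NFS file handles are
opaque to the client, and their layout is up to the server. The Python
sketch below assumes a handle made of a filesystem id, an inode number
and a generation number, which is an illustration only and not the actual
handle format of any server. It shows one possible way a handle issued by
one node could be rejected as stale by the other even though both see the
same inodes on the shared OCFS2 volume. ToyNfsServer, Handle and the
fsid values are made up.

```python
import errno
from typing import NamedTuple


class Handle(NamedTuple):
    fsid: int        # assumed: identifies the exported filesystem
    inode: int       # assumed: inode number on that filesystem
    generation: int  # assumed: guards against inode number reuse


class ToyNfsServer:
    def __init__(self, fsid, inodes):
        self.fsid = fsid
        self.inodes = inodes     # inode number -> generation

    def lookup(self, inode):
        """Issue a handle for an existing inode."""
        return Handle(self.fsid, inode, self.inodes[inode])

    def getattr(self, handle):
        """Return 0 if the handle is still valid, -ESTALE otherwise."""
        if handle.fsid != self.fsid:
            return -errno.ESTALE
        if self.inodes.get(handle.inode) != handle.generation:
            return -errno.ESTALE
        return 0


# Both nodes see the same inodes because the volume is shared.
shared_inodes = {1001: 7}
node_a = ToyNfsServer(fsid=0x0801, inodes=shared_inodes)
node_b_other_id = ToyNfsServer(fsid=0x0811, inodes=shared_inodes)
node_b_same_id = ToyNfsServer(fsid=0x0801, inodes=shared_inodes)

handle = node_a.lookup(1001)                        # client got this before failover
result_other_id = node_b_other_id.getattr(handle)   # rejected as stale
result_same_id = node_b_same_id.getattr(handle)     # accepted
```

If something like this is the cause, the filesystem id would have to match
on both nodes. The Linux exports(5) man page documents an fsid= export
option for pinning that id; whether it fixes the behaviour seen with the
Solaris client here is untested.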