[Ocfs2-users] NFS Failover

Luis Freitas lfreitas34 at yahoo.com
Tue Dec 9 07:01:17 PST 2008


Sunil,

   They are not waiting; the kernel reconnects after a few seconds, but it just does not like the other NFS server. Any attempt to access directories or files after the virtual IP fails over to the other NFS server results in errors. Unfortunately, I don't have the exact error message here anymore.

   We found an export option on the NFS server that seems to fix it, fsid. If you set it to the same number on both servers, it forces both of them to use the same filesystem identifier. It seems that if you don't, you need to guarantee that the filesystem is mounted from the same device on both servers, and we cannot do this since we are using PowerPath.
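   For reference, the relevant line in /etc/exports ends up looking roughly like this (the path and network below are just placeholders, only the fsid= part is the point; the number itself is arbitrary as long as it is identical on both servers):

   /u02/concurrents  10.1.1.0/24(rw,sync,fsid=10)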

  I would like to confirm whether the inode numbers are consistent across servers.

  That is:

[oracle@br001sv0440 concurrents]$ ls -il
total 8
131545 drwxr-xr-x  2  100 users 4096 Dec  9 12:12 admin
131543 drwxrwxrwx  2 root dba   4096 Dec  4 08:53 lost+found
[oracle@br001sv0440 concurrents]$

   Is directory "admin" (or any other directory/file) always inode number 131545, no matter which server we are on? It seems to be so, but I would like to confirm.
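   One quick way to double-check this (the hostnames and mount point below are made up, and the output is what I would expect to see if the inodes really are consistent) is to stat the same path on both nodes:

   [oracle@nodeA ~]$ stat -c '%i %n' /u02/concurrents/admin
   131545 /u02/concurrents/admin
   [oracle@nodeB ~]$ stat -c '%i %n' /u02/concurrents/admin
   131545 /u02/concurrents/admin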


   About the metadata changes, this share will be used for log files (actually, for Oracle E-Business Suite concurrent log and output files), so we can tolerate losing a few of the latest files during the failover. The user can simply run his report again. If some processes hang or die during the failover, that can also be tolerated, as the internal concurrent manager can restart them. Preferably, processes should die instead of hanging.

   But I am concerned about dangling locks on the server, and I am not sure how to handle those. In the NFS-HA docs, some files under /var/lib/nfs are copied by scripts every few seconds, but this does not seem to be a foolproof approach.
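   As far as I can tell, the approach in those docs boils down to something like this, run from the active node every few seconds (the standby hostname here is a placeholder), and that is exactly the part that worries me:

   # copy rpc.statd / lockd state to the standby node
   rsync -a --delete /var/lib/nfs/ standby-node:/var/lib/nfs/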

   I looked over the NFS-HA docs sent to the list; they are useful, but also very Linux-HA centric and require the heartbeat2 package. I won't install another cluster stack, since I already have CRS here.

   Does anyone have pointers on a similar setup with CRS?
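   In case it helps the discussion, what I picture is a small action script registered as a CRS application resource, along these lines (a rough, untested sketch; the script name, init script and layout are mine, not from any doc):

   #!/bin/sh
   # nfs_export.sh - hypothetical action script for a CRS application resource
   case "$1" in
     start)
       exportfs -a && service nfs start    # export the OCFS2 mount and start nfsd
       ;;
     stop)
       service nfs stop && exportfs -ua    # stop nfsd and unexport everything
       ;;
     check)
       service nfs status >/dev/null 2>&1  # exit 0 only if nfsd is running
       ;;
   esac
   exit $?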

Best Regards,
Luis

--- On Mon, 12/8/08, Sunil Mushran <sunil.mushran at oracle.com> wrote:

> From: Sunil Mushran <sunil.mushran at oracle.com>
> Subject: Re: [Ocfs2-users] NFS Failover
> To: lfreitas34 at yahoo.com
> Cc: ocfs2-users at oss.oracle.com
> Date: Monday, December 8, 2008, 11:47 PM
> While the NFS protocol is stateless and thus should handle
> failing over, the procedures themselves are synchronous. Meaning, I am
> not sure how an NFS client will handle getting an OK for some metadata
> change (mkdir, etc.) just before the server dies and is recovered by
> another node. If the op did not make it to the journal, it would be a
> null op. But the NFS client would not know that, as the server has
> failed over. This is a question for NFS.
> 
> What is the stack of the NFS clients? As in, what are they
> waiting on?
> 
> Luis Freitas wrote:
> > Hi list,
> >
> >    I need to implement a highly available NFS server.
> > Since we already have OCFS2 here for RAC, and already have a
> > virtual IP on the RAC server that fails over automatically to
> > the other node, it seems a natural choice to use it for our
> > NFS needs too. We are using OCFS2 1.2. (Upgrading to 1.4 is
> > not in our current plans.)
> >
> >    We did a preliminary failover test, and the client
> > that mounts the filesystem (actually a Solaris box) does not
> > like the failover. We expect some errors and minor data loss
> > and can tolerate them as a transient condition, but the
> > problem is that the mounted filesystem on the client becomes
> > unusable until we unmount and remount it.
> >
> >    I suspect that NFS uses inode numbers on the underlying
> > filesystem to create "handles" that it passes on
> > to clients, but I am not sure how this is done.
> >
> >    Does anyone know if we can achieve a failover without
> > needing to remount the NFS share on the clients? Are any
> > special options needed when mounting the OCFS2 filesystem,
> > when exporting it over NFS, or on the client?
> >
> > Best Regards,
> > Luis
> >

