[Ocfs2-users] NFS Failover

Sunil Mushran sunil.mushran@oracle.com
Tue Dec 9 11:09:54 PST 2008


I forgot about fsid. That is how NFS identifies the exported device in
the file handles it hands out. Yes, it needs to be the same on both servers.
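For reference, pinning the export identifier is done with the fsid option in /etc/exports on both servers. The path and the fsid value below are placeholders; the point is only that the value must match on the two machines:

```
# /etc/exports -- hypothetical export; use the same fsid on both servers
# so clients see identical file handles after the virtual IP fails over.
/u01/concurrents  *(rw,sync,fsid=745)
```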

Yes, the inode numbers are consistent. An OCFS2 inode number is the block
number of the inode on disk, so it is the same no matter which server the
filesystem is mounted on.
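A quick way to check this yourself is to compare `stat -c %i` output for the same path from each client. The sketch below only demonstrates the invocation locally: a file and a hard link to it share one on-disk inode, so their inode numbers must match, which is the same comparison you would run against two NFS mounts.

```shell
# Demonstrate comparing inode numbers with stat (GNU coreutils).
# Against the real setup you would run the two stat calls on the
# two different NFS clients instead of on a hard-linked pair.
tmpdir=$(mktemp -d)
touch "$tmpdir/admin.log"
ln "$tmpdir/admin.log" "$tmpdir/admin.link"   # hard link: same inode

ino_a=$(stat -c %i "$tmpdir/admin.log")
ino_b=$(stat -c %i "$tmpdir/admin.link")

if [ "$ino_a" = "$ino_b" ]; then
    echo "inode numbers match: $ino_a"
fi
rm -rf "$tmpdir"
```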

Afraid I cannot help you with lockd failover.

Sunil

Luis Freitas wrote:
> Sunil,
>
>    They are not waiting; the kernel reconnects after a few seconds, but the clients just don't like the other NFS server. Any attempt to access directories or files after the virtual IP failed over to the other NFS server resulted in errors. Unfortunately I don't have the exact error message here anymore.
>
>    We found a parameter on the NFS server that seems to fix it: fsid. If you set it to the same number on both servers, it forces both of them to use the same file handle identifiers. It seems that if you don't, you have to guarantee that the mount is done on the same device on both servers, and we cannot do that since we are using PowerPath.
>
>   I would like to confirm that the inode numbers are consistent across servers.
>
>   That is:
>
> [oracle@br001sv0440 concurrents]$ ls -il
> total 8
> 131545 drwxr-xr-x  2  100 users 4096 Dec  9 12:12 admin
> 131543 drwxrwxrwx  2 root dba   4096 Dec  4 08:53 lost+found
> [oracle@br001sv0440 concurrents]$
>
>    Directory "admin" (or any other directory/file) will always be inode number 131545, no matter which server we are on? It seems to be so, but I would like to confirm.
>
>
>    About the metadata changes: this share will be used for log files (actually, for Oracle eBusiness Suite concurrent log and output files), so we can tolerate losing a few of the latest files during the failover. The user can simply run his report again. Also, if some processes hang or die during the failover, that can be tolerated, as the internal concurrent manager can restart them. Preferably processes should die instead of hanging.
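"Die instead of hang" is what the client-side soft mount option is for: after the configured retries, NFS I/O returns an error to the application instead of retrying forever. The hostname, path, and timeout values below are illustrative, not from the original setup:

```
# Client-side mount options so I/O fails after retries instead of hanging:
#   soft    - return an I/O error to the application after retrans retries
#   timeo   - retransmission timeout, in tenths of a second
#   retrans - number of retries before soft gives up
mount -t nfs -o soft,timeo=100,retrans=3 vip.example.com:/u01/concurrents /mnt/concurrents
```

Note that soft mounts trade hangs for possible I/O errors on transient outages, which is acceptable here since a lost report can simply be rerun.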
>
>    But I am concerned about dangling locks on the server, and I am not sure how to handle those. In the NFS-HA docs, some files in /var/lib/nfs are copied by scripts every few seconds, but this does not seem to be a foolproof approach.
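A minimal sketch of that copy-based approach is below. The directory names in the comments are the usual defaults; the function itself is an assumption for illustration, not a vetted HA solution, and it shares the weakness noted above: locks taken between two sync runs are still lost.

```shell
#!/bin/sh
# Periodically mirror the statd/lockd client-tracking state so the
# standby server can notify clients after a failover. Locks acquired
# between two runs are not captured, so this is not foolproof.
sync_nfs_state() {
    src=$1      # e.g. /var/lib/nfs on the active server
    dst=$2      # e.g. a directory on the shared OCFS2 volume
    mkdir -p "$dst"
    cp -a "$src/." "$dst/"
}
```

Something like this could be run every few seconds from cron, or wrapped in a CRS action script instead of the heartbeat2 resource agents the NFS-HA docs assume.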
>
>    I looked over the NFS-HA docs sent to the list. They are useful, but also very "Linux-HA" centric and require the heartbeat2 package. I won't install another cluster stack, since I already have CRS here.
>
>    Does anyone have pointers on a similar setup with CRS?
>
> Best Regards,
> Luis
>   


