[Ocfs2-users] OCFS2, NFS and random Stale NFS file handles

Patrick J. LoPresti lopresti at gmail.com
Tue Jul 16 17:15:02 PDT 2013


What version is the NFS mount? ("cat /proc/mounts" on the NFS client)
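
As a rough sketch of that check, the Python below pulls the NFS
version out of /proc/mounts-style text. The function name
nfs_mount_versions is hypothetical, and the sketch assumes the client
kernel reports a "vers=" or "nfsvers=" entry among the mount options.

```python
def nfs_mount_versions(mounts_text):
    """Return {mount_point: version} for NFS entries in /proc/mounts-style text."""
    versions = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        version = "unknown"
        for opt in fields[3].split(","):
            # Assumption: the version appears as vers=N or nfsvers=N
            if opt.startswith("vers=") or opt.startswith("nfsvers="):
                version = opt.split("=", 1)[1]
        versions[fields[1]] = version
    return versions
```

On the NFS client, passing the contents of /proc/mounts to this
function should list each NFS mount point with its version.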

NFSv2 file handles are a fixed 32 bytes. With the
"subtree_check" option on the NFS server, part of each handle is used
for the subtree check, which leaves too little room for OCFS2's 64-bit
inode numbers. (This is from memory; I may have the exact numbers
wrong. But the principle applies.)

See <https://oss.oracle.com/projects/ocfs2/dist/documentation/v1.2/ocfs2_faq.html#NFS>

If you run "ls -lid <directory>" for directories that work and those
that fail, and you find that the failing directories all have huge
inode numbers, that will help confirm that this is the problem.
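
The same comparison can be scripted. This is only a sketch: the
function names are hypothetical, and treating anything above 2**32 - 1
as "huge" is an assumption, chosen because such a number no longer
fits in a 32-bit field.

```python
import os

MAX_32BIT = 2**32 - 1  # assumed cutoff for a "huge" inode number


def inode_exceeds_32_bits(inode_number):
    """True if the inode number does not fit in 32 bits."""
    return inode_number > MAX_32BIT


def classify_inodes(paths):
    """Return (path, inode number, exceeds-32-bits flag) for each path."""
    rows = []
    for path in paths:
        ino = os.stat(path).st_ino  # the number "ls -lid" prints first
        rows.append((path, ino, inode_exceeds_32_bits(ino)))
    return rows
```

Run classify_inodes on a few working and a few failing directories and
compare the flags; if only the failing ones are flagged, that points
the same way as the "ls -lid" check.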

Also, if you are using NFSv2 and either switching to v3 or setting the
"no_subtree_check" option fixes the problem, that will also
help confirm that this is the problem. :-)
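
To see which setting an export currently carries, a line from
/etc/exports can be inspected along these lines. This is a simplified
sketch: the function name is hypothetical, and it only handles client
entries written as host(options), ignoring comments, quoting, and
line continuations.

```python
import re


def subtree_check_settings(exports_line):
    """Map each client in one exports line to its subtree-check setting."""
    settings = {}
    # Assumption: each client entry looks like host(opt1,opt2,...)
    for client, opts in re.findall(r"(\S+?)\(([^)]*)\)", exports_line):
        options = [o.strip() for o in opts.split(",")]
        if "no_subtree_check" in options:
            settings[client] = "no_subtree_check"
        elif "subtree_check" in options:
            settings[client] = "subtree_check"
        else:
            settings[client] = "default"  # option not given explicitly
    return settings
```

A result of "default" means the option is not set on that entry, so
the effective behaviour depends on the nfs-utils version on the server.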

 - Pat


On Tue, Jul 16, 2013 at 5:07 PM, Adam Randall <randalla at gmail.com> wrote:
> Please forgive my lack of experience, but I've only recently started
> working deeply with ocfs2 and am not familiar with all its caveats.
>
> We've just deployed two servers that have SAN arrays attached to them. These
> arrays are synchronized with DRBD in master/master mode, with ocfs2
> configured on top of that. In all my testing everything worked well, except
> for an issue with symbolic links throwing an exception in the kernel (this
> was fixed by applying a patch I found here:
> comments.gmane.org/gmane.comp.file-systems.ocfs2.devel/8008). Of these
> machines, one is designated the master and the other is its backup.
>
> Host is Gentoo Linux running the 3.8.13 kernel.
>
> I have four other machines that are connecting to the master ocfs2 partition
> using nfs. The problem I'm having is that on these machines, I'm randomly
> getting read errors while trying to enter directories over nfs. In all of
> these cases, except one, these directories are unavailable immediately after
> they are created. The error that comes back is always something like this:
>
> ls: cannot access /mnt/storage/documents/818/8189794/: Stale NFS file handle
>
> The mount point is /mnt/storage. Other directories on the mount are
> available, and on other servers the same directory can be accessed perfectly
> fine.
>
> I haven't been able to reproduce this issue in isolated testing.
>
> The four machines that connect via NFS are doing one of two things:
>
> 1) processing e-mail through a php driven daemon (read and write, creating
> directories)
> 2) serving report files in PDF format over the web via a php web application
> (read only)
>
> I believe that the ocfs2 version is 1.5. I found this in the kernel source
> itself, but haven't figured out how to determine this in the shell.
> ocfs2-tools is version 1.8.2, which is what ocfs2 wanted (maybe this is
> ocfs2 1.8 then?).
>
> The only other path I can think to take is to abandon OCFS2 and use DRBD in
> master/slave mode with ext4 on top of that. This would still provide me with
> the redundancy I want, but at the cost of not being able to use both
> machines simultaneously.
>
> If anyone has any advice, I'd love to hear it.
>
> Thanks in advance,
>
> Adam.
>
>
> --
> Adam Randall
> http://www.xaren.net
> AIM: blitz574
> Twitter: @randalla0622
>
> "To err is human... to really foul up requires the root password."
>
> _______________________________________________
> Ocfs2-users mailing list
> Ocfs2-users at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/ocfs2-users


