[Ocfs2-users] OCFS2, NFS and random Stale NFS file handles

Adam Randall randalla at gmail.com
Wed Jul 17 10:10:20 PDT 2013


The problem I have with NFSv3 is that it's difficult to make it work with
iptables. I'll give it a go, however, and see how it affects things.
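
If I do give it a shot, I'll probably pin v3's ancillary daemons (mountd,
statd, lockd) to fixed ports first so the iptables rules can stay static. A
rough, untested sketch, assuming Gentoo's /etc/conf.d/nfs layout (the port
numbers are arbitrary examples):

  # /etc/conf.d/nfs -- pin mountd and statd to fixed ports
  OPTS_RPC_MOUNTD="-p 32767"
  OPTS_RPC_STATD="-p 32765 -o 32766"

  # pin lockd's ports via sysctl (lockd runs in the kernel)
  sysctl -w fs.nfs.nlm_tcpport=32768
  sysctl -w fs.nfs.nlm_udpport=32768

  # then iptables only needs 111 (rpcbind), 2049 (nfsd) and the pinned ports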

Also, should I be considering iSCSI instead of NFS?

Adam.


On Wed, Jul 17, 2013 at 7:51 AM, Patrick J. LoPresti <patl at patl.com> wrote:

> I would seriously try "nfsvers=3" in those mount options.
>
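> For example, the fstab entry would become something like this (same options
> you have now, just pinning the version; untested):
>
>     192.168.0.160:/mnt/storage  /mnt/storage  nfs  defaults,nosuid,noexec,noatime,nfsvers=3  0 0
>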
> In my experience, Linux NFS features take around 10 years before the
> bugs are shaken out. And NFSv4 is much, much more complicated than
> most. (They added a "generation number" to the file handle, but if the
> underlying file system does not implement generation numbers, I have
> no idea what will happen...)
>
>  - Pat
>
> On Wed, Jul 17, 2013 at 7:47 AM, Adam Randall <randalla at gmail.com> wrote:
> > My changes to exports had no effect, it seems. I awoke to four errors from
> > my processing engine. All of them came from the same server, which makes me
> > curious. I've turned that one off and will see what happens.
> >
> >
> > On Tue, Jul 16, 2013 at 11:22 PM, Adam Randall <randalla at gmail.com> wrote:
> >>
> >> I've been doing more digging, and I've changed some of the configuration:
> >>
> >> 1) I've changed my nfs mount options to this:
> >>
> >> 192.168.0.160:/mnt/storage  /mnt/i2xstorage  nfs  defaults,nosuid,noexec,noatime,nodiratime  0 0
> >>
> >> 2) I've changed the /etc/exports for /mnt/storage to this:
> >>
> >>      /mnt/storage -rw,sync,subtree_check,no_root_squash @trusted
> >>
> >> In #1, I've removed nodev, which I think I accidentally copied over from a
> >> tmpfs mount point above it when I originally set up the nfs mount point so
> >> long ago. Additionally, I added nodiratime. In #2, it used to be
> >> -rw,async,no_subtree_check,no_root_squash. I think the async may
> >> potentially be causing what I'm seeing, and the subtree_check should be
> >> okay for testing.
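> >>
> >> To pick these up without bouncing NFS entirely, re-exporting should be
> >> enough, and exportfs can confirm the live options:
> >>
> >> # exportfs -ra    (re-read /etc/exports and refresh all exports)
> >> # exportfs -v     (list current exports with their effective options)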
> >>
> >> Hopefully, this will have an effect.
> >>
> >> Adam.
> >>
> >>
> >> On Tue, Jul 16, 2013 at 9:44 PM, Adam Randall <randalla at gmail.com> wrote:
> >>>
> >>> Here's various outputs:
> >>>
> >>> # grep nfs /etc/mtab:
> >>> rpc_pipefs /var/lib/nfs/rpc_pipefs rpc_pipefs rw 0 0
> >>> 192.168.0.160:/var/log/dms /mnt/dmslogs nfs rw,noexec,nosuid,nodev,noatime,vers=4,addr=192.168.0.160,clientaddr=192.168.0.150 0 0
> >>> 192.168.0.160:/mnt/storage /mnt/storage nfs rw,noexec,nosuid,nodev,noatime,vers=4,addr=192.168.0.160,clientaddr=192.168.0.150 0 0
> >>>
> >>> # grep nfs /proc/mounts:
> >>> rpc_pipefs /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
> >>> 192.168.0.160:/var/log/dms /mnt/dmslogs nfs4 rw,nosuid,nodev,noexec,noatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.0.150,local_lock=none,addr=192.168.0.160 0 0
> >>> 192.168.0.160:/mnt/storage /mnt/storage nfs4 rw,nosuid,nodev,noexec,noatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.0.150,local_lock=none,addr=192.168.0.160 0 0
> >>>
> >>> Also, the output of df -hT | grep nfs:
> >>> 192.168.0.160:/var/log/dms nfs       273G  5.6G  253G   3%  /mnt/dmslogs
> >>> 192.168.0.160:/mnt/storage nfs       2.8T  1.8T  986G  65%  /mnt/storage
> >>>
> >>> From the looks of it, it appears to be NFS version 4 (though I thought
> >>> that I was running version 3, hrm...).
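> >>>
> >>> For what it's worth, nfsstat should also confirm the negotiated version
> >>> for each mount:
> >>>
> >>> # nfsstat -m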
> >>>
> >>> With regards to the ls -lid, one of the directories that wasn't altered,
> >>> but for whatever reason was not accessible due to the stale handle, is
> >>> this:
> >>>
> >>> # ls -lid /mnt/storage/reports/5306
> >>> 185862043 drwxrwxrwx 4 1095 users 45056 Jul 15 21:37 /mnt/storage/reports/5306
> >>>
> >>> In the directory where we create new documents, which creates a folder
> >>> for each document (legacy decision), it looks something like this:
> >>>
> >>> # ls -lid /mnt/storage/dms/documents/819/* | head -n 10
> >>> 290518712 drwxrwxrwx 2 nobody nobody 3896 Jul 16 18:39 /mnt/storage/dms/documents/819/8191174
> >>> 290518714 drwxrwxrwx 2 nobody nobody 3896 Jul 16 18:39 /mnt/storage/dms/documents/819/8191175
> >>> 290518716 drwxrwxrwx 2 nobody nobody 3896 Jul 16 18:39 /mnt/storage/dms/documents/819/8191176
> >>> 290518718 drwxrwxrwx 2 nobody nobody 3896 Jul 16 18:39 /mnt/storage/dms/documents/819/8191177
> >>> 290518720 drwxrwxrwx 2 nobody nobody 3896 Jul 16 18:39 /mnt/storage/dms/documents/819/8191178
> >>> 290518722 drwxrwxrwx 2 nobody nobody 3896 Jul 16 18:40 /mnt/storage/dms/documents/819/8191179
> >>> 290518724 drwxrwxrwx 2 nobody nobody 3896 Jul 16 18:40 /mnt/storage/dms/documents/819/8191180
> >>> 290518726 drwxrwxrwx 2 nobody nobody 3896 Jul 16 18:47 /mnt/storage/dms/documents/819/8191181
> >>> 290518728 drwxrwxrwx 2 nobody nobody 3896 Jul 16 18:50 /mnt/storage/dms/documents/819/8191182
> >>> 290518730 drwxrwxrwx 2 nobody nobody 3896 Jul 16 18:52 /mnt/storage/dms/documents/819/8191183
> >>>
> >>> The stale handles seem to appear more often when there's load on the
> >>> system, but that's not consistently the case. I received notice of two
> >>> failures (both from the same server) tonight, as seen here:
> >>>
> >>> Jul 16 19:27:40 imaging4 php: Output of: ls -l /mnt/storage/dms/documents/819/8191226/ 2>&1:
> >>> Jul 16 19:27:40 imaging4 php:    ls: cannot access /mnt/storage/dms/documents/819/8191226/: Stale NFS file handle
> >>> Jul 16 19:44:15 imaging4 php: Output of: ls -l /mnt/storage/dms/documents/819/8191228/ 2>&1:
> >>> Jul 16 19:44:15 imaging4 php:    ls: cannot access /mnt/storage/dms/documents/819/8191228/: Stale NFS file handle
> >>>
> >>> The above is logged by my e-mail collecting daemon, which is written in
> >>> PHP. When it can't access the directory that was just created, it uses
> >>> syslog() to write the above information out.
> >>>
> >>> From the same server, doing ls -lid I get these for those two directories:
> >>>
> >>> 290518819 drwxrwxrwx 2 nobody nobody 3896 Jul 16 19:44 /mnt/storage/dms/documents/819/8191228
> >>> 290518816 drwxrwxrwx 2 nobody nobody 3896 Jul 16 19:27 /mnt/storage/dms/documents/819/8191226
> >>>
> >>> Running stat on the directories showed that the modified times correspond
> >>> to the logs above:
> >>>
> >>> Modify: 2013-07-16 19:27:40.786142391 -0700
> >>> Modify: 2013-07-16 19:44:15.458250738 -0700
> >>>
> >>> Between the time it happened and the time I got back, the stale handle
> >>> had cleared itself.
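> >>>
> >>> If it recurs, I might try flushing the client's dentry/inode cache to see
> >>> whether that clears the stale handle sooner (assuming it's just client-side
> >>> caching; this drops cached dentries and inodes system-wide):
> >>>
> >>> # sync && echo 2 > /proc/sys/vm/drop_caches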
> >>>
> >>> If it's at all relevant, this is the fstab:
> >>>
> >>> 192.168.0.160:/var/log/dms  /mnt/dmslogs  nfs  defaults,nodev,nosuid,noexec,noatime  0 0
> >>> 192.168.0.160:/mnt/storage  /mnt/storage  nfs  defaults,nodev,nosuid,noexec,noatime  0 0
> >>>
> >>> Lastly, in a fit of grasping at straws, I unmounted the ocfs2 partition
> >>> on the secondary server and stopped the ocfs2 service. I was thinking
> >>> that maybe having it in master/master mode could cause what I was seeing.
> >>> Alas, that's not the case, as the above errors came after I did that.
> >>>
> >>> Is there anything else that I can provide that might be of help?
> >>>
> >>> Adam.
> >>>
> >>>
> >>>
> >>> On Tue, Jul 16, 2013 at 5:15 PM, Patrick J. LoPresti <lopresti at gmail.com> wrote:
> >>>>
> >>>> What version is the NFS mount? ("cat /proc/mounts" on the NFS client)
> >>>>
> >>>> NFSv2 only allowed 64 bits in the file handle. With the
> >>>> "subtree_check" option on the NFS server, 32 of those bits are used
> >>>> for the subtree check, leaving only 32 for the inode. (This is from
> >>>> memory; I may have the exact numbers wrong. But the principle
> >>>> applies.)
> >>>>
> >>>> See
> >>>> <https://oss.oracle.com/projects/ocfs2/dist/documentation/v1.2/ocfs2_faq.html#NFS>
> >>>>
> >>>> If you run "ls -lid <directory>" for directories that work and those
> >>>> that fail, and you find that the failing directories all have huge
> >>>> inode numbers, that will help confirm that this is the problem.
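> >>>>
> >>>> Something like this (with your real paths substituted) would surface the
> >>>> largest inode numbers quickly; anything that doesn't fit in 32 bits
> >>>> (i.e. above 4294967295) would be suspect:
> >>>>
> >>>> # ls -lid /mnt/storage/dms/documents/819/* | sort -n | tail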
> >>>>
> >>>> Also if you are using NFSv2 and switch to v3 or set the
> >>>> "no_subtree_check" option and it fixes the problem, that will also
> >>>> help confirm that this is the problem. :-)
> >>>>
> >>>>  - Pat
> >>>>
> >>>>
> >>>> On Tue, Jul 16, 2013 at 5:07 PM, Adam Randall <randalla at gmail.com> wrote:
> >>>> > Please forgive my lack of experience, but I've just recently started
> >>>> > deeply working with ocfs2 and am not familiar with all its caveats.
> >>>> >
> >>>> > We've just deployed two servers that have SAN arrays attached to them.
> >>>> > These arrays are synchronized with DRBD in master/master mode, with
> >>>> > ocfs2 configured on top of that. In all my testing everything worked
> >>>> > well, except for an issue with symbolic links throwing an exception in
> >>>> > the kernel (this was fixed by applying a patch I found here:
> >>>> > comments.gmane.org/gmane.comp.file-systems.ocfs2.devel/8008). Of these
> >>>> > machines, one is designated the master and the other is its backup.
> >>>> >
> >>>> > The hosts are Gentoo Linux running kernel 3.8.13.
> >>>> >
> >>>> > I have four other machines that are connecting to the master ocfs2
> >>>> > partition using nfs. The problem I'm having is that on these machines,
> >>>> > I'm randomly getting read errors while trying to enter directories
> >>>> > over nfs. In all of these cases, except one, these directories are
> >>>> > immediately unavailable after they are created. The error that comes
> >>>> > back is always something like this:
> >>>> >
> >>>> > ls: cannot access /mnt/storage/documents/818/8189794/: Stale NFS file handle
> >>>> >
> >>>> > The mount point is /mnt/storage. Other directories on the mount are
> >>>> > available, and on other servers the same directory can be accessed
> >>>> > perfectly fine.
> >>>> >
> >>>> > I haven't been able to reproduce this issue in isolated testing.
> >>>> >
> >>>> > The four machines that connect via NFS are doing one of two things:
> >>>> >
> >>>> > 1) processing e-mail through a php driven daemon (read and write, creating directories)
> >>>> > 2) serving report files in PDF format over the web via a php web application (read only)
> >>>> >
> >>>> > I believe that the ocfs2 version is 1.5. I found this in the kernel
> >>>> > source itself, but haven't figured out how to determine it in the
> >>>> > shell. ocfs2-tools is version 1.8.2, which is what ocfs2 wanted (maybe
> >>>> > this is ocfs2 1.8 then?).
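> >>>> >
> >>>> > Maybe querying the loaded module would show it; I haven't verified
> >>>> > that ocfs2 exposes a version field this way:
> >>>> >
> >>>> > # modinfo ocfs2 | grep -i version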
> >>>> >
> >>>> > The only other path I can think to take is to abandon OCFS2 and use
> >>>> > DRBD in master/slave mode with ext4 on top of that. This would still
> >>>> > provide me with the redundancy I want, but at the cost of not being
> >>>> > able to use both machines simultaneously.
> >>>> >
> >>>> > If anyone has any advice, I'd love to hear it.
> >>>> >
> >>>> > Thanks in advance,
> >>>> >
> >>>> > Adam.
> >>>> >
> >>>> >
> >>>> > --
> >>>> > Adam Randall
> >>>> > http://www.xaren.net
> >>>> > AIM: blitz574
> >>>> > Twitter: @randalla0622
> >>>> >
> >>>> > "To err is human... to really foul up requires the root password."
> >>>> >
> >>>> > _______________________________________________
> >>>> > Ocfs2-users mailing list
> >>>> > Ocfs2-users at oss.oracle.com
> >>>> > https://oss.oracle.com/mailman/listinfo/ocfs2-users
> >>>
> >>>
> >>>
> >>>
> >>> --
> >>> Adam Randall
> >>> http://www.xaren.net
> >>> AIM: blitz574
> >>> Twitter: @randalla0622
> >>>
> >>> "To err is human... to really foul up requires the root password."
> >>
> >>
> >>
> >>
> >> --
> >> Adam Randall
> >> http://www.xaren.net
> >> AIM: blitz574
> >> Twitter: @randalla0622
> >>
> >> "To err is human... to really foul up requires the root password."
> >
> >
> >
> >
> > --
> > Adam Randall
> > http://www.xaren.net
> > AIM: blitz574
> > Twitter: @randalla0622
> >
> > "To err is human... to really foul up requires the root password."
> >
> > _______________________________________________
> > Ocfs2-users mailing list
> > Ocfs2-users at oss.oracle.com
> > https://oss.oracle.com/mailman/listinfo/ocfs2-users
>



-- 
Adam Randall
http://www.xaren.net
AIM: blitz574
Twitter: @randalla0622

"To err is human... to really foul up requires the root password."