[Ocfs2-users] NFS in "D" State

michael.a.jaquays at verizon.com michael.a.jaquays at verizon.com
Thu Mar 18 11:51:22 PDT 2010


Yep, they're all mounted with the nordirplus option.

The scripts directory doesn't seem to exist on that site.


-Mike Jaquays
W: 972-718-2982
C: 214-587-3882

-----Original Message-----
From: Sunil Mushran [mailto:sunil.mushran at oracle.com] 
Sent: Thursday, March 18, 2010 1:25 PM
To: Jaquays, Michael A.
Cc: ocfs2-users at oss.oracle.com
Subject: Re: [Ocfs2-users] NFS in "D" State

I am assuming you are mounting the nfs mounts with the nordirplus mount option. Mounting without it is known to deadlock an nfsd thread, leading to what you are seeing.
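
As a rough way to double-check this on the nfs clients, the sketch below lists nfs mount points whose option string lacks nordirplus. The function name is hypothetical, and it assumes the input is in /proc/mounts format and that the client kernel reports nordirplus there.

```shell
#!/bin/sh
# nfs_without_nordirplus [MOUNTS_FILE]   (hypothetical helper name)
# Prints the mount point of every nfs/nfs4 mount whose option field
# does not contain "nordirplus".
# Assumption: input uses the /proc/mounts layout
#   device mountpoint fstype options dump pass
nfs_without_nordirplus() {
    awk '$3 ~ /^nfs4?$/ && ("," $4 ",") !~ /,nordirplus,/ { print $2 }' \
        "${1:-/proc/mounts}"
}
```

Calling nfs_without_nordirplus with no argument reads /proc/mounts; empty output would mean every nfs mount it found carries the option.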

There are two possible reasons for this error. One is a dlm issue.
The other is a local deadlock like the one above.

To see if the dlm is the cause for the hang, run scanlocks2.
http://oss.oracle.com/~smushran/.dlm/scripts/scanlocks2

This will dump the busy lock resources. Run it a few times. If a lock resource comes up regularly, then it indicates a dlm problem.
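
One possible way to spot a lock resource that "comes up regularly" is to save each run's output to a file and count how many runs each name appears in. This is only a sketch: the function name is hypothetical, and it assumes the first whitespace-separated field of each non-empty line in a saved run is the lock resource name, which may not match the script's real output.

```shell
#!/bin/sh
# recurring_locks MIN FILE...   (hypothetical helper name)
# Prints every name that appears in at least MIN of the given files.
# Assumption: each FILE is the saved output of one run, and the first
# field of each non-empty line is a lock resource name.
recurring_locks() {
    min=$1
    shift
    for f in "$@"; do
        # De-duplicate within a run so a name counts once per file.
        awk 'NF { print $1 }' "$f" | sort -u
    done | sort | uniq -c | awk -v min="$min" '$1 >= min { print $2 }'
}
```

For example, after saving three runs as run1.txt, run2.txt and run3.txt, "recurring_locks 3 run1.txt run2.txt run3.txt" would print only the names present in all three.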

Then dump the fs and dlm lock state on that node.
debugfs.ocfs2 -R "fs_locks LOCKNAME" /dev/sdX
debugfs.ocfs2 -R "dlm_locks LOCKNAME" /dev/sdX

The dlm lock will tell you the master node. Repeat the two dumps on the master node. The dlm lock on the master node will point to the current holder. Repeat the same on that node. Email all that to me asap.
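
Since the same two dumps are repeated on up to three nodes, a small wrapper may help keep the output labelled. This is a minimal sketch: the function name and the DEBUGFS_BIN override are hypothetical, and only the two debugfs.ocfs2 invocations come from the steps above.

```shell
#!/bin/sh
# dump_lock_state LOCKNAME DEVICE   (hypothetical helper name)
# Runs the fs_locks and dlm_locks dumps for one lock resource and
# labels the output with the node name.
# DEBUGFS_BIN is a hypothetical override, used here only so the
# tool path can be changed; it defaults to debugfs.ocfs2.
dump_lock_state() {
    lock=$1
    dev=$2
    tool=${DEBUGFS_BIN:-debugfs.ocfs2}
    echo "== node: $(uname -n)  lock: $lock  device: $dev =="
    echo "-- fs_locks --"
    "$tool" -R "fs_locks $lock" "$dev"
    echo "-- dlm_locks --"
    "$tool" -R "dlm_locks $lock" "$dev"
}
```

Run as root on each node in turn, for example "dump_lock_state LOCKNAME /dev/sdX > lockstate-$(uname -n).txt", then collect the files.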

michael.a.jaquays at verizon.com wrote:
> All,
>
> I've seen a few posts about this issue in the past, but not a resolution. I have a 3-node cluster sharing ocfs2 volumes to app nodes via nfs. On occasion, one of our db nodes will have nfs go into an uninterruptible sleep state. The nfs daemon is completely useless at this point. The db node has to be rebooted to resolve it. It seems that nfs is waiting on ocfs2_wait_for_mask. Any suggestions on a resolution would be appreciated.
>
> root     18387  0.0  0.0      0     0 ?        S<   Mar15   0:00 [nfsd4]
> root     18389  0.0  0.0      0     0 ?        D    Mar15   0:10 [nfsd]
> root     18390  0.0  0.0      0     0 ?        D    Mar15   0:10 [nfsd]
> root     18391  0.0  0.0      0     0 ?        D    Mar15   0:10 [nfsd]
> root     18392  0.0  0.0      0     0 ?        D    Mar15   0:13 [nfsd]
> root     18393  0.0  0.0      0     0 ?        D    Mar15   0:08 [nfsd]
> root     18394  0.0  0.0      0     0 ?        D    Mar15   0:09 [nfsd]
> root     18395  0.0  0.0      0     0 ?        D    Mar15   0:12 [nfsd]
> root     18396  0.0  0.0      0     0 ?        D    Mar15   0:13 [nfsd] 
>
> 18387 nfsd4           worker_thread
> 18389 nfsd            ocfs2_wait_for_mask
> 18390 nfsd            ocfs2_wait_for_mask
> 18391 nfsd            ocfs2_wait_for_mask
> 18392 nfsd            ocfs2_wait_for_mask
> 18393 nfsd            ocfs2_wait_for_mask
> 18394 nfsd            ocfs2_wait_for_mask
> 18395 nfsd            ocfs2_wait_for_mask
> 18396 nfsd            ocfs2_wait_for_mask
>  
>
> -Mike Jaquays



