[Ocfs2-users] huge "something" problem

Sunil Mushran Sunil.Mushran at oracle.com
Mon Jun 2 10:51:27 PDT 2008


What about the other node?

The wchan function, ocfs2_wait_for_mask, indicates that the process is
waiting on o2dlm. If so, scanlocks should show a "Busy" lockres.

To check by hand, run the following on both nodes:
1. debugfs.ocfs2 -R "fs_locks" /dev/sdX >/tmp/locks.out
2. grep -i busy /tmp/locks.out
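
The two steps above could be wrapped up roughly as follows. This is only
a sketch: the function names collect_locks and busy_locks are made up for
illustration, /dev/sdX is a placeholder for the real OCFS2 device, and
the only commands assumed are the debugfs.ocfs2 and grep invocations
already given.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of ocfs2-tools.

# Dump the filesystem lock state for an OCFS2 device into a file.
#   $1 = device (e.g. /dev/sdX, a placeholder)
#   $2 = output file
collect_locks() {
    debugfs.ocfs2 -R "fs_locks" "$1" >"$2"
}

# Print every line of a saved dump that mentions "busy", ignoring case.
# Prints nothing (and returns non-zero) when no such line exists.
#   $1 = file written by collect_locks
busy_locks() {
    grep -i busy "$1"
}

# Typical use, run as root on each node:
#   collect_locks /dev/sdX /tmp/locks.out && busy_locks /tmp/locks.out
```

If busy_locks prints nothing on either node, then nothing in the dump is
marked busy, and that is worth stating in the bug report as well.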

It would be best to file a bugzilla with all of this information,
gathered from both nodes. Even if we cannot resolve this right now,
the information will help us in the future.

The one drawback is that 2.6.23 does not have any dlm debugging;
it will be available from 2.6.26 onwards. This means that even if
we find a busy lockres, we will not be able to dump its dlm state
in 2.6.23.

Alexandre Racine wrote:
> Hi Sunil,
>
> Below you will find the result of the command "$ ps -e -o
> pid,stat,comm,wchan=WIDE-WCHAN-COLUMN" run 6 times at intervals of
> roughly 10 seconds.
>
> Now, from my point of view, I have a load of 8.00 and nfsd is running 8
> instances, which are all in the "D" state. I am kind of making links here
> ;) What does "ocfs2_wait_for_mask" mean? Are there known problems
> with OCFS2 and NFS? (NFS is somehow not a perfect FS)
>
> Thanks
>
>
>
>   PID STAT COMMAND         WIDE-WCHAN-COLUMN
>     1 Ss   init            -
>     2 S<   kthreadd        kthreadd
>     3 S<   migration/0     migration_thread
>     4 S<   ksoftirqd/0     ksoftirqd
>     5 S<   watchdog/0      watchdog
>     6 S<   migration/1     migration_thread
>     7 S<   ksoftirqd/1     ksoftirqd
>     8 S<   watchdog/1      watchdog
>     9 S<   migration/2     migration_thread
>    10 S<   ksoftirqd/2     ksoftirqd
>    11 S<   watchdog/2      watchdog
>    12 S<   migration/3     migration_thread
>    13 S<   ksoftirqd/3     ksoftirqd
>    14 S<   watchdog/3      watchdog
>    15 S<   events/0        worker_thread
>    16 S<   events/1        worker_thread
>    17 S<   events/2        worker_thread
>    18 S<   events/3        worker_thread
>    19 S<   khelper         worker_thread
>   105 S<   kblockd/0       worker_thread
>   106 S<   kblockd/1       worker_thread
>   107 S<   kblockd/2       worker_thread
>   108 S<   kblockd/3       worker_thread
>   112 S<   kacpid          worker_thread
>   113 S<   kacpi_notify    worker_thread
>   240 S<   ata/0           worker_thread
>   241 S<   ata/1           worker_thread
>   242 S<   ata/2           worker_thread
>   243 S<   ata/3           worker_thread
>   244 S<   ata_aux         worker_thread
>   245 S<   ksuspend_usbd   worker_thread
>   251 S<   khubd           hub_thread
>   254 S<   kseriod         serio_thread
>   318 S    pdflush         pdflush
>   319 S    pdflush         pdflush
>   320 S<   kswapd0         kswapd
>   321 S<   aio/0           worker_thread
>   322 S<   aio/1           worker_thread
>   323 S<   aio/2           worker_thread
>   324 S<   aio/3           worker_thread
>   325 S<   cifsoplockd     -
>   326 S<   cifsdnotifyd    -
>   327 S<   jfsIO           jfsIOWait
>   328 S<   jfsCommit       jfs_lazycommit
>   329 S<   jfsCommit       jfs_lazycommit
>   330 S<   jfsCommit       jfs_lazycommit
>   331 S<   jfsCommit       jfs_lazycommit
>   332 S<   jfsSync         jfs_sync
>   333 S<   xfslogd/0       worker_thread
>   334 S<   xfslogd/1       worker_thread
>   335 S<   xfslogd/2       worker_thread
>   336 S<   xfslogd/3       worker_thread
>   337 S<   xfsdatad/0      worker_thread
>   338 S<   xfsdatad/1      worker_thread
>   339 S<   xfsdatad/2      worker_thread
>   340 S<   xfsdatad/3      worker_thread
>   341 S<   xfs_mru_cache   worker_thread
>   697 Ss   sshd            -
>   700 S    sshd            -
>   701 Ss+  bash            -
>  1035 S<   scsi_eh_0       scsi_error_handler
>  1066 S<   scsi_eh_1       scsi_error_handler
>  1068 S<   scsi_eh_2       scsi_error_handler
>  1078 S<   khpsbpkt        hpsbpkt_thread
>  1157 S<   scsi_eh_3       scsi_error_handler
>  1158 S<   usb-storage     -
>  1161 S<   scsi_eh_4       scsi_error_handler
>  1162 S<   usb-storage     -
>  1177 S<   kpsmoused       worker_thread
>  1180 S<   kondemand/0     worker_thread
>  1181 S<   kondemand/1     worker_thread
>  1182 S<   kondemand/2     worker_thread
>  1183 S<   kondemand/3     worker_thread
>  1199 S<   rpciod/0        worker_thread
>  1200 S<   rpciod/1        worker_thread
>  1201 S<   rpciod/2        worker_thread
>  1202 S<   rpciod/3        worker_thread
>  1203 S<   reiserfs/0      worker_thread
>  1204 S<   reiserfs/1      worker_thread
>  1205 S<   reiserfs/2      worker_thread
>  1206 S<   reiserfs/3      worker_thread
>  1298 S<s  udevd           -
>  2988 Ss   sshd            -
>  2990 S    sshd            -
>  2991 Ss   bash            wait
>  3118 S<   ocfs2_wq        worker_thread
>  3143 S<   xfsbufd         -
>  3144 S<   xfssyncd        -
>  3151 S<   user_dlm        worker_thread
>  5321 R+   ps              -
>  5437 Ss   syslog-ng       631360160003
>  5557 Ss   iscsid          -
>  5558 S<Ls iscsid          16941990775338565631
>  5566 S<   scsi_eh_5       scsi_error_handler
>  5567 S<   scsi_wq_5       worker_thread
>  5711 Ss   portmap         -
>  5770 Ss   rpc.statd       -
>  5833 Ss   rpc.mountd      -
>  5835 S    lockd           -
>  5836 D    nfsd            ocfs2_wait_for_mask
>  5837 D    nfsd            ocfs2_wait_for_mask
>  5838 D    nfsd            ocfs2_wait_for_mask
>  5839 D    nfsd            ocfs2_wait_for_mask
>  5840 D    nfsd            ocfs2_wait_for_mask
>  5841 D    nfsd            ocfs2_wait_for_mask
>  5842 D    nfsd            ocfs2_wait_for_mask
>  5843 D    nfsd            ocfs2_wait_for_mask
>  5902 S<   o2net           worker_thread
>  5956 S<   o2hb-41535574BD -
>  5957 S<   ocfs2vote       ocfs2_vote_thread
>  5958 S<   dlm_thread      -
>  5959 S<   dlm_reco_thread -
>  5960 S<   dlm_wq          worker_thread
>  5961 S<   kjournald       kjournald
>  5962 S<   ocfs2cmt        ocfs2_commit_thread
>  6025 Ssl  mysqld          -
>  6038 Ss   apache2         -
>  6049 S    apache2         semtimedop
>  6051 S    apache2         -
>  6052 S    apache2         semtimedop
>  6053 S    apache2         semtimedop
>  6097 Ss   sshd            -
>  6115 S    apache2         semtimedop
>  6117 S    apache2         semtimedop
>  6257 Sl   gmetad          -
>  6332 Ss   gmond           -
>  6389 Ss   nrpe            -
>  6403 Ss   collector.pl    -
>  6468 Ssl  nscd            -
>  6533 Ss   ntpd            -
>  6590 Ss   smbd            -
>  6596 S    smbd            pause
>  6600 Ss   nmbd            -
>  6609 Ss   winbindd        -
>  6613 SL   winbindd        -
>  6667 Ss   cron            -
>  6904 S    sge_execd       -
>  6917 Ss+  agetty          -
>  6918 Ss+  agetty          -
>  6919 Ss+  agetty          -
>  6920 Ss+  agetty          -
>  6921 Ss+  agetty          -
>  6922 Ss+  agetty          -
>  7496 S    winbindd        -
>  7497 S    winbindd        -
>  7498 S    winbindd        -
> 11379 S    apache2         semtimedop
> 11382 S    apache2         semtimedop
> 11383 S    apache2         semtimedop
> 24045 S    apache2         semtimedop
>   PID STAT COMMAND         WIDE-WCHAN-COLUMN
>     1 Ss   init            -
>     2 S<   kthreadd        kthreadd
>     3 S<   migration/0     migration_thread
>     4 S<   ksoftirqd/0     ksoftirqd
>     5 S<   watchdog/0      watchdog
>     6 S<   migration/1     migration_thread
>     7 S<   ksoftirqd/1     ksoftirqd
>     8 S<   watchdog/1      watchdog
>     9 S<   migration/2     migration_thread
>    10 S<   ksoftirqd/2     ksoftirqd
>    11 S<   watchdog/2      watchdog
>    12 S<   migration/3     migration_thread
>    13 S<   ksoftirqd/3     ksoftirqd
>    14 S<   watchdog/3      watchdog
>    15 S<   events/0        worker_thread
>    16 S<   events/1        worker_thread
>    17 S<   events/2        worker_thread
>    18 S<   events/3        worker_thread
>    19 S<   khelper         worker_thread
>   105 S<   kblockd/0       worker_thread
>   106 S<   kblockd/1       worker_thread
>   107 S<   kblockd/2       worker_thread
>   108 S<   kblockd/3       worker_thread
>   112 S<   kacpid          worker_thread
>   113 S<   kacpi_notify    worker_thread
>   240 S<   ata/0           worker_thread
>   241 S<   ata/1           worker_thread
>   242 S<   ata/2           worker_thread
>   243 S<   ata/3           worker_thread
>   244 S<   ata_aux         worker_thread
>   245 S<   ksuspend_usbd   worker_thread
>   251 S<   khubd           hub_thread
>   254 S<   kseriod         serio_thread
>   318 S    pdflush         pdflush
>   319 S    pdflush         pdflush
>   320 S<   kswapd0         kswapd
>   321 S<   aio/0           worker_thread
>   322 S<   aio/1           worker_thread
>   323 S<   aio/2           worker_thread
>   324 S<   aio/3           worker_thread
>   325 S<   cifsoplockd     -
>   326 S<   cifsdnotifyd    -
>   327 S<   jfsIO           jfsIOWait
>   328 S<   jfsCommit       jfs_lazycommit
>   329 S<   jfsCommit       jfs_lazycommit
>   330 S<   jfsCommit       jfs_lazycommit
>   331 S<   jfsCommit       jfs_lazycommit
>   332 S<   jfsSync         jfs_sync
>   333 S<   xfslogd/0       worker_thread
>   334 S<   xfslogd/1       worker_thread
>   335 S<   xfslogd/2       worker_thread
>   336 S<   xfslogd/3       worker_thread
>   337 S<   xfsdatad/0      worker_thread
>   338 S<   xfsdatad/1      worker_thread
>   339 S<   xfsdatad/2      worker_thread
>   340 S<   xfsdatad/3      worker_thread
>   341 S<   xfs_mru_cache   worker_thread
>   697 Ss   sshd            -
>   700 S    sshd            -
>   701 Ss+  bash            -
>  1035 S<   scsi_eh_0       scsi_error_handler
>  1066 S<   scsi_eh_1       scsi_error_handler
>  1068 S<   scsi_eh_2       scsi_error_handler
>  1078 S<   khpsbpkt        hpsbpkt_thread
>  1157 S<   scsi_eh_3       scsi_error_handler
>  1158 S<   usb-storage     -
>  1161 S<   scsi_eh_4       scsi_error_handler
>  1162 S<   usb-storage     -
>  1177 S<   kpsmoused       worker_thread
>  1180 S<   kondemand/0     worker_thread
>  1181 S<   kondemand/1     worker_thread
>  1182 S<   kondemand/2     worker_thread
>  1183 S<   kondemand/3     worker_thread
>  1199 S<   rpciod/0        worker_thread
>  1200 S<   rpciod/1        worker_thread
>  1201 S<   rpciod/2        worker_thread
>  1202 S<   rpciod/3        worker_thread
>  1203 S<   reiserfs/0      worker_thread
>  1204 S<   reiserfs/1      worker_thread
>  1205 S<   reiserfs/2      worker_thread
>  1206 S<   reiserfs/3      worker_thread
>  1298 S<s  udevd           -
>  2988 Ss   sshd            -
>  2990 S    sshd            -
>  2991 Ss   bash            wait
>  3118 S<   ocfs2_wq        worker_thread
>  3143 S<   xfsbufd         -
>  3144 S<   xfssyncd        -
>  3151 S<   user_dlm        worker_thread
>  5322 R+   ps              -
>  5437 Ss   syslog-ng       631360160003
>  5557 Ss   iscsid          -
>  5558 S<Ls iscsid          16941990775338565631
>  5566 S<   scsi_eh_5       scsi_error_handler
>  5567 S<   scsi_wq_5       worker_thread
>  5711 Ss   portmap         -
>  5770 Ss   rpc.statd       -
>  5833 Ss   rpc.mountd      -
>  5835 S    lockd           -
>  5836 D    nfsd            ocfs2_wait_for_mask
>  5837 D    nfsd            ocfs2_wait_for_mask
>  5838 D    nfsd            ocfs2_wait_for_mask
>  5839 D    nfsd            ocfs2_wait_for_mask
>  5840 D    nfsd            ocfs2_wait_for_mask
>  5841 D    nfsd            ocfs2_wait_for_mask
>  5842 D    nfsd            ocfs2_wait_for_mask
>  5843 D    nfsd            ocfs2_wait_for_mask
>  5902 S<   o2net           worker_thread
>  5956 S<   o2hb-41535574BD -
>  5957 S<   ocfs2vote       ocfs2_vote_thread
>  5958 S<   dlm_thread      -
>  5959 S<   dlm_reco_thread -
>  5960 S<   dlm_wq          worker_thread
>  5961 S<   kjournald       kjournald
>  5962 S<   ocfs2cmt        ocfs2_commit_thread
>  6025 Ssl  mysqld          -
>  6038 Ss   apache2         -
>  6049 S    apache2         semtimedop
>  6051 S    apache2         -
>  6052 S    apache2         semtimedop
>  6053 S    apache2         semtimedop
>  6097 Ss   sshd            -
>  6115 S    apache2         semtimedop
>  6117 S    apache2         semtimedop
>  6257 Sl   gmetad          -
>  6332 Ss   gmond           -
>  6389 Ss   nrpe            -
>  6403 Ss   collector.pl    -
>  6468 Ssl  nscd            -
>  6533 Ss   ntpd            -
>  6590 Ss   smbd            -
>  6596 S    smbd            pause
>  6600 Ss   nmbd            -
>  6609 Ss   winbindd        -
>  6613 SL   winbindd        -
>  6667 Ss   cron            -
>  6904 S    sge_execd       -
>  6917 Ss+  agetty          -
>  6918 Ss+  agetty          -
>  6919 Ss+  agetty          -
>  6920 Ss+  agetty          -
>  6921 Ss+  agetty          -
>  6922 Ss+  agetty          -
>  7496 S    winbindd        -
>  7497 S    winbindd        -
>  7498 S    winbindd        -
> 11379 S    apache2         semtimedop
> 11382 S    apache2         semtimedop
> 11383 S    apache2         semtimedop
> 24045 S    apache2         semtimedop
>   PID STAT COMMAND         WIDE-WCHAN-COLUMN
>     1 Ss   init            -
>     2 S<   kthreadd        kthreadd
>     3 S<   migration/0     migration_thread
>     4 S<   ksoftirqd/0     ksoftirqd
>     5 S<   watchdog/0      watchdog
>     6 S<   migration/1     migration_thread
>     7 S<   ksoftirqd/1     ksoftirqd
>     8 S<   watchdog/1      watchdog
>     9 S<   migration/2     migration_thread
>    10 S<   ksoftirqd/2     ksoftirqd
>    11 S<   watchdog/2      watchdog
>    12 S<   migration/3     migration_thread
>    13 S<   ksoftirqd/3     ksoftirqd
>    14 S<   watchdog/3      watchdog
>    15 S<   events/0        worker_thread
>    16 S<   events/1        worker_thread
>    17 S<   events/2        worker_thread
>    18 S<   events/3        worker_thread
>    19 S<   khelper         worker_thread
>   105 S<   kblockd/0       worker_thread
>   106 S<   kblockd/1       worker_thread
>   107 S<   kblockd/2       worker_thread
>   108 S<   kblockd/3       worker_thread
>   112 S<   kacpid          worker_thread
>   113 S<   kacpi_notify    worker_thread
>   240 S<   ata/0           worker_thread
>   241 S<   ata/1           worker_thread
>   242 S<   ata/2           worker_thread
>   243 S<   ata/3           worker_thread
>   244 S<   ata_aux         worker_thread
>   245 S<   ksuspend_usbd   worker_thread
>   251 S<   khubd           hub_thread
>   254 S<   kseriod         serio_thread
>   318 S    pdflush         pdflush
>   319 S    pdflush         pdflush
>   320 S<   kswapd0         kswapd
>   321 S<   aio/0           worker_thread
>   322 S<   aio/1           worker_thread
>   323 S<   aio/2           worker_thread
>   324 S<   aio/3           worker_thread
>   325 S<   cifsoplockd     -
>   326 S<   cifsdnotifyd    -
>   327 S<   jfsIO           jfsIOWait
>   328 S<   jfsCommit       jfs_lazycommit
>   329 S<   jfsCommit       jfs_lazycommit
>   330 S<   jfsCommit       jfs_lazycommit
>   331 S<   jfsCommit       jfs_lazycommit
>   332 S<   jfsSync         jfs_sync
>   333 S<   xfslogd/0       worker_thread
>   334 S<   xfslogd/1       worker_thread
>   335 S<   xfslogd/2       worker_thread
>   336 S<   xfslogd/3       worker_thread
>   337 S<   xfsdatad/0      worker_thread
>   338 S<   xfsdatad/1      worker_thread
>   339 S<   xfsdatad/2      worker_thread
>   340 S<   xfsdatad/3      worker_thread
>   341 S<   xfs_mru_cache   worker_thread
>   697 Ss   sshd            -
>   700 S    sshd            -
>   701 Ss+  bash            -
>  1035 S<   scsi_eh_0       scsi_error_handler
>  1066 S<   scsi_eh_1       scsi_error_handler
>  1068 S<   scsi_eh_2       scsi_error_handler
>  1078 S<   khpsbpkt        hpsbpkt_thread
>  1157 S<   scsi_eh_3       scsi_error_handler
>  1158 S<   usb-storage     -
>  1161 S<   scsi_eh_4       scsi_error_handler
>  1162 S<   usb-storage     -
>  1177 S<   kpsmoused       worker_thread
>  1180 S<   kondemand/0     worker_thread
>  1181 S<   kondemand/1     worker_thread
>  1182 S<   kondemand/2     worker_thread
>  1183 S<   kondemand/3     worker_thread
>  1199 S<   rpciod/0        worker_thread
>  1200 S<   rpciod/1        worker_thread
>  1201 S<   rpciod/2        worker_thread
>  1202 S<   rpciod/3        worker_thread
>  1203 S<   reiserfs/0      worker_thread
>  1204 S<   reiserfs/1      worker_thread
>  1205 S<   reiserfs/2      worker_thread
>  1206 S<   reiserfs/3      worker_thread
>  1298 S<s  udevd           -
>  2988 Ss   sshd            -
>  2990 S    sshd            -
>  2991 Ss   bash            wait
>  3118 S<   ocfs2_wq        worker_thread
>  3143 S<   xfsbufd         -
>  3144 S<   xfssyncd        -
>  3151 S<   user_dlm        worker_thread
>  5326 R+   ps              -
>  5437 Ss   syslog-ng       631360160003
>  5557 Ss   iscsid          -
>  5558 S<Ls iscsid          16941990775338565631
>  5566 S<   scsi_eh_5       scsi_error_handler
>  5567 S<   scsi_wq_5       worker_thread
>  5711 Ss   portmap         -
>  5770 Ss   rpc.statd       -
>  5833 Ss   rpc.mountd      -
>  5835 S    lockd           -
>  5836 D    nfsd            ocfs2_wait_for_mask
>  5837 D    nfsd            ocfs2_wait_for_mask
>  5838 D    nfsd            ocfs2_wait_for_mask
>  5839 D    nfsd            ocfs2_wait_for_mask
>  5840 D    nfsd            ocfs2_wait_for_mask
>  5841 D    nfsd            ocfs2_wait_for_mask
>  5842 D    nfsd            ocfs2_wait_for_mask
>  5843 D    nfsd            ocfs2_wait_for_mask
>  5902 S<   o2net           worker_thread
>  5956 S<   o2hb-41535574BD -
>  5957 S<   ocfs2vote       ocfs2_vote_thread
>  5958 S<   dlm_thread      -
>  5959 S<   dlm_reco_thread -
>  5960 S<   dlm_wq          worker_thread
>  5961 S<   kjournald       kjournald
>  5962 S<   ocfs2cmt        ocfs2_commit_thread
>  6025 Ssl  mysqld          -
>  6038 Ss   apache2         -
>  6049 S    apache2         semtimedop
>  6051 S    apache2         -
>  6052 S    apache2         semtimedop
>  6053 S    apache2         semtimedop
>  6097 Ss   sshd            -
>  6115 S    apache2         semtimedop
>  6117 S    apache2         semtimedop
>  6257 Sl   gmetad          -
>  6332 Ss   gmond           -
>  6389 Ss   nrpe            -
>  6403 Ss   collector.pl    -
>  6468 Ssl  nscd            -
>  6533 Ss   ntpd            -
>  6590 Ss   smbd            -
>  6596 S    smbd            pause
>  6600 Ss   nmbd            -
>  6609 Ss   winbindd        -
>  6613 SL   winbindd        -
>  6667 Ss   cron            -
>  6904 S    sge_execd       -
>  6917 Ss+  agetty          -
>  6918 Ss+  agetty          -
>  6919 Ss+  agetty          -
>  6920 Ss+  agetty          -
>  6921 Ss+  agetty          -
>  6922 Ss+  agetty          -
>  7496 S    winbindd        -
>  7497 S    winbindd        -
>  7498 S    winbindd        -
> 11379 S    apache2         semtimedop
> 11382 S    apache2         semtimedop
> 11383 S    apache2         semtimedop
> 24045 S    apache2         semtimedop
>   PID STAT COMMAND         WIDE-WCHAN-COLUMN
>     1 Ss   init            -
>     2 S<   kthreadd        kthreadd
>     3 S<   migration/0     migration_thread
>     4 S<   ksoftirqd/0     ksoftirqd
>     5 S<   watchdog/0      watchdog
>     6 S<   migration/1     migration_thread
>     7 S<   ksoftirqd/1     ksoftirqd
>     8 S<   watchdog/1      watchdog
>     9 S<   migration/2     migration_thread
>    10 S<   ksoftirqd/2     ksoftirqd
>    11 S<   watchdog/2      watchdog
>    12 S<   migration/3     migration_thread
>    13 S<   ksoftirqd/3     ksoftirqd
>    14 S<   watchdog/3      watchdog
>    15 S<   events/0        worker_thread
>    16 S<   events/1        worker_thread
>    17 S<   events/2        worker_thread
>    18 S<   events/3        worker_thread
>    19 S<   khelper         worker_thread
>   105 S<   kblockd/0       worker_thread
>   106 S<   kblockd/1       worker_thread
>   107 S<   kblockd/2       worker_thread
>   108 S<   kblockd/3       worker_thread
>   112 S<   kacpid          worker_thread
>   113 S<   kacpi_notify    worker_thread
>   240 S<   ata/0           worker_thread
>   241 S<   ata/1           worker_thread
>   242 S<   ata/2           worker_thread
>   243 S<   ata/3           worker_thread
>   244 S<   ata_aux         worker_thread
>   245 S<   ksuspend_usbd   worker_thread
>   251 S<   khubd           hub_thread
>   254 S<   kseriod         serio_thread
>   318 S    pdflush         pdflush
>   319 S    pdflush         pdflush
>   320 S<   kswapd0         kswapd
>   321 S<   aio/0           worker_thread
>   322 S<   aio/1           worker_thread
>   323 S<   aio/2           worker_thread
>   324 S<   aio/3           worker_thread
>   325 S<   cifsoplockd     -
>   326 S<   cifsdnotifyd    -
>   327 S<   jfsIO           jfsIOWait
>   328 S<   jfsCommit       jfs_lazycommit
>   329 S<   jfsCommit       jfs_lazycommit
>   330 S<   jfsCommit       jfs_lazycommit
>   331 S<   jfsCommit       jfs_lazycommit
>   332 S<   jfsSync         jfs_sync
>   333 S<   xfslogd/0       worker_thread
>   334 S<   xfslogd/1       worker_thread
>   335 S<   xfslogd/2       worker_thread
>   336 S<   xfslogd/3       worker_thread
>   337 S<   xfsdatad/0      worker_thread
>   338 S<   xfsdatad/1      worker_thread
>   339 S<   xfsdatad/2      worker_thread
>   340 S<   xfsdatad/3      worker_thread
>   341 S<   xfs_mru_cache   worker_thread
>   697 Ss   sshd            -
>   700 S    sshd            -
>   701 Ss+  bash            -
>  1035 S<   scsi_eh_0       scsi_error_handler
>  1066 S<   scsi_eh_1       scsi_error_handler
>  1068 S<   scsi_eh_2       scsi_error_handler
>  1078 S<   khpsbpkt        hpsbpkt_thread
>  1157 S<   scsi_eh_3       scsi_error_handler
>  1158 S<   usb-storage     -
>  1161 S<   scsi_eh_4       scsi_error_handler
>  1162 S<   usb-storage     -
>  1177 S<   kpsmoused       worker_thread
>  1180 S<   kondemand/0     worker_thread
>  1181 S<   kondemand/1     worker_thread
>  1182 S<   kondemand/2     worker_thread
>  1183 S<   kondemand/3     worker_thread
>  1199 S<   rpciod/0        worker_thread
>  1200 S<   rpciod/1        worker_thread
>  1201 S<   rpciod/2        worker_thread
>  1202 S<   rpciod/3        worker_thread
>  1203 S<   reiserfs/0      worker_thread
>  1204 S<   reiserfs/1      worker_thread
>  1205 S<   reiserfs/2      worker_thread
>  1206 S<   reiserfs/3      worker_thread
>  1298 S<s  udevd           -
>  2988 Ss   sshd            -
>  2990 S    sshd            -
>  2991 Ss   bash            wait
>  3118 S<   ocfs2_wq        worker_thread
>  3143 S<   xfsbufd         -
>  3144 S<   xfssyncd        -
>  3151 S<   user_dlm        worker_thread
>  5327 R+   ps              -
>  5437 Ss   syslog-ng       631360160003
>  5557 Ss   iscsid          -
>  5558 S<Ls iscsid          16941990775338565631
>  5566 S<   scsi_eh_5       scsi_error_handler
>  5567 S<   scsi_wq_5       worker_thread
>  5711 Ss   portmap         -
>  5770 Ss   rpc.statd       -
>  5833 Ss   rpc.mountd      -
>  5835 S    lockd           -
>  5836 D    nfsd            ocfs2_wait_for_mask
>  5837 D    nfsd            ocfs2_wait_for_mask
>  5838 D    nfsd            ocfs2_wait_for_mask
>  5839 D    nfsd            ocfs2_wait_for_mask
>  5840 D    nfsd            ocfs2_wait_for_mask
>  5841 D    nfsd            ocfs2_wait_for_mask
>  5842 D    nfsd            ocfs2_wait_for_mask
>  5843 D    nfsd            ocfs2_wait_for_mask
>  5902 S<   o2net           worker_thread
>  5956 S<   o2hb-41535574BD -
>  5957 S<   ocfs2vote       ocfs2_vote_thread
>  5958 S<   dlm_thread      -
>  5959 S<   dlm_reco_thread -
>  5960 S<   dlm_wq          worker_thread
>  5961 S<   kjournald       kjournald
>  5962 S<   ocfs2cmt        ocfs2_commit_thread
>  6025 Ssl  mysqld          -
>  6038 Ss   apache2         -
>  6049 S    apache2         semtimedop
>  6051 S    apache2         -
>  6052 S    apache2         semtimedop
>  6053 S    apache2         semtimedop
>  6097 Ss   sshd            -
>  6115 S    apache2         semtimedop
>  6117 S    apache2         semtimedop
>  6257 Sl   gmetad          -
>  6332 Ss   gmond           -
>  6389 Ss   nrpe            -
>  6403 Ss   collector.pl    -
>  6468 Ssl  nscd            -
>  6533 Ss   ntpd            -
>  6590 Ss   smbd            -
>  6596 S    smbd            pause
>  6600 Ss   nmbd            -
>  6609 Ss   winbindd        -
>  6613 SL   winbindd        -
>  6667 Ss   cron            -
>  6904 S    sge_execd       -
>  6917 Ss+  agetty          -
>  6918 Ss+  agetty          -
>  6919 Ss+  agetty          -
>  6920 Ss+  agetty          -
>  6921 Ss+  agetty          -
>  6922 Ss+  agetty          -
>  7496 S    winbindd        -
>  7497 S    winbindd        -
>  7498 S    winbindd        -
> 11379 S    apache2         semtimedop
> 11382 S    apache2         semtimedop
> 11383 S    apache2         semtimedop
> 24045 S    apache2         semtimedop
>   PID STAT COMMAND         WIDE-WCHAN-COLUMN
>     1 Ss   init            -
>     2 S<   kthreadd        kthreadd
>     3 S<   migration/0     migration_thread
>     4 S<   ksoftirqd/0     ksoftirqd
>     5 S<   watchdog/0      watchdog
>     6 S<   migration/1     migration_thread
>     7 S<   ksoftirqd/1     ksoftirqd
>     8 S<   watchdog/1      watchdog
>     9 S<   migration/2     migration_thread
>    10 S<   ksoftirqd/2     ksoftirqd
>    11 S<   watchdog/2      watchdog
>    12 S<   migration/3     migration_thread
>    13 S<   ksoftirqd/3     ksoftirqd
>    14 S<   watchdog/3      watchdog
>    15 S<   events/0        worker_thread
>    16 S<   events/1        worker_thread
>    17 S<   events/2        worker_thread
>    18 S<   events/3        worker_thread
>    19 S<   khelper         worker_thread
>   105 S<   kblockd/0       worker_thread
>   106 S<   kblockd/1       worker_thread
>   107 S<   kblockd/2       worker_thread
>   108 S<   kblockd/3       worker_thread
>   112 S<   kacpid          worker_thread
>   113 S<   kacpi_notify    worker_thread
>   240 S<   ata/0           worker_thread
>   241 S<   ata/1           worker_thread
>   242 S<   ata/2           worker_thread
>   243 S<   ata/3           worker_thread
>   244 S<   ata_aux         worker_thread
>   245 S<   ksuspend_usbd   worker_thread
>   251 S<   khubd           hub_thread
>   254 S<   kseriod         serio_thread
>   318 S    pdflush         pdflush
>   319 S    pdflush         pdflush
>   320 S<   kswapd0         kswapd
>   321 S<   aio/0           worker_thread
>   322 S<   aio/1           worker_thread
>   323 S<   aio/2           worker_thread
>   324 S<   aio/3           worker_thread
>   325 S<   cifsoplockd     -
>   326 S<   cifsdnotifyd    -
>   327 S<   jfsIO           jfsIOWait
>   328 S<   jfsCommit       jfs_lazycommit
>   329 S<   jfsCommit       jfs_lazycommit
>   330 S<   jfsCommit       jfs_lazycommit
>   331 S<   jfsCommit       jfs_lazycommit
>   332 S<   jfsSync         jfs_sync
>   333 S<   xfslogd/0       worker_thread
>   334 S<   xfslogd/1       worker_thread
>   335 S<   xfslogd/2       worker_thread
>   336 S<   xfslogd/3       worker_thread
>   337 S<   xfsdatad/0      worker_thread
>   338 S<   xfsdatad/1      worker_thread
>   339 S<   xfsdatad/2      worker_thread
>   340 S<   xfsdatad/3      worker_thread
>   341 S<   xfs_mru_cache   worker_thread
>   697 Ss   sshd            -
>   700 S    sshd            -
>   701 Ss+  bash            -
>  1035 S<   scsi_eh_0       scsi_error_handler
>  1066 S<   scsi_eh_1       scsi_error_handler
>  1068 S<   scsi_eh_2       scsi_error_handler
>  1078 S<   khpsbpkt        hpsbpkt_thread
>  1157 S<   scsi_eh_3       scsi_error_handler
>  1158 S<   usb-storage     -
>  1161 S<   scsi_eh_4       scsi_error_handler
>  1162 S<   usb-storage     -
>  1177 S<   kpsmoused       worker_thread
>  1180 S<   kondemand/0     worker_thread
>  1181 S<   kondemand/1     worker_thread
>  1182 S<   kondemand/2     worker_thread
>  1183 S<   kondemand/3     worker_thread
>  1199 S<   rpciod/0        worker_thread
>  1200 S<   rpciod/1        worker_thread
>  1201 S<   rpciod/2        worker_thread
>  1202 S<   rpciod/3        worker_thread
>  1203 S<   reiserfs/0      worker_thread
>  1204 S<   reiserfs/1      worker_thread
>  1205 S<   reiserfs/2      worker_thread
>  1206 S<   reiserfs/3      worker_thread
>  1298 S<s  udevd           -
>  2988 Ss   sshd            -
>  2990 S    sshd            -
>  2991 Ss   bash            wait
>  3118 S<   ocfs2_wq        worker_thread
>  3143 S<   xfsbufd         -
>  3144 S<   xfssyncd        -
>  3151 S<   user_dlm        worker_thread
>  5328 R+   ps              -
>  5437 Ss   syslog-ng       631360160003
>  5557 Ss   iscsid          -
>  5558 S<Ls iscsid          16941990775338565631
>  5566 S<   scsi_eh_5       scsi_error_handler
>  5567 S<   scsi_wq_5       worker_thread
>  5711 Ss   portmap         -
>  5770 Ss   rpc.statd       -
>  5833 Ss   rpc.mountd      -
>  5835 S    lockd           -
>  5836 D    nfsd            ocfs2_wait_for_mask
>  5837 D    nfsd            ocfs2_wait_for_mask
>  5838 D    nfsd            ocfs2_wait_for_mask
>  5839 D    nfsd            ocfs2_wait_for_mask
>  5840 D    nfsd            ocfs2_wait_for_mask
>  5841 D    nfsd            ocfs2_wait_for_mask
>  5842 D    nfsd            ocfs2_wait_for_mask
>  5843 D    nfsd            ocfs2_wait_for_mask
>  5902 S<   o2net           worker_thread
>  5956 S<   o2hb-41535574BD -
>  5957 S<   ocfs2vote       ocfs2_vote_thread
>  5958 S<   dlm_thread      -
>  5959 S<   dlm_reco_thread -
>  5960 S<   dlm_wq          worker_thread
>  5961 S<   kjournald       kjournald
>  5962 S<   ocfs2cmt        ocfs2_commit_thread
>  6025 Ssl  mysqld          -
>  6038 Ss   apache2         -
>  6049 S    apache2         semtimedop
>  6051 S    apache2         -
>  6052 S    apache2         semtimedop
>  6053 S    apache2         semtimedop
>  6097 Ss   sshd            -
>  6115 S    apache2         semtimedop
>  6117 S    apache2         semtimedop
>  6257 Sl   gmetad          -
>  6332 Ss   gmond           9249877474039267586
>  6389 Ss   nrpe            -
>  6403 Ss   collector.pl    -
>  6468 Ssl  nscd            -
>  6533 Ss   ntpd            -
>  6590 Ss   smbd            -
>  6596 S    smbd            pause
>  6600 Ss   nmbd            -
>  6609 Ss   winbindd        -
>  6613 SL   winbindd        -
>  6667 Ss   cron            -
>  6904 S    sge_execd       -
>  6917 Ss+  agetty          -
>  6918 Ss+  agetty          -
>  6919 Ss+  agetty          -
>  6920 Ss+  agetty          -
>  6921 Ss+  agetty          -
>  6922 Ss+  agetty          -
>  7496 S    winbindd        -
>  7497 S    winbindd        -
>  7498 S    winbindd        -
> 11379 S    apache2         semtimedop
> 11382 S    apache2         semtimedop
> 11383 S    apache2         semtimedop
> 24045 S    apache2         semtimedop
>   PID STAT COMMAND         WIDE-WCHAN-COLUMN
>     1 Ss   init            -
>     2 S<   kthreadd        kthreadd
>     3 S<   migration/0     migration_thread
>     4 S<   ksoftirqd/0     ksoftirqd
>     5 S<   watchdog/0      watchdog
>     6 S<   migration/1     migration_thread
>     7 S<   ksoftirqd/1     ksoftirqd
>     8 S<   watchdog/1      watchdog
>     9 S<   migration/2     migration_thread
>    10 S<   ksoftirqd/2     ksoftirqd
>    11 S<   watchdog/2      watchdog
>    12 S<   migration/3     migration_thread
>    13 S<   ksoftirqd/3     ksoftirqd
>    14 S<   watchdog/3      watchdog
>    15 S<   events/0        worker_thread
>    16 S<   events/1        worker_thread
>    17 S<   events/2        worker_thread
>    18 S<   events/3        worker_thread
>    19 S<   khelper         worker_thread
>   105 S<   kblockd/0       worker_thread
>   106 S<   kblockd/1       worker_thread
>   107 S<   kblockd/2       worker_thread
>   108 S<   kblockd/3       worker_thread
>   112 S<   kacpid          worker_thread
>   113 S<   kacpi_notify    worker_thread
>   240 S<   ata/0           worker_thread
>   241 S<   ata/1           worker_thread
>   242 S<   ata/2           worker_thread
>   243 S<   ata/3           worker_thread
>   244 S<   ata_aux         worker_thread
>   245 S<   ksuspend_usbd   worker_thread
>   251 S<   khubd           hub_thread
>   254 S<   kseriod         serio_thread
>   318 S    pdflush         pdflush
>   319 S    pdflush         pdflush
>   320 S<   kswapd0         kswapd
>   321 S<   aio/0           worker_thread
>   322 S<   aio/1           worker_thread
>   323 S<   aio/2           worker_thread
>   324 S<   aio/3           worker_thread
>   325 S<   cifsoplockd     -
>   326 S<   cifsdnotifyd    -
>   327 S<   jfsIO           jfsIOWait
>   328 S<   jfsCommit       jfs_lazycommit
>   329 S<   jfsCommit       jfs_lazycommit
>   330 S<   jfsCommit       jfs_lazycommit
>   331 S<   jfsCommit       jfs_lazycommit
>   332 S<   jfsSync         jfs_sync
>   333 S<   xfslogd/0       worker_thread
>   334 S<   xfslogd/1       worker_thread
>   335 S<   xfslogd/2       worker_thread
>   336 S<   xfslogd/3       worker_thread
>   337 S<   xfsdatad/0      worker_thread
>   338 S<   xfsdatad/1      worker_thread
>   339 S<   xfsdatad/2      worker_thread
>   340 S<   xfsdatad/3      worker_thread
>   341 S<   xfs_mru_cache   worker_thread
>   697 Ss   sshd            -
>   700 S    sshd            -
>   701 Ss+  bash            -
>  1035 S<   scsi_eh_0       scsi_error_handler
>  1066 S<   scsi_eh_1       scsi_error_handler
>  1068 S<   scsi_eh_2       scsi_error_handler
>  1078 S<   khpsbpkt        hpsbpkt_thread
>  1157 S<   scsi_eh_3       scsi_error_handler
>  1158 S<   usb-storage     -
>  1161 S<   scsi_eh_4       scsi_error_handler
>  1162 S<   usb-storage     -
>  1177 S<   kpsmoused       worker_thread
>  1180 S<   kondemand/0     worker_thread
>  1181 S<   kondemand/1     worker_thread
>  1182 S<   kondemand/2     worker_thread
>  1183 S<   kondemand/3     worker_thread
>  1199 S<   rpciod/0        worker_thread
>  1200 S<   rpciod/1        worker_thread
>  1201 S<   rpciod/2        worker_thread
>  1202 S<   rpciod/3        worker_thread
>  1203 S<   reiserfs/0      worker_thread
>  1204 S<   reiserfs/1      worker_thread
>  1205 S<   reiserfs/2      worker_thread
>  1206 S<   reiserfs/3      worker_thread
>  1298 S<s  udevd           -
>  2988 Ss   sshd            -
>  2990 S    sshd            -
>  2991 Ss   bash            wait
>  3118 S<   ocfs2_wq        worker_thread
>  3143 S<   xfsbufd         -
>  3144 S<   xfssyncd        -
>  3151 S<   user_dlm        worker_thread
>  5329 R+   ps              -
>  5437 Ss   syslog-ng       631360160003
>  5557 Ss   iscsid          -
>  5558 S<Ls iscsid          16941990775338565631
>  5566 S<   scsi_eh_5       scsi_error_handler
>  5567 S<   scsi_wq_5       worker_thread
>  5711 Ss   portmap         -
>  5770 Ss   rpc.statd       -
>  5833 Ss   rpc.mountd      -
>  5835 S    lockd           -
>  5836 D    nfsd            ocfs2_wait_for_mask
>  5837 D    nfsd            ocfs2_wait_for_mask
>  5838 D    nfsd            ocfs2_wait_for_mask
>  5839 D    nfsd            ocfs2_wait_for_mask
>  5840 D    nfsd            ocfs2_wait_for_mask
>  5841 D    nfsd            ocfs2_wait_for_mask
>  5842 D    nfsd            ocfs2_wait_for_mask
>  5843 D    nfsd            ocfs2_wait_for_mask
>  5902 S<   o2net           worker_thread
>  5956 S<   o2hb-41535574BD -
>  5957 S<   ocfs2vote       ocfs2_vote_thread
>  5958 S<   dlm_thread      -
>  5959 S<   dlm_reco_thread -
>  5960 S<   dlm_wq          worker_thread
>  5961 S<   kjournald       kjournald
>  5962 S<   ocfs2cmt        ocfs2_commit_thread
>  6025 Ssl  mysqld          -
>  6038 Ss   apache2         -
>  6049 S    apache2         semtimedop
>  6051 S    apache2         -
>  6052 S    apache2         semtimedop
>  6053 S    apache2         semtimedop
>  6097 Ss   sshd            -
>  6115 S    apache2         semtimedop
>  6117 S    apache2         semtimedop
>  6257 Sl   gmetad          -
>  6332 Ss   gmond           -
>  6389 Ss   nrpe            -
>  6403 Ss   collector.pl    -
>  6468 Ssl  nscd            -
>  6533 Ss   ntpd            -
>  6590 Ss   smbd            -
>  6596 S    smbd            pause
>  6600 Ss   nmbd            -
>  6609 Ss   winbindd        -
>  6613 SL   winbindd        -
>  6667 Ss   cron            -
>  6904 S    sge_execd       -
>  6917 Ss+  agetty          -
>  6918 Ss+  agetty          -
>  6919 Ss+  agetty          -
>  6920 Ss+  agetty          -
>  6921 Ss+  agetty          -
>  6922 Ss+  agetty          -
>  7496 S    winbindd        -
>  7497 S    winbindd        -
>  7498 S    winbindd        -
> 11379 S    apache2         semtimedop
> 11382 S    apache2         semtimedop
> 11383 S    apache2         semtimedop
> 24045 S    apache2         semtimedop
>
>
> Alexandre Racine
> alexandre.racine at mhicc.org
> 514-461-1300 poste 3303
>
>
>   
>> -----Original Message-----
>> From: Sunil Mushran [mailto:Sunil.Mushran at oracle.com]
>> Sent: 2 juin 2008 12:13
>> To: Alexandre Racine
>> Cc: ocfs2-users at oss.oracle.com
>> Subject: Re: [Ocfs2-users] huge "something" problem
>>
>> That means o2dlm is not the cause for the process hang.
>>
>> Next, run ps:
>> $ ps -e -o pid,stat,comm,wchan=WIDE-WCHAN-COLUMN
>>
>> Run it 6 times at 10-second intervals. This should tell us which
>> processes are in D state and where in the kernel they are waiting.
>> Hopefully.
>>
>> The next step is to get the stack trace with alt-sysrq-t. That is more
>> invasive, so I won't recommend it yet.
>>
>> Alexandre Racine wrote:
>>     
>>> Ok. I have the same problem now (server load at 8.00, no processor
>>> above 5% utilization, and a user can't access his folder).
>>>
>>> I ran your commands, but there is no real data here...
>>>
>>> racinea@ srv2 /mnt/data/testOCFS2 $ sudo ./scanlocks2.sh
>>> racinea@ srv2 /mnt/data/testOCFS2 $ w
>>>  10:16:44 up 9 days, 19:10,  2 users,  load average: 8.43, 8.36, 8.14
>>> USER     TTY        LOGIN@   IDLE   JCPU   PCPU WHAT
>>> racinea  pts/1     10:13    0.00s  0.02s  0.00s w
>>> racinea@ srv2 /mnt/data/testOCFS2 $ sudo ./listdomains.sh
>>> 41535574BDEB4720B2CE7819A631DF10  /dev/sdd
>>> /home
>>>
>>>
>>> What else could I try?
>>> Thanks.
>>>
>>>
>>>
>>> Alexandre Racine
>>> alexandre.racine at mhicc.org
>>> 514-461-1300 poste 3303
>>>
>>>       
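
For reference, the ps sampling requested in the quoted message (6 runs,
roughly 10 seconds apart) can be scripted. What follows is only a minimal
sketch and was not posted in the thread: the function names filter_d_state
and sample_d_state are made up for illustration, and it assumes a procps
ps, awk, and a POSIX shell.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of any OCFS2 tool.

# filter_d_state: read "ps -e -o pid,stat,comm,wchan" output on stdin and
# print only the rows whose STAT column starts with D (uninterruptible sleep).
# The header row is dropped because its second field is "STAT".
filter_d_state() {
    awk '$2 ~ /^D/ { print }'
}

# sample_d_state COUNT DELAY: take COUNT samples, DELAY seconds apart
# (defaults: 6 samples, 10 seconds), printing the D-state rows of each.
sample_d_state() {
    count=${1:-6}
    delay=${2:-10}
    i=1
    while [ "$i" -le "$count" ]; do
        echo "=== sample $i ==="
        ps -e -o pid,stat,comm,wchan=WIDE-WCHAN-COLUMN | filter_d_state
        # Sleep between samples, but not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
}
```

Running "sample_d_state 6 10 > /tmp/dstate.out" on each node would leave
only the blocked processes and their wait channels in the file, for example
the nfsd threads sitting in ocfs2_wait_for_mask, which is shorter to attach
to a bug report than the full process table.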
