[Ocfs2-users] ocfs2 - Kernel panic on many write/read from both servers

Marek Królikowski admin at wset.edu.pl
Sun Dec 4 02:15:56 PST 2011


I ran write/read tests on ocfs2 all night on both servers, using scripts 
like these:
On MAIL1 server:
#!/bin/bash
# Endlessly recreate the MAIL1 directory, copy /usr into it, then delete it.
while true
do
    rm -rf /mnt/EMC/MAIL1
    mkdir /mnt/EMC/MAIL1
    cp -r /usr /mnt/EMC/MAIL1
    rm -rf /mnt/EMC/MAIL1
done
On MAIL2 server:
#!/bin/bash
# Endlessly recreate the MAIL2 directory, copy /usr into it, then delete it.
while true
do
    rm -rf /mnt/EMC/MAIL2
    mkdir /mnt/EMC/MAIL2
    cp -r /usr /mnt/EMC/MAIL2
    rm -rf /mnt/EMC/MAIL2
done

Today I checked the logs and saw:
o2dlm: Node 1 joins domain EAC7942B71964050AE2046D3F0CDD7B2
o2dlm: Nodes in domain EAC7942B71964050AE2046D3F0CDD7B2: 0 1
(rm,26136,0):ocfs2_unlink:953 ERROR: status = -2
(touch,26137,0):ocfs2_check_dir_for_entry:2120 ERROR: status = -17
(touch,26137,0):ocfs2_mknod:461 ERROR: status = -17
(touch,26137,0):ocfs2_create:631 ERROR: status = -17
(rm,26142,0):ocfs2_unlink:953 ERROR: status = -2
INFO: task kworker/u:2:20246 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/u:2     D ffff88107f4525c0     0 20246      2 0x00000000
ffff880b730b57d0 0000000000000046 ffff8810201297d0 00000000000125c0
ffff880f5a399fd8 00000000000125c0 00000000000125c0 00000000000125c0
ffff880f5a398000 00000000000125c0 ffff880f5a399fd8 00000000000125c0
Call Trace:
[<ffffffff81481b71>] ? __mutex_lock_slowpath+0xd1/0x140
[<ffffffff814818d3>] ? mutex_lock+0x23/0x40
[<ffffffffa0937d95>] ? ocfs2_wipe_inode+0x105/0x690 [ocfs2]
[<ffffffffa0935cfb>] ? ocfs2_query_inode_wipe.clone.9+0xcb/0x370 [ocfs2]
[<ffffffffa09385a4>] ? ocfs2_delete_inode+0x284/0x3f0 [ocfs2]
[<ffffffffa0919a10>] ? ocfs2_dentry_attach_lock+0x5a0/0x5a0 [ocfs2]
[<ffffffffa093872e>] ? ocfs2_evict_inode+0x1e/0x50 [ocfs2]
[<ffffffff81145900>] ? evict+0x70/0x140
[<ffffffffa0919322>] ? __ocfs2_drop_dl_inodes.clone.2+0x32/0x60 [ocfs2]
[<ffffffffa0919a39>] ? ocfs2_drop_dl_inodes+0x29/0x90 [ocfs2]
[<ffffffff8106e56f>] ? process_one_work+0x11f/0x440
[<ffffffff8106f279>] ? worker_thread+0x159/0x330
[<ffffffff8106f120>] ? manage_workers.clone.21+0x120/0x120
[<ffffffff8106f120>] ? manage_workers.clone.21+0x120/0x120
[<ffffffff81073fa6>] ? kthread+0x96/0xa0
[<ffffffff8148bb24>] ? kernel_thread_helper+0x4/0x10
[<ffffffff81073f10>] ? kthread_worker_fn+0x1a0/0x1a0
[<ffffffff8148bb20>] ? gs_change+0x13/0x13
INFO: task rm:5192 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
rm              D ffff88107f2725c0     0  5192  16338 0x00000000
ffff881014ccb040 0000000000000082 ffff8810206b8040 00000000000125c0
ffff8804d7697fd8 00000000000125c0 00000000000125c0 00000000000125c0
ffff8804d7696000 00000000000125c0 ffff8804d7697fd8 00000000000125c0
Call Trace:
[<ffffffff8148148d>] ? schedule_timeout+0x1ed/0x2e0
[<ffffffffa0886162>] ? dlmconvert_master+0xe2/0x190 [ocfs2_dlm]
[<ffffffffa08878bf>] ? dlmlock+0x7f/0xb70 [ocfs2_dlm]
[<ffffffff81480e0a>] ? wait_for_common+0x13a/0x190
[<ffffffff8104bc50>] ? try_to_wake_up+0x280/0x280
[<ffffffffa0928a38>] ? __ocfs2_cluster_lock.clone.21+0x1d8/0x6b0 [ocfs2]
[<ffffffffa0928fcc>] ? ocfs2_inode_lock_full_nested+0xbc/0x490 [ocfs2]
[<ffffffffa0943c1b>] ? ocfs2_lookup_lock_orphan_dir+0x6b/0x1b0 [ocfs2]
[<ffffffffa09454ba>] ? ocfs2_prepare_orphan_dir+0x4a/0x280 [ocfs2]
[<ffffffffa094616f>] ? ocfs2_unlink+0x6ef/0xb90 [ocfs2]
[<ffffffff811b35a9>] ? may_link.clone.22+0xd9/0x170
[<ffffffff8113aa58>] ? vfs_unlink+0x98/0x100
[<ffffffff8113ac41>] ? do_unlinkat+0x181/0x1b0
[<ffffffff8113e7cd>] ? vfs_readdir+0x9d/0xe0
[<ffffffff811653d8>] ? fsnotify_find_inode_mark+0x28/0x40
[<ffffffff81166324>] ? dnotify_flush+0x54/0x110
[<ffffffff8112b07f>] ? filp_close+0x5f/0x90
[<ffffffff8148aa12>] ? system_call_fastpath+0x16/0x1b
[The two hung-task reports above, for kworker/u:2:20246 and rm:5192, repeat verbatim four more times in the log.]
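
A log like this is easier to read once the repeated "blocked for more than N seconds" reports are counted per task. The following is only a rough sketch, not part of the original report: summarize_hung_tasks is a hypothetical helper name, and it assumes the kernel log text arrives on standard input in the format shown above.

```shell
#!/bin/bash
# Hypothetical helper (not from the original post): read kernel log text on
# stdin and print one line per hung task as "<count> <name>:<pid>".
summarize_hung_tasks() {
    # Keep only the hung-task report lines, trimmed to the matching part.
    grep -o 'INFO: task .* blocked for more than [0-9]* seconds' \
        | sed -e 's/^INFO: task //' -e 's/ blocked for more than.*$//' \
        | sort \
        | uniq -c \
        | sed -e 's/^ *//'   # strip the padding uniq -c puts before the count
}
```

It could be used, for example, as `dmesg | summarize_hung_tasks`, or fed a saved log file with `summarize_hung_tasks < saved.log` (saved.log being a placeholder name).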



