On Jul 6, 2021, at 7:11 PM, Joseph Qi <jiangqi903@gmail.com> wrote:

<div class="">Hi Gang,<br class="">
Could you please describe the issue in the following way?<br class="">
<br class="">
Node 1<span class="Apple-tab-span" style="white-space:pre"> </span><span class="Apple-tab-span" style="white-space:pre"></span>Node 2<span class="Apple-tab-span" style="white-space:pre">
</span><span class="Apple-tab-span" style="white-space:pre"></span>Node 3<br class="">
...<br class="">
<span class="Apple-tab-span" style="white-space:pre"></span><span class="Apple-tab-span" style="white-space:pre"></span>...<br class="">
<span class="Apple-tab-span" style="white-space:pre"></span><span class="Apple-tab-span" style="white-space:pre"></span><span class="Apple-tab-span" style="white-space:pre"></span><span class="Apple-tab-span" style="white-space:pre"></span>...<br class="">
<br class="">
That would be more clearly for discussing.<br class="">
<br class="">
Thanks,<br class="">
Joseph<br class="">
<br class="">
On 7/1/21 6:56 PM, Gang He wrote:<br class="">
<blockquote type="cite" class="">Hi Guys,<br class="">
<br class="">
There are three node ocfs2 cluster, when the user run reflink command during ocfs2 node does recovery(e.g. one node is fenced).<br class="">
<br class="">
The hang problem was caused dlm dead lock between rksaph18 and rksaph19, the detailed processes are as below,<br class="">
<br class="">
>> Jun 01 12:33:10 rksaph18 kernel: task:reflink        state:D stack:    0 pid: 7515 ppid:  7439 flags:0x00004000
>> Jun 01 12:33:10 rksaph18 kernel: Call Trace:
>> Jun 01 12:33:10 rksaph18 kernel:  __schedule+0x2fd/0x750
>> Jun 01 12:33:10 rksaph18 kernel:  schedule+0x2f/0xa0
>> Jun 01 12:33:10 rksaph18 kernel:  schedule_timeout+0x1cc/0x310
>> Jun 01 12:33:10 rksaph18 kernel:  ? ocfs2_control_cfu+0x50/0x50 [ocfs2_stack_user]
>> Jun 01 12:33:10 rksaph18 kernel:  ? 0xffffffffc08f7000
>> Jun 01 12:33:10 rksaph18 kernel:  wait_for_completion+0xba/0x140
>> Jun 01 12:33:10 rksaph18 kernel:  ? wake_up_q+0xa0/0xa0
>> Jun 01 12:33:10 rksaph18 kernel:  __ocfs2_cluster_lock.isra.41+0x3b5/0x820 [ocfs2]
>> Jun 01 12:33:10 rksaph18 kernel:  ? ocfs2_inode_lock_full_nested+0x1fc/0x960 [ocfs2]
>> Jun 01 12:33:10 rksaph18 kernel:  ocfs2_inode_lock_full_nested+0x1fc/0x960 [ocfs2]
>> Jun 01 12:33:10 rksaph18 kernel:  ocfs2_mv_orphaned_inode_to_new+0x346/0x7e0 [ocfs2]
>> Jun 01 12:33:10 rksaph18 kernel:  ? _raw_spin_unlock_irqrestore+0x14/0x20
>> Jun 01 12:33:10 rksaph18 kernel:  ocfs2_reflink+0x335/0x4c0 [ocfs2]
>> Jun 01 12:33:10 rksaph18 kernel:  ? ocfs2_reflink_ioctl+0x2ca/0x360 [ocfs2]
>> Jun 01 12:33:10 rksaph18 kernel:  ocfs2_reflink_ioctl+0x2ca/0x360 [ocfs2]
>> Jun 01 12:33:10 rksaph18 kernel:  ocfs2_ioctl+0x25e/0x670 [ocfs2]
>> Jun 01 12:33:10 rksaph18 kernel:  do_vfs_ioctl+0xa0/0x680
>> Jun 01 12:33:10 rksaph18 kernel:  ksys_ioctl+0x70/0x80
>> Jun 01 12:33:10 rksaph18 kernel:  __x64_sys_ioctl+0x16/0x20
>> Jun 01 12:33:10 rksaph18 kernel:  do_syscall_64+0x5b/0x1e0
>> Jun 01 12:33:10 rksaph18 kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>> Jun 01 12:33:10 rksaph18 kernel: RIP: 0033:0x7f1bb2aaf9e7
>>
>> The reflink process is getting the orphan_dir_inode (a per-node ocfs2 system file) dlm lock after it has acquired the reflink target file's inode dlm lock; the reflink target file is placed in the orphan_dir_inode directory during the reflink operation.
>>
>> Jun 01 12:33:17 rksaph19 kernel: Workqueue: ocfs2_wq ocfs2_complete_recovery [ocfs2]
>> Jun 01 12:33:17 rksaph19 kernel: Call Trace:
>> Jun 01 12:33:17 rksaph19 kernel:  __schedule+0x2fd/0x750
>> Jun 01 12:33:17 rksaph19 kernel:  schedule+0x2f/0xa0
>> Jun 01 12:33:17 rksaph19 kernel:  schedule_timeout+0x1cc/0x310
>> Jun 01 12:33:17 rksaph19 kernel:  ? ocfs2_control_cfu+0x50/0x50 [ocfs2_stack_user]
>> Jun 01 12:33:17 rksaph19 kernel:  ? 0xffffffffc0862000
>> Jun 01 12:33:17 rksaph19 kernel:  wait_for_completion+0xba/0x140
>> Jun 01 12:33:17 rksaph19 kernel:  ? wake_up_q+0xa0/0xa0
>> Jun 01 12:33:17 rksaph19 kernel:  __ocfs2_cluster_lock.isra.41+0x3b5/0x820 [ocfs2]
>> Jun 01 12:33:17 rksaph19 kernel:  ? ocfs2_inode_lock_full_nested+0x1fc/0x960 [ocfs2]
>> Jun 01 12:33:17 rksaph19 kernel:  ocfs2_inode_lock_full_nested+0x1fc/0x960 [ocfs2]
>> Jun 01 12:33:17 rksaph19 kernel:  ocfs2_evict_inode+0x18a/0x810 [ocfs2]
>> Jun 01 12:33:17 rksaph19 kernel:  evict+0xca/0x1b0
>> Jun 01 12:33:17 rksaph19 kernel:  ocfs2_orphan_filldir+0x92/0x140 [ocfs2]
>> Jun 01 12:33:17 rksaph19 kernel:  ocfs2_dir_foreach_blk+0x4b2/0x570 [ocfs2]
>> Jun 01 12:33:17 rksaph19 kernel:  ? ocfs2_inode_lock_full_nested+0x487/0x960 [ocfs2]
>> Jun 01 12:33:17 rksaph19 kernel:  ocfs2_dir_foreach+0x54/0x80 [ocfs2]
>> Jun 01 12:33:17 rksaph19 kernel:  ocfs2_queue_orphans+0xf2/0x1f0 [ocfs2]
>> Jun 01 12:33:17 rksaph19 kernel:  ? __ocfs2_recovery_map_test+0x50/0x50 [ocfs2]
>> Jun 01 12:33:17 rksaph19 kernel:  ? chacha_block_generic+0x6c/0xb0
>> Jun 01 12:33:17 rksaph19 kernel:  ? ocfs2_recover_orphans+0x12d/0x4f0 [ocfs2]
>> Jun 01 12:33:17 rksaph19 kernel:  ocfs2_recover_orphans+0x12d/0x4f0 [ocfs2]
>> Jun 01 12:33:17 rksaph19 kernel:  ? internal_add_timer+0x4e/0x70
>> Jun 01 12:33:17 rksaph19 kernel:  ocfs2_complete_recovery+0x19a/0x450 [ocfs2]
>> Jun 01 12:33:17 rksaph19 kernel:  ? queue_delayed_work_on+0x2a/0x40
>> Jun 01 12:33:17 rksaph19 kernel:  ? ocfs2_orphan_scan_work+0x110/0x2c0 [ocfs2]
>> Jun 01 12:33:17 rksaph19 kernel:  process_one_work+0x1f4/0x3e0
>> Jun 01 12:33:17 rksaph19 kernel:  worker_thread+0x2d/0x3e0
>> Jun 01 12:33:17 rksaph19 kernel:  ? process_one_work+0x3e0/0x3e0
>> Jun 01 12:33:17 rksaph19 kernel:  kthread+0x10d/0x130
>> Jun 01 12:33:17 rksaph19 kernel:  ? kthread_park+0xa0/0xa0
>> Jun 01 12:33:17 rksaph19 kernel:  ret_from_fork+0x22/0x40
<div><br class="">
</div>
<div>The stack on rksaph19 doesn’t look reasonable. &nbsp;The iput() shouldn’t drop the last reference.</div>
<div>I am wondering if&nbsp;<span style="font-family: &quot;Helvetica Neue&quot;; font-size: 13px;" class="">f5785283dd64867a711ca1fb1f5bb172f252ecdf fixes your problem.</span>&nbsp;</div>
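
That said, the two traces do read like a textbook ABBA ordering: the reflink path holds the target inode lock and waits for orphan_dir_inode, while the recovery path holds orphan_dir_inode and, via evict, waits for an inode lock. For anyone less familiar with the pattern, here is a minimal userspace sketch with pthread mutexes standing in for the two cluster locks. It is only an illustration of the ordering, not an ocfs2 reproducer (build with gcc -pthread):

/*
 * abba_sketch.c - toy illustration of the suspected lock ordering.
 * The two mutexes stand in for the reflink target inode lock and the
 * orphan_dir_inode lock; both threads end up stuck, as in the traces.
 */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t target_inode_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t orphan_dir_lock   = PTHREAD_MUTEX_INITIALIZER;

/* like the reflink path: target inode lock first, then orphan dir lock */
static void *reflink_path(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&target_inode_lock);
	sleep(1);				/* widen the race window */
	pthread_mutex_lock(&orphan_dir_lock);	/* never returns */
	printf("reflink path finished\n");
	pthread_mutex_unlock(&orphan_dir_lock);
	pthread_mutex_unlock(&target_inode_lock);
	return NULL;
}

/* like the recovery path: orphan dir lock first, then an inode lock */
static void *recovery_path(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&orphan_dir_lock);
	sleep(1);
	pthread_mutex_lock(&target_inode_lock);	/* never returns */
	printf("recovery path finished\n");
	pthread_mutex_unlock(&target_inode_lock);
	pthread_mutex_unlock(&orphan_dir_lock);
	return NULL;
}

int main(void)
{
	pthread_t t1, t2;

	pthread_create(&t1, NULL, reflink_path, NULL);
	pthread_create(&t2, NULL, recovery_path, NULL);
	pthread_join(t1, NULL);		/* hangs: both threads are stuck */
	pthread_join(t2, NULL);
	return 0;
}

If that really is the situation here, one of the two paths would have to give up its first lock, or the two paths would have to agree on a consistent lock order, to avoid the hang.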
<div><br class="">
</div>
<div>thanks,</div>
<div>wengang</div>
<br class="">
<blockquote type="cite" class="">
<div class="">
<div class="">
<blockquote type="cite" class=""><br class="">
>> The ocfs2_complete_recovery routine is getting that orphaned file's inode dlm lock after it has acquired the orphan_dir_inode dlm lock.
>> The hang therefore looks like an ABBA deadlock.
>>
>> So far, I cannot reproduce this race condition, but according to the backtraces and the related source code, this problem does exist.
>> The triggering condition is that a node is running the reflink command while another node is fenced/down (which triggers the dlm recovery routine).
>>
>> Any comments?
>>
>> Thanks
>> Gang
>> _______________________________________________
>> Ocfs2-devel mailing list
>> Ocfs2-devel@oss.oracle.com
>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel