[Ocfs2-devel] ocfs2 cannot continue when JBD2 has aborted the journal,

Zhangguanghui zhang.guanghui at h3c.com
Thu Dec 17 20:58:39 PST 2015


Hi Joseph,

The following sequence can cause a deadlock across the cluster.

Node A:
    holds the super lock at EX
    ocfs2_commit_thread
      ocfs2_commit_cache
        jbd2_journal_flush fails with -EIO because the journal has been aborted
    so the commit thread does not wake_up(&osb->dc_event)
    and the lock is never downconverted (EX->NL)

Node B, Node C:
    (waiting on Node A)

If Node B then requests the EX or PR lock, the request can never be granted,
and the nodes may hang. After Node A is reset, Node B and Node C return to
normal.
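
To make the sequence above concrete, here is a minimal user-space sketch in C. It is only a model under stated assumptions, not the kernel code: every model_* name is hypothetical, model_flush stands in for jbd2_journal_flush, and the abort_on_failure branch assumes that aborting Node A frees its lock for the other nodes in the same way a reset does.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model types; these are not the kernel structures. */
struct model_journal {
    bool aborted;            /* stands in for the JBD2 aborted state */
};

struct model_node {
    struct model_journal journal;
    bool holds_ex;           /* node holds the super lock at EX */
    bool dc_woken;           /* downconvert thread has been woken */
    bool fenced;             /* node took itself out (assumed effect of an abort) */
};

/* Models jbd2_journal_flush: fails with -EIO once the journal is aborted. */
int model_flush(const struct model_journal *j)
{
    return j->aborted ? -EIO : 0;
}

/* Models one pass of the commit path. With abort_on_failure set, an
 * aborted journal fences the node instead of leaving it holding EX. */
int model_commit_cache(struct model_node *n, bool abort_on_failure)
{
    int status = model_flush(&n->journal);

    if (status < 0) {
        if (abort_on_failure && n->journal.aborted) {
            n->fenced = true;
            n->holds_ex = false;   /* assumption: peers can recover the lock */
        }
        return status;             /* the wake-up below is skipped */
    }
    n->dc_woken = true;            /* models wake_up(&osb->dc_event) */
    return 0;
}

/* Models the downconvert thread: it only acts after being woken. */
void model_downconvert(struct model_node *n)
{
    if (n->dc_woken) {
        n->holds_ex = false;       /* EX->NL */
        n->dc_woken = false;
    }
}

/* Node B can be granted EX or PR only when Node A no longer holds EX. */
bool model_peer_can_lock(const struct model_node *a)
{
    return !a->holds_ex;
}

/* Walks through both behaviors for a node whose journal is aborted. */
void model_selfcheck(void)
{
    struct model_node stuck = { .journal = { .aborted = true }, .holds_ex = true };
    struct model_node fixed = stuck;

    /* Without the abort: flush fails, nobody is woken, EX is kept. */
    assert(model_commit_cache(&stuck, false) == -EIO);
    model_downconvert(&stuck);
    assert(!model_peer_can_lock(&stuck));

    /* With the abort: the node is fenced and its peers can proceed. */
    assert(model_commit_cache(&fixed, true) == -EIO);
    model_downconvert(&fixed);
    assert(model_peer_can_lock(&fixed));
}
```

Calling model_selfcheck() runs both cases. The first case is the hang described above; the second shows why aborting on a failed flush would let Node B and Node C make progress, as long as the assumption about fencing holds.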
Thanks a lot
________________________________
zhangguanghui

From: Joseph Qi <joseph.qi at huawei.com>
Date: 2015-12-18 09:05
To: zhangguanghui 10102 (CCPL) <zhang.guanghui at h3c.com>
CC: ocfs2-devel at oss.oracle.com
Subject: Re: [Ocfs2-devel] ocfs2 cannot continue when JBD2 has aborted the journal,

Hi Guanghui,
Could you please describe the problem you encountered more specifically?
I don't think this change is the right way to handle it.

On 2015/12/17 13:33, Zhangguanghui wrote:
> Hi all,
>
> There is a small race in which jbd2_journal_flush is called after JBD2 has
> aborted the journal. This can happen with an unstable storage link under
> I/O stress. Once the journal is aborted, the flush fails with -EIO, which
> may cause all cluster nodes to hang. So I think that when JBD2 has aborted
> the journal, ocfs2 cannot continue and should trigger ocfs2_abort.
>
> Thanks. Any ideas about this patch?
>
> description:
>
> ocfs2_commit_thread
>   ocfs2_commit_cache
>     jbd2_journal_flush
>
>
> --- journal.c 2015-12-17 11:36:39.140542941 +0800
> +++ journal.c.diff 2015-12-17 11:39:21.308542922 +0800
> @@ -328,6 +328,9 @@
>      if (status < 0) {
>          up_write(&journal->j_trans_barrier);
>          mlog_errno(status);
> +        if (is_journal_aborted(journal)) {
> +            ocfs2_abort(osb->sb, "Detect aborted journal,while committing cache.");
> +        }
>          goto finally;
>      }
> zhangguanghui
>
>
> _______________________________________________
> Ocfs2-devel mailing list
> Ocfs2-devel at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>

