[Ocfs2-devel] [PATCH v2] ocfs2/journal: fix umount hang after flushing journal failure

Joseph Qi jiangqi903 at gmail.com
Tue Jan 10 19:21:09 PST 2017


The patch is malformed...

BTW, I'd like ML_ERROR level log rather ML_KTHREAD, because journal

abort at least should alert filesystem administrator, also this will discard

some transactions.

So how about:

if (status < 0) {

     mlog(ML_ERROR, "journal is already abort and cannot be "

             "flushed any more. So ignore the pending "

             " transactions to avoid blocking ocfs2 unmount.\n");

     atomic_set(&journal->j_num_trans, 0);

}

Thanks,
Joseph

On 17/1/11 10:16, Gechangwei wrote:
> Hi,
>
> As prior e-mail described, umount would hang after journal flushing failure.
>
> When journal flushing in ocfs2_commit_cache() fails, following umount
> procedure at stage of shutting journal may be blocked due to non-zero
> transactions number.
>
> Once jbd2_journal_flush() fails, the journal will be marked as ABORT
> state within JBD2. There is no way to come back afterwards.
>
> So there is no chance to set transactions number to zero, thus, shutting
> journal may be blocked .
>
> ocfs2_commit_thread()
>
>      ocfs2_commit_cache()
>
>          jbd2_journal_flush() -> failure takes journal into ABORT state,
> thus, transaction number will never set to zero
>
>
>
> The back trace is cited blow:
>
> [<ffffffff8109de0c>] kthread_stop+0x4c/0x150
> [<ffffffffc056a887>] ocfs2_journal_shutdown+0xa7/0x400 [ocfs2]
> [<ffffffffc059fbde>] ocfs2_dismount_volume+0xbe/0x4a0 [ocfs2]
> [<ffffffffc059fff7>] ocfs2_put_super+0x37/0xb0 [ocfs2]
> [<ffffffff8120534e>] generic_shutdown_super+0x7e/0x110
> [<ffffffff81205410>] kill_block_super+0x30/0x80
> [<ffffffff81205669>] deactivate_locked_super+0x59/0x90
> [<ffffffff812062ee>] deactivate_super+0x4e/0x70
> [<ffffffff81222423>] cleanup_mnt+0x43/0x90
> [<ffffffff812224c2>] __cleanup_mnt+0x12/0x20
> [<ffffffff8109c247>] task_work_run+0xb7/0xf0
> [<ffffffff81016f7c>] do_notify_resume+0x8c/0xa0
> [<ffffffff817f8384>] int_signal+0x12/0x17
> [<ffffffffffffffff>] 0xffffffffffffffff
>
>
> To solve this issue, I propose an improved version of patch referring to
> Andrew's and Joseph's  comments.
>
> Any more comments is welcomed as always.
>
>
>  From f3315307dd2a30da138383340689a81708134a44 Mon Sep 17 00:00:00 2001
> From: Changwei Ge <ge.changwei at h3c.com>
> Date: Wed, 11 Jan 2017 09:05:35 +0800
> Subject: [PATCH v2] fix umount hang after journal flushing failure
>
> Signed-off-by: Changwei Ge <ge.changwei at h3c.com>
> ---
>   fs/ocfs2/journal.c |   18 ++++++++++++++++++
>   1 file changed, 18 insertions(+)
>
> diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
> index a244f14..9a6b234 100644
> --- a/fs/ocfs2/journal.c
> +++ b/fs/ocfs2/journal.c
> @@ -2315,6 +2315,24 @@ static int ocfs2_commit_thread(void *arg)
>                               "commit_thread: %u transactions pending on "
>                               "shutdown\n",
>                               atomic_read(&journal->j_num_trans));
> +
> +                       if (status < 0) {
> +                               mlog(ML_KTHREAD,
> +                                    "Although %u transactions left,
> journal must be shut down right now.\n",
> +                                    atomic_read(&journal->j_num_trans));
> +                               /*
> +                                * This may a litte hacky, however, no
> chance
> +                                * for ocfs2/journal to decrease this
> variable
> +                                * thourgh commit-thread. I have to do so to
> +                                * avoid umount hang after journal flushing
> +                                * failure. Since jounral has been
> marked ABORT
> +                                * within jbd2_journal_flush, commit
> cache will
> +                                * never do any real work to flush
> journal to
> +                                * disk.Set t to ZERO so that umount will
> +                                * continue during shutting downjournal
> +                                */
> +                               atomic_set(&journal->j_num_trans, 0);
> +                       }
>                  }
>          }
>
> --
> 1.7.9.5
>
>
> Thanks.
>
> Br.
>
> Changwei
>
>
>
> -------------------------------------------------------------------------------------------------------------------------------------
> 本邮件及其附件含有杭州华三通信技术有限公司的保密信息,仅限于发送给上面地址中列出
> 的个人或群组。禁止任何其他人以任何形式使用(包括但不限于全部或部分地泄露、复制、
> 或散发)本邮件中的信息。如果您错收了本邮件,请您立即电话或邮件通知发件人并删除本
> 邮件!
> This e-mail and its attachments contain confidential information from H3C, which is
> intended only for the person or entity whose address is listed above. Any use of the
> information contained herein in any way (including, but not limited to, total or partial
> disclosure, reproduction, or dissemination) by persons other than the intended
> recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender
> by phone or email immediately and delete it!




More information about the Ocfs2-devel mailing list