[Ocfs2-devel] [PATCH] ocfs2: fix defrag path triggering jbd2 ASSERT
Joseph Qi
joseph.qi at linux.alibaba.com
Sun Feb 19 11:12:50 UTC 2023
On 2/17/23 8:37 AM, Heming Zhao wrote:
> code path:
>
> ocfs2_ioctl_move_extents
> ocfs2_move_extents
> ocfs2_defrag_extent
> __ocfs2_move_extent
> + ocfs2_journal_access_di
> + ocfs2_split_extent //sub-paths call jbd2_journal_restart
> + ocfs2_journal_dirty //crash by jbd2 ASSERT
>
> crash stacks:
>
> PID: 11297 TASK: ffff974a676dcd00 CPU: 67 COMMAND: "defragfs.ocfs2"
> #0 [ffffb25d8dad3900] machine_kexec at ffffffff8386fe01
> #1 [ffffb25d8dad3958] __crash_kexec at ffffffff8395959d
> #2 [ffffb25d8dad3a20] crash_kexec at ffffffff8395a45d
> #3 [ffffb25d8dad3a38] oops_end at ffffffff83836d3f
> #4 [ffffb25d8dad3a58] do_trap at ffffffff83833205
> #5 [ffffb25d8dad3aa0] do_invalid_op at ffffffff83833aa6
> #6 [ffffb25d8dad3ac0] invalid_op at ffffffff84200d18
> [exception RIP: jbd2_journal_dirty_metadata+0x2ba]
> RIP: ffffffffc09ca54a RSP: ffffb25d8dad3b70 RFLAGS: 00010207
> RAX: 0000000000000000 RBX: ffff9706eedc5248 RCX: 0000000000000000
> RDX: 0000000000000001 RSI: ffff97337029ea28 RDI: ffff9706eedc5250
> RBP: ffff9703c3520200 R8: 000000000f46b0b2 R9: 0000000000000000
> R10: 0000000000000001 R11: 00000001000000fe R12: ffff97337029ea28
> R13: 0000000000000000 R14: ffff9703de59bf60 R15: ffff9706eedc5250
> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> #7 [ffffb25d8dad3ba8] ocfs2_journal_dirty at ffffffffc137fb95 [ocfs2]
> #8 [ffffb25d8dad3be8] __ocfs2_move_extent at ffffffffc139a950 [ocfs2]
> #9 [ffffb25d8dad3c80] ocfs2_defrag_extent at ffffffffc139b2d2 [ocfs2]
>
> Analysis
>
> This bug has the same root cause as commit 7f27ec978b0e ("ocfs2: call
> ocfs2_journal_access_di() before ocfs2_journal_dirty() in ocfs2_write_end_nolock()").
> In this case, jbd2_journal_restart() is called in sub-paths of
> ocfs2_split_extent() during defragmentation, between the caller's
> ocfs2_journal_access_di() and ocfs2_journal_dirty().
>
> How to fix
>
> ocfs2_split_extent() handles its journal operations entirely by itself,
> so the caller doesn't need the journal access/dirty pair; it only needs
> the journal start/stop pair. The fix is to remove the journal
> access/dirty calls from __ocfs2_move_extent().
>
> The discussion for this patch:
> https://oss.oracle.com/pipermail/ocfs2-devel/2023-February/000647.html
>
> Signed-off-by: Heming Zhao <heming.zhao at suse.com>
Reviewed-by: Joseph Qi <joseph.qi at linux.alibaba.com>
> ---
> v1 -> v2:
> - no code changes.
> - change patch subject from "ocfs2: fix J_ASSERT_JH in defragment path"
> to "ocfs2: fix defrag path triggering jbd2 ASSERT"
> - rewrite/polish commit log
>
> v1: https://oss.oracle.com/pipermail/ocfs2-devel/2022-May/000101.html
>
> ---
> fs/ocfs2/move_extents.c | 10 ----------
> 1 file changed, 10 deletions(-)
>
> diff --git a/fs/ocfs2/move_extents.c b/fs/ocfs2/move_extents.c
> index 192cad0662d8..6251748c695b 100644
> --- a/fs/ocfs2/move_extents.c
> +++ b/fs/ocfs2/move_extents.c
> @@ -105,14 +105,6 @@ static int __ocfs2_move_extent(handle_t *handle,
> */
> replace_rec.e_flags = ext_flags & ~OCFS2_EXT_REFCOUNTED;
>
> - ret = ocfs2_journal_access_di(handle, INODE_CACHE(inode),
> - context->et.et_root_bh,
> - OCFS2_JOURNAL_ACCESS_WRITE);
> - if (ret) {
> - mlog_errno(ret);
> - goto out;
> - }
> -
> ret = ocfs2_split_extent(handle, &context->et, path, index,
> &replace_rec, context->meta_ac,
> &context->dealloc);
> @@ -121,8 +113,6 @@ static int __ocfs2_move_extent(handle_t *handle,
> goto out;
> }
>
> - ocfs2_journal_dirty(handle, context->et.et_root_bh);
> -
> context->new_phys_cpos = new_p_cpos;
>
> /*
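
The failure described in the Analysis section above can be sketched with a toy model. Nothing below is jbd2 or ocfs2 code: every toy_* name is hypothetical, the transaction is reduced to an integer id, and the restart is made unconditional for illustration (per the code path above, the real restart happens only in some sub-paths of ocfs2_split_extent()). The model assumes only what the commit log states: the caller declares access, a restart happens underneath it, and the caller's later dirty call trips an assertion.

```c
/* Toy model only: hypothetical stand-ins, not the jbd2 or ocfs2 API. */
#include <assert.h>
#include <stdbool.h>

struct toy_handle { int txn_id; };        /* transaction the handle currently belongs to */
struct toy_buffer { int owner_txn_id; };  /* transaction the buffer was declared to; 0 = none */

static void toy_start(struct toy_handle *h) { h->txn_id = 1; }

/* Stand-in for a "journal access" call: tie the buffer to the current transaction. */
static void toy_access(const struct toy_handle *h, struct toy_buffer *b)
{
    b->owner_txn_id = h->txn_id;
}

/* Stand-in for a journal restart: the handle moves on to a new transaction. */
static void toy_restart(struct toy_handle *h) { h->txn_id++; }

/* Stand-in for the check that fails in the crash: the buffer must still
 * belong to the handle's current transaction when it is marked dirty. */
static bool toy_dirty_ok(const struct toy_handle *h, const struct toy_buffer *b)
{
    return b->owner_txn_id == h->txn_id;
}

/* Stand-in for the split helper: it restarts the handle and then does its
 * own access/dirty on the buffers it touches, so its own pair is consistent. */
static void toy_split_extent(struct toy_handle *h)
{
    struct toy_buffer leaf = { 0 };

    toy_restart(h);
    toy_access(h, &leaf);
    assert(toy_dirty_ok(h, &leaf));
}

/* Shape of the code before the patch: access, split (restart), dirty. */
bool toy_move_extent_old(void)
{
    struct toy_handle h;
    struct toy_buffer root = { 0 };

    toy_start(&h);
    toy_access(&h, &root);          /* root is tied to transaction 1 */
    toy_split_extent(&h);           /* handle is now on transaction 2 */
    return toy_dirty_ok(&h, &root); /* stale: this is where the model "asserts" */
}

/* Shape of the code after the patch: the caller only starts the handle and
 * leaves all access/dirty work to the split helper. */
bool toy_move_extent_new(void)
{
    struct toy_handle h;

    toy_start(&h);
    toy_split_extent(&h);
    return true;                    /* no caller-side dirty left to fail */
}
```

In this model toy_move_extent_old() returns false, which corresponds to the assertion in the crash stack, while toy_move_extent_new() returns true because the caller no longer holds an access declaration across the restart.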