[Ocfs2-devel] [PATCH 1/2] ocfs2: fix jbd2 assertion in defragment path
Heming Zhao
heming.zhao at suse.com
Sat Jun 4 00:08:28 UTC 2022
On Thu, Jun 02, 2022 at 05:34:18PM +0800, Joseph Qi wrote:
> Hi,
>
> Sorry for the late response since I was busy on other things...
>
> On 5/21/22 6:14 PM, Heming Zhao wrote:
> > defragfs triggered jbd2 report ASSERT when working.
> >
> > code path:
> >
> > ocfs2_ioctl_move_extents
> > ocfs2_move_extents
> > __ocfs2_move_extents_range
> > ocfs2_defrag_extent
> > __ocfs2_move_extent
> > + ocfs2_journal_access_di
> > + ocfs2_split_extent //[1]
> > + ocfs2_journal_dirty //crash
> >
> > [1]: ocfs2_split_extent called ocfs2_extend_trans, which committed
> > dirty buffers then restarted the transaction. The changed transaction
> > triggered jbd2 ASSERT.
> >
> > crash stacks:
> >
> > PID: 11297 TASK: ffff974a676dcd00 CPU: 67 COMMAND: "defragfs.ocfs2"
> > #0 [ffffb25d8dad3900] machine_kexec at ffffffff8386fe01
> > #1 [ffffb25d8dad3958] __crash_kexec at ffffffff8395959d
> > #2 [ffffb25d8dad3a20] crash_kexec at ffffffff8395a45d
> > #3 [ffffb25d8dad3a38] oops_end at ffffffff83836d3f
> > #4 [ffffb25d8dad3a58] do_trap at ffffffff83833205
> > #5 [ffffb25d8dad3aa0] do_invalid_op at ffffffff83833aa6
> > #6 [ffffb25d8dad3ac0] invalid_op at ffffffff84200d18
> > [exception RIP: jbd2_journal_dirty_metadata+0x2ba]
> > RIP: ffffffffc09ca54a RSP: ffffb25d8dad3b70 RFLAGS: 00010207
> > RAX: 0000000000000000 RBX: ffff9706eedc5248 RCX: 0000000000000000
> > RDX: 0000000000000001 RSI: ffff97337029ea28 RDI: ffff9706eedc5250
> > RBP: ffff9703c3520200 R8: 000000000f46b0b2 R9: 0000000000000000
> > R10: 0000000000000001 R11: 00000001000000fe R12: ffff97337029ea28
> > R13: 0000000000000000 R14: ffff9703de59bf60 R15: ffff9706eedc5250
> > ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> > #7 [ffffb25d8dad3ba8] ocfs2_journal_dirty at ffffffffc137fb95 [ocfs2]
> > #8 [ffffb25d8dad3be8] __ocfs2_move_extent at ffffffffc139a950 [ocfs2]
> > #9 [ffffb25d8dad3c80] ocfs2_defrag_extent at ffffffffc139b2d2 [ocfs2]
> >
> > The fix method is simple, ocfs2_split_extent includes three paths. all
> > the paths already have journal access/dirty pair. We only need to
> > remove journal access/dirty from __ocfs2_move_extent.
> >
>
> I am not sure what you mean by "all three paths have journal access/
> dirty pair".
I am a newbie for ocfs2 & jbd2, I can't make sure this [1/2] patch is correct.
Below is my analysis.
All three paths (below 1.[123]):
__ocfs2_move_extent
+ ocfs2_journal_access_di
|
+ ocfs2_split_extent //[1]
| + ocfs2_replace_extent_rec //[1.1]
| + ocfs2_split_and_insert //[1.2]
| + ocfs2_try_to_merge_extent //[1.3]
|
+ + ocfs2_journal_dirty //crash
All three paths have itselves journal access/dirty pair.
So the access/dirty pair in caller __ocfs2_move_extent is unnecessary.
>
> It seems we can't do it in your way, as journal access has different
> ocfs2_trigger with different offset, e.g. dinode is different from
> extent block.
There are 3 ocfs2_split_extent callers:
fs/ocfs2/alloc.c <<ocfs2_change_extent_flag>>
fs/ocfs2/move_extents.c <<__ocfs2_move_extent>>
fs/ocfs2/refcounttree.c <<ocfs2_clear_ext_refcount>>
We can see, except defrag flow (caller __ocfs2_move_extent), other two
don't use jbd2 access/dirty pair around ocfs2_split_extent.
In my view, in defrag code flow, the struct ocfs2_triggers is changed many
times. The outermost ocfs2_triggers (belongs to __ocfs2_move_extent) must be
overwritten in sub-routine ocfs2_split_extent.
In another perspective, why ocfs2_change_extent_flag & ocfs2_clear_ext_refcount
don't use journal access/dirty pair to wrap ocfs2_split_extent?
>
> Or we may reserve enough credits to resolve this issue?
>
In my view, credits is not the key for this crash issue. The crash is
triggered by transaction changed:
crash code:
```jbd2_journal_dirty_metadata()
if (data_race(jh->b_transaction != transaction &&
jh->b_next_transaction != transaction)) {
spin_lock(&jh->b_state_lock);
J_ASSERT_JH(jh, jh->b_transaction == transaction ||
jh->b_next_transaction == transaction);
spin_unlock(&jh->b_state_lock);
}
```
why transaction is changed?
ocfs2_split_extent
ocfs2_split_and_insert
ocfs2_do_insert_extent
ocfs2_insert_path
ocfs2_extend_trans //change transaction, and pls note this func comments
Thanks,
Heming
More information about the Ocfs2-devel
mailing list