[Ocfs2-devel] discuss about jbd2 assertion in defragment path
Heming Zhao
heming.zhao at suse.com
Tue Feb 14 11:48:29 UTC 2023
On Tue, Feb 14, 2023 at 07:08:16PM +0800, Joseph Qi wrote:
> Hi,
>
> On 2/14/23 12:33 PM, Heming Zhao wrote:
> > On Tue, Feb 14, 2023 at 10:52:30AM +0800, Joseph Qi wrote:
> >> Hi, Sorry about the late reply.
> >> This thread is indeed a long time ago:(
> >> It seems that I said the two ocfs2_journal_access_di() are for different
> >> buffer head. Anyway, I have to recall the discussion before and get back
> >> to you.
> >>
> > If you belive from [a] to [b.1] doesn't need any journal protection, my patch
> > makes sense from theory.
> >
> > From code logic, if the defrag path (ocfs2_split_extent) doesn't call
> > jbd2_journal_restart(), current code is absolute correct. Increasing credits
> > can't help avoid jbd2 assert crash, but make more easily to trigger crash,
> > because it makes jbd2_journal_extend() more easily to return "status > 0". In my
> > view, the correct fix should delete the access/dirty pair and make sure the
> > ocfs2_split_extent() is journal self-service.
> >
> > Different buffer head issue could be handled by rechecking ocfs2_split_extent(),
> > make sure all journal handle branches are correct. I already checked
> > ocfs2_split_extent, but I can't assure my eyes didn't miss any corner.
>
> ocfs2_split_extent are not just 'read' operations, it may grow the tree,
> that's why we have to do a transaction here.
You misunderstand my meaning, my mean: from the caller __ocfs2_move_extent()
calling ocfs2_journal_access_di() (below [a]) to (before) the handling
ctxt.c_contig_type (below [b.1]) in ocfs2_split_extent, this area is purely read
operations. It's the range [ [a], [b.1]) (include [a], not include [b.1])
__ocfs2_move_extent
+ ocfs2_journal_access_di //[a] <------+
+ ocfs2_split_extent //[b] | [a, b.1) area is read operations
| + if //[b.1] <------+
| | A
| | +---- from this line, this func starts r/w IOs & needs jbd2 operations
| |
| | ocfs2_replace_extent_rec()/ocfs2_split_and_insert()
| + else
| ocfs2_try_to_merge_extent ()
|
+ ocfs2_journal_dirty //[c]
>
> The whole code flow:
> ocfs2_ioctl_move_extents
> ocfs2_move_extents
> __ocfs2_move_extents_range
> ocfs2_defrag_extent [1]
> ocfs2_start_trans [a]
> __ocfs2_move_extent
> ocfs2_journal_access_di
> ocfs2_split_extent
> ocfs2_journal_dirty
> ocfs2_commit_trans
> ocfs2_move_extent [2]
> ocfs2_start_trans
> __ocfs2_move_extent
> ocfs2_journal_access_di
> ocfs2_split_extent
> ocfs2_journal_dirty
> ocfs2_commit_trans
> ocfs2_start_trans [b]
> ocfs2_journal_access_di
> ocfs2_journal_dirty
> ocfs2_commit_trans
>
> In above, [a] and [b] are different transaction start/commit pair, and
> each allocates different journal credits, even they are operate the same
> bh, they are still different operations.
I agree above writing, but it can't change my conclusion: ocfs2_split_extent()
is journal self-service.
It seems the author meaning: he wanted [a] to protect defragging actions (extents
changes), wanted [b] to protect inode ctime changes. the
ocfs2_start_trans/ocfs2_commit_trans in [a] area is necessary. but the
access/dirty pair in __ocfs2_move_extent() is potential crash risk.
>
> My another question is, __ocfs2_move_extent will be called by both
> ocfs2_defrag_extent and ocfs2_move_extent, why this doesn't happen on the
> other path?
The reason is userspace defragfs.ocfs2 never trigger ocfs2_move_extent().
see do_file_defrag():
struct ocfs2_move_extents me = {
.me_start = 0,
.me_len = buf->st_size,
.me_flags = OCFS2_MOVE_EXT_FL_AUTO_DEFRAG, //<-- It makes do_defrag becomes 1
//in kernel space __ocfs2_move_extents_range()
};
Thanks,
Heming
More information about the Ocfs2-devel
mailing list