[Ocfs2-devel] discuss about jbd2 assertion in defragment path

Heming Zhao heming.zhao at suse.com
Tue Feb 14 04:33:44 UTC 2023


On Tue, Feb 14, 2023 at 10:52:30AM +0800, Joseph Qi wrote:
> Hi, Sorry about the late reply.
> This thread is indeed a long time ago:(
> It seems that I said the two ocfs2_journal_access_di() are for different
> buffer head. Anyway, I have to recall the discussion before and get back
> to you.
> 

If you belive from [a] to [b.1] doesn't need any journal protection, my patch
makes sense from theory.

>From code logic, if the defrag path (ocfs2_split_extent) doesn't call
jbd2_journal_restart(), current code is absolute correct. Increasing credits
can't help avoid jbd2 assert crash, but make more easily to trigger crash,
because it makes jbd2_journal_extend() more easily to return "status > 0". In my
view, the correct fix should delete the access/dirty pair and make sure the
ocfs2_split_extent() is journal self-service.

Different buffer head issue could be handled by rechecking ocfs2_split_extent(),
make sure all journal handle branches are correct. I already checked
ocfs2_split_extent, but I can't assure my eyes didn't miss any corner.

Thanks,
Heming

> 
> On 2/10/23 6:04 PM, Heming Zhao wrote:
> > Hello Joseph,
> > 
> > I am sorry to wake up a long time ago thread.
> > 
> > All mails of this thread (my patch is [1]):
> > [1] https://oss.oracle.com/pipermail/ocfs2-devel/2022-May/000101.html
> > [2] https://oss.oracle.com/pipermail/ocfs2-devel/2022-June/000105.html
> > [3] https://oss.oracle.com/pipermail/ocfs2-devel/2022-June/000109.html
> > [4] https://oss.oracle.com/pipermail/ocfs2-devel/2022-June/000217.html
> > 
> > I re-checked ocfs2 defragmentation & jbd2 flow recently, I still think my
> > patch [1] is right. At least, the fixing code is correct, the patch commit
> > log needs to polish.
> > 
> > This bug has the same root cause of commit 7f27ec978b0e ("ocfs2: call
> > ocfs2_journal_access_di() before ocfs2_journal_dirty() in ocfs2_write_end_nolock()").
> > For this bug, jbd2_journal_restart() is called by ocfs2_split_extent() during
> > defragmenting, and it's not about "not enough credits" issue you ever said in [2].
> > 
> > I explain my thinking again in this mail.
> > 
> > the crash call flow:
> > 
> > ocfs2_defrag_extent //caller call it in while() loop.
> >  + handle = ocfs2_start_trans(osb, credits)
> >  + __ocfs2_move_extent
> >  |  + ocfs2_journal_access_di //[a]
> >  |  + ocfs2_split_extent      //[b]
> >  |  |  + if                   //[b.1]
> >  |  |  |   ocfs2_replace_extent_rec/ocfs2_split_and_insert
> >  |  |  + else
> >  |  |     ocfs2_try_to_merge_extent
> >  |  |
> >  |  + ocfs2_journal_dirty     //[c]
> >  |
> >  + ocfs2_commit_trans(osb, handle) //<== complete this handle
> > 
> > In my viewpoint, ocfs2_split_extent() is journal self-service function. I still
> > belive the two lines ([a] & [c]) in __ocfs2_move_extent() are totally useless.
> > In ocfs2_split_extent(), the code from the first code line to "if-else" code
> > area ([b.1]) doesn't need any journal protection, and we also could see there
> > are only read operations.
> > 
> > If we worry about data corruption after removing [a] & [c], (e.g: my eyes missed
> > some journal operations from [a] to [b.1]), we could only delete [c]. So the
> > fixed code seems (only remove line [c]):
> > 
> > ocfs2_defrag_extent
> >  + handle = ocfs2_start_trans(osb, credits)
> >  + __ocfs2_move_extent
> >  |  + ocfs2_journal_access_di //[a]  <-- keep it, but remove pair dirty action
> >  |  + ocfs2_split_extent      //[b]
> >  |     + if                   //[b.1]
> >  |     |   ocfs2_replace_extent_rec/ocfs2_split_and_insert
> >  |     + else
> >  |        ocfs2_try_to_merge_extent
> >  |
> >  + ocfs2_commit_trans(osb, handle)
> > 
> > Thanks,
> > Heming



More information about the Ocfs2-devel mailing list