[Ocfs2-devel] [PATCH] ocfs2: limit printk when journal is aborted
Joseph Qi
joseph.qi at huawei.com
Mon Apr 21 18:08:22 PDT 2014
On 2014/4/22 4:51, Mark Fasheh wrote:
> On Mon, Apr 21, 2014 at 12:18:24PM -0700, Andrew Morton wrote:
>> On Fri, 18 Apr 2014 17:18:27 +0800 Joseph Qi <joseph.qi at huawei.com> wrote:
>>
>>>>>>> + if (printk_timed_ratelimit(&abort_warn_time, 60*HZ))
>>>>>>> + mlog(ML_ERROR, "status = %d, journal is "
>>>>>>> + "already aborted.\n", status);
>>>>>>> + msleep_interruptible(1000);
>>>>>>> + }
>>>>>>
>>>>>> Why the msleep? ocfs2_commit_thread will wait on the checkpoint_event queue
>>>>>> right after this anyway - is there a problem with it waiting on that?
>>>>>>
>>>>> Since jbd2 is already aborted, commit cache is meaningless.
>>>>
>>>> I understand that, but I'm asking why the msleep and whether we can avoid
>>>> that. To go back to my question:
>>>>
>>>> "ocfs2_commit_thread will wait on the checkpoint_event queue right after
>>>> this anyway - is there a problem with it waiting on that?"
>>>>
>>>> Thanks,
>>>> --Mark
>>> Sorry for my obscure description.
>>> If ocfs2_commit_cache fails because of JBD2_ABORT, j_num_trans won't be cleared.
>>> Then the condition of checkpoint event still evaluates true, so it won't wait.
>>
>> If Mark didn't understand the reason for the msleep then nobody weill,
>> so we need to add a comment. This?
>>
>> --- a/fs/ocfs2/journal.c~ocfs2-limit-printk-when-journal-is-aborted-fix
>> +++ a/fs/ocfs2/journal.c
>> @@ -2193,6 +2193,11 @@ static int ocfs2_commit_thread(void *arg
>> if (printk_timed_ratelimit(&abort_warn_time, 60*HZ))
>> mlog(ML_ERROR, "status = %d, journal is "
>> "already aborted.\n", status);
>> + /*
>> + * After ocfs2_commit_cache() fails, j_num_trans has a
>> + * non-zero value. Sleep here to avoid a busy-wait
>> + * loop.
>> + */
>> msleep_interruptible(1000);
>> }
>>
>>
>> This patch seems rather hacky :( Isn't there a better solution?
>
> Right, that's what I was getting at with my followup later on in the mail
> thread about this.
>
>
>> Why even keep the kernel thread running after an abort?
>
> The msleep is papering over the real issue. Either the thread should shut
> down or we need to re-evaluate usage of j_num_trans which is the condition
> that keeps it from sleeping (and from a quick glance it doesn't seem like
> j_num_trans does anything for us).
> --Mark
>
AFAIK, the commit thread ends only if dismounting volume. Journal abort
is different from journal shutdown, that's why leaves it running.
> --
> Mark Fasheh
>
> .
>
More information about the Ocfs2-devel
mailing list