[Ocfs2-tools-devel] Re: [Ocfs2-devel] An issue on OCFS2 fsck tool

Gechangwei ge.changwei at h3c.com
Tue Apr 12 20:46:25 PDT 2016


Hi Joseph,

Thanks for your reply!

Based on my test results and code review, I am fairly sure of the following: when the journal needs to be replayed, fsck.ocfs2 closes the OCFS2 file system in order to flush dirty data to disk and re-opens it afterwards, but it does not attach the DLM context object to the newly created file system object.

I agree with you that O2HB is not directly related to journal replay. However, the close-and-reopen step performed during replay drops the DLM context object, leaving the new file system object without a valid DLM context pointer!

In short, at the end of fsck, because "ost->ost_fs->fs_dlm_ctxt" is NULL, the DLM is never shut down, so O2HB keeps heartbeating.


diff --git a/fsck.ocfs2/fsck.c b/fsck.ocfs2/fsck.c
index 344ad78..291248f 100644
--- a/fsck.ocfs2/fsck.c
+++ b/fsck.ocfs2/fsck.c
@@ -485,6 +485,7 @@ static errcode_t maybe_replay_journals(o2fsck_state *ost, char *filename,
 {
        int replayed = 0, should = 0, has_dirty = 0;
        errcode_t ret = 0;
+    void *dlm_domain = NULL;

        ret = o2fsck_should_replay_journals(ost->ost_fs, &should, &has_dirty);
        if (ret)
@@ -517,7 +518,9 @@ static errcode_t maybe_replay_journals(o2fsck_state *ost, char *filename,
         * over */
        if (!replayed)
                goto out;
-
+
+    dlm_domain = ost->ost_fs->fs_dlm_ctxt;
+
        ret = ocfs2_close(ost->ost_fs);
        if (ret) {
                com_err(whoami, ret, "while closing \"%s\"", filename);
@@ -525,7 +528,14 @@ static errcode_t maybe_replay_journals(o2fsck_state *ost, char *filename,
        }

        ret = open_and_check(ost, filename, open_flags, blkno, blksize);
+
+    if(ost->ost_fs != NULL)
+    {
+        ost->ost_fs->fs_dlm_ctxt = dlm_domain;
+    }
+
 out:
+    dlm_domain = NULL;
        return ret;
 }

@@ -1094,14 +1104,16 @@ clear_dirty_flag:
 unlock:
        block_signals(SIG_BLOCK);
        if (ost->ost_fs->fs_dlm_ctxt)
-               ocfs2_release_cluster(ost->ost_fs);
+               if(ocfs2_release_cluster(ost->ost_fs))
+            fprintf(stderr, "release cluster failed.\n");
        cluster_locked = 0;
        block_signals(SIG_UNBLOCK);

 close:
        block_signals(SIG_BLOCK);
        if (ost->ost_fs->fs_dlm_ctxt)
-               ocfs2_shutdown_dlm(ost->ost_fs, whoami);
+               if(ocfs2_shutdown_dlm(ost->ost_fs, whoami))
+            fprintf(stderr, "shutdown dlm failed.\n");
        block_signals(SIG_UNBLOCK);

        ret = ocfs2_close(ost->ost_fs);


Best regards,

Gechangwei
H3C Technologies Co., Limited



-----Original Message-----
From: Joseph Qi [mailto:joseph.qi at huawei.com]
Sent: April 13, 2016 10:48
To: gechangwei 12382 (CCPL)
Cc: ocfs2-tools-devel; guozhonghua 02084 (CCPL)
Subject: Re: [Ocfs2-devel] An issue on OCFS2 fsck tool

Switched to ocfs2-tools-devel.

The o2hb thread is started to perform a safety check. I don't think it has much to do with journal replay. IMO, we first have to figure out whether fsck fails to stop the o2hb thread or does not even try to stop it.

Thanks,
Joseph

On 2016/4/12 10:51, Gechangwei wrote:
> Hi OCFS2 experts,
>
> I encountered an issue while checking an OCFS2 volume with the OCFS2 fsck tool, fsck.ocfs2, and I hope you can help.
>
> I found that whenever the journal holds dirty data, fsck.ocfs2 starts an O2HB thread and tries to replay the journal.
>
> What bothers me is that the O2HB thread is still active even after the file system check and recovery have finished.
>
> I do not think this is reasonable, and I suspect it is a bug in fsck.ocfs2. Do you have any comments on this issue?
>
> I look forward to your help.
>
> Many thanks.
>
> Best regards.
>
> Gechangwei
> H3C Technologies Co., Limited
>