[Ocfs2-devel] ocfs2: Question for ocfs2_recovery_thread
Sunil Mushran
sunil.mushran at gmail.com
Wed May 22 16:00:12 PDT 2013
True. The function could do with a little bit of cleanup. Feel free to send
a patch.
On Sun, May 19, 2013 at 7:49 PM, Joseph Qi <joseph.qi at huawei.com> wrote:
> On 2013/5/19 10:25, Joseph Qi wrote:
> > On 2013/5/18 21:26, Sunil Mushran wrote:
> >> The first node that gets the lock will do the actual recovery. The
> >> others will get the lock and see a clean journal and skip the recovery. A
> >> thread should never error out if it fails to get the lock. It should try
> >> and try again.
> >>
> >> On May 17, 2013, at 11:27 PM, Joseph Qi <joseph.qi at huawei.com> wrote:
> >>
> >>> Hi,
> >>> Once a node goes down in the cluster, ocfs2_recovery_thread is
> >>> triggered on each node. These threads then recover the down node after
> >>> taking the super lock.
> >>> I have several questions on this:
> >>> 1) Why does each node have to run such a thread? We know that, in the
> >>> end, one node gets the super lock and does the actual recovery.
> >>> 2) If this thread is running and an error occurs, for example
> >>> ocfs2_super_lock fails, the thread will exit without clearing the
> >>> recovery map. Will that leave other threads waiting for recovery in
> >>> ocfs2_wait_for_recovery?
> >>>
> >>
> >>
> > But when an error occurs, the code jumps to bail and the restart logic
> > does not run. The code looks like this:
> > ...
> > 	status = ocfs2_wait_on_mount(osb);
> > 	if (status < 0) {
> > 		goto bail;
> > 	}
> >
> > 	rm_quota = kzalloc(osb->max_slots * sizeof(int), GFP_NOFS);
> > 	if (!rm_quota) {
> > 		status = -ENOMEM;
> > 		goto bail;
> > 	}
> > restart:
> > 	status = ocfs2_super_lock(osb, 1);
> > 	if (status < 0) {
> > 		mlog_errno(status);
> > 		goto bail;
> > 	}
> > ...
> > 	if (!status && !ocfs2_recovery_completed(osb)) {
> > 		mutex_unlock(&osb->recovery_lock);
> > 		goto restart;
> > 	}
> >
> >
> One more question: do we make sure dlm_recovery_thread always runs before
> ocfs2_recovery_thread?
>
>
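The behaviour described in the quoted reply (a recovery thread keeps trying to take the super lock instead of bailing out, and a node that finds a clean journal skips recovery) can be illustrated with a small user-space sketch. This is not the ocfs2 code and not a proposed patch: `fake_super_lock`, `fake_super_lock_take`, `fake_super_lock_release` and `recover_with_retry` are hypothetical names made up for illustration, and the lock here fails a fixed number of times with -EAGAIN only to exercise the retry path.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the cluster-wide super lock. It fails with
 * -EAGAIN a fixed number of times, then succeeds. Not ocfs2_super_lock(). */
struct fake_super_lock {
    int failures_left;  /* transient failures still to inject */
    int held;           /* 1 while the lock is held */
};

static int fake_super_lock_take(struct fake_super_lock *l)
{
    if (l->failures_left > 0) {
        l->failures_left--;
        return -EAGAIN;
    }
    l->held = 1;
    return 0;
}

static void fake_super_lock_release(struct fake_super_lock *l)
{
    l->held = 0;
}

/* Retry until the lock is taken, then recover only if the journal is still
 * dirty. Returns the number of lock attempts; *did_recover is set to 1 if
 * this caller performed the recovery, 0 if it found a clean journal. */
static int recover_with_retry(struct fake_super_lock *l, int *journal_dirty,
                              int *did_recover)
{
    int attempts = 0;

    assert(l && journal_dirty && did_recover);
    *did_recover = 0;

    for (;;) {
        attempts++;
        if (fake_super_lock_take(l) == 0)
            break;
        /* A real implementation would log the error and back off here
         * rather than spin; the sketch simply tries again. */
    }

    if (*journal_dirty) {
        *journal_dirty = 0;   /* stands in for replaying the journal */
        *did_recover = 1;
    }

    fake_super_lock_release(l);
    return attempts;
}
```

With two injected failures, the first caller needs three attempts and performs the recovery; a second caller then takes the lock on its first attempt, sees a clean journal, and skips recovery.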