[Ocfs2-devel] ocfs2: Question for ocfs2_recovery_thread

Thu May 23 02:37:14 PDT 2013

On 2013/5/23 7:00, Sunil Mushran wrote:
> True. The function could do with a little bit of cleanup. Feel free to 
> send a patch.

From the ocfs2 code, I could not find anything that guarantees
dlm_recovery_thread always runs prior to ocfs2_recovery_thread.
Could you please explain? Thanks.
>
>
> On Sun, May 19, 2013 at 7:49 PM, Joseph Qi <joseph.qi at huawei.com 
> <mailto:joseph.qi at huawei.com>> wrote:
>
>     On 2013/5/19 10:25, Joseph Qi wrote:
>     > On 2013/5/18 21:26, Sunil Mushran wrote:
>     >> The first node that gets the lock will do the actual recovery.
>     >> The others will get the lock and see a clean journal and skip the
>     >> recovery. A thread should never error out if it fails to get the
>     >> lock. It should try and try again.
>     >>
>     >> On May 17, 2013, at 11:27 PM, Joseph Qi <joseph.qi at huawei.com
>     >> <mailto:joseph.qi at huawei.com>> wrote:
>     >>
>     >>> Hi,
>     >>> Once a node goes down in the cluster, ocfs2_recovery_thread will be
>     >>> triggered on each node. These threads then recover the down node
>     >>> by taking the super lock.
>     >>> I have several questions on this:
>     >>> 1) Why does each node have to run such a thread? We know that, in
>     >>> the end, one node gets the super lock and does the actual recovery.
>     >>> 2) If this thread is running and an error occurs, for example
>     >>> ocfs2_super_lock fails, the thread will exit without clearing the
>     >>> recovery map. Will that leave other threads still waiting for
>     >>> recovery in ocfs2_wait_for_recovery?
>     >>>
>     >>
>     >>
>     > But when an error occurs, the code goes to bail and the restart
>     > logic will not run. The code looks like this:
>     > ...
>     >       status = ocfs2_wait_on_mount(osb);
>     >       if (status < 0) {
>     >               goto bail;
>     >       }
>     >
>     >       rm_quota = kzalloc(osb->max_slots * sizeof(int), GFP_NOFS);
>     >       if (!rm_quota) {
>     >               status = -ENOMEM;
>     >               goto bail;
>     >       }
>     > restart:
>     >       status = ocfs2_super_lock(osb, 1);
>     >       if (status < 0) {
>     >               mlog_errno(status);
>     >               goto bail;
>     >       }
>     > ...
>     >       if (!status && !ocfs2_recovery_completed(osb)) {
>     >               mutex_unlock(&osb->recovery_lock);
>     >               goto restart;
>     >       }
>     >
>     >
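To illustrate the point above (retry the lock instead of going to bail), here is a minimal user-space sketch in C. It is not the ocfs2 code and not a proposed patch: struct fake_cluster, fake_super_lock(), fake_super_unlock() and recover_with_retry() are hypothetical names made up for this example, and the max_attempts bound is an assumption added only so the sketch terminates. The suggestion in the thread is simply to "try and try again".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for cluster state; not an ocfs2 structure. */
struct fake_cluster {
	int lock_failures_left;	/* lock attempts that fail before one succeeds */
	bool journal_dirty;	/* true until some node replays the dead node's journal */
	int lock_attempts;	/* number of times the lock was requested */
	int replays;		/* number of times the journal was actually replayed */
};

/* Hypothetical stand-in for taking the super lock: fails a few times, then succeeds. */
static int fake_super_lock(struct fake_cluster *c)
{
	c->lock_attempts++;
	if (c->lock_failures_left > 0) {
		c->lock_failures_left--;
		return -EAGAIN;
	}
	return 0;
}

/* Hypothetical stand-in for dropping the super lock; nothing to do here. */
static void fake_super_unlock(struct fake_cluster *c)
{
	(void)c;
}

/*
 * Retry the lock on failure instead of jumping straight to bail.
 * Returns 0 once the journal is known to be clean, or the last lock
 * error if max_attempts (an assumption of this sketch) is exhausted.
 */
static int recover_with_retry(struct fake_cluster *c, int max_attempts)
{
	int status = -EAGAIN;

	assert(max_attempts > 0);

	for (int attempt = 0; attempt < max_attempts; attempt++) {
		status = fake_super_lock(c);
		if (status < 0)
			continue;	/* lock failed: try again rather than bail */

		if (c->journal_dirty) {
			/* First lock holder does the actual recovery. */
			c->journal_dirty = false;
			c->replays++;
		}
		/* Later lock holders see a clean journal and skip recovery. */

		fake_super_unlock(c);
		return 0;
	}

	return status;
}
```

With two simulated lock failures, recover_with_retry() succeeds on the third attempt and replays the journal once. A second call on the same state takes the lock, finds the journal clean and skips the replay. Real code would presumably also need to wait between attempts and decide which errors are worth retrying; both are left out of this sketch.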
>     > _______________________________________________
>     > Ocfs2-devel mailing list
>     > Ocfs2-devel at oss.oracle.com <mailto:Ocfs2-devel at oss.oracle.com>
>     > https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>     >
>     >
>     One more question: do we make sure dlm_recovery_thread always runs
>     prior to ocfs2_recovery_thread?
>
