<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
On 2013/5/23 7:00, Sunil Mushran wrote:
<blockquote
cite="mid:CAEeiSHVmHJ55yWt26BgLL3aV7Mg+vnmKSqf2=HTErNR7YqziXw@mail.gmail.com"
type="cite">
<div dir="ltr">True. The function could do with a little bit of
cleanup. Feel free to send a patch.<br>
</div>
</blockquote>
<br>
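For reference, here is a rough sketch of the retry behavior suggested earlier in the thread (keep trying when the lock cannot be taken, instead of jumping to bail).<br>
It is only a userspace model, not the kernel code: mock_super_lock, mock_super_unlock, recover_with_retry and the max_attempts bound are hypothetical names introduced for illustration, and a real patch would also need to sleep or back off between attempts.<br>
<pre>
```c
#include <stdio.h>

/* Hypothetical stand-in for ocfs2_super_lock(): it fails for the first
 * `failures_left` calls and succeeds afterwards. Not the kernel API. */
static int failures_left = 3;

static int mock_super_lock(void)
{
    if (failures_left > 0) {
        failures_left--;
        return -1;      /* simulated transient failure */
    }
    return 0;           /* lock taken */
}

/* Hypothetical stand-in for ocfs2_super_unlock(); nothing to release here. */
static void mock_super_unlock(void)
{
}

/* Retry taking the lock instead of bailing out on the first error.
 * Returns the number of attempts used, or -1 if max_attempts is exhausted.
 * The bound exists only so this model terminates; it is an assumption. */
static int recover_with_retry(int max_attempts)
{
    int attempts = 0;

    while (attempts < max_attempts) {
        attempts++;
        if (mock_super_lock() < 0)
            continue;   /* kernel code would sleep or back off here */

        /* The actual node recovery would run here, under the lock. */
        mock_super_unlock();
        return attempts;
    }
    return -1;
}
```
</pre>
With failures_left set to 3, recover_with_retry(10) returns 4: three failed attempts followed by one that succeeds.<br>
<br>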
From the ocfs2 code, I could not find anything that ensures dlm_recovery_thread always runs prior to<br>
ocfs2_recovery_thread. Could you please tell me? Thanks.<br>
<blockquote
cite="mid:CAEeiSHVmHJ55yWt26BgLL3aV7Mg+vnmKSqf2=HTErNR7YqziXw@mail.gmail.com"
type="cite">
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On Sun, May 19, 2013 at 7:49 PM, Joseph
Qi <span dir="ltr"><<a moz-do-not-send="true"
href="mailto:joseph.qi@huawei.com" target="_blank">joseph.qi@huawei.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="HOEnZb">
<div class="h5">On 2013/5/19 10:25, Joseph Qi wrote:<br>
> On 2013/5/18 21:26, Sunil Mushran wrote:<br>
>> The first node that gets the lock will do the
actual recovery. The others will get the lock and see a
clean journal and skip the recovery. A thread should
never error out if it fails to get the lock. It should
try and try again.<br>
>><br>
>> On May 17, 2013, at 11:27 PM, Joseph Qi <<a
moz-do-not-send="true"
href="mailto:joseph.qi@huawei.com">joseph.qi@huawei.com</a>>
wrote:<br>
>><br>
>>> Hi,<br>
>>> Once a node goes down in the cluster,
ocfs2_recovery_thread will be<br>
>>> triggered on each node. These threads then
perform recovery for the down node by<br>
>>> taking the super lock.<br>
>>> I have several questions on this:<br>
>>> 1) Why does each node have to run such a thread?
We know that, in the end, one node will<br>
>>> get the super lock and do the actual
recovery.<br>
>>> 2) If this thread is running and an
error occurs, for example taking<br>
>>> ocfs2_super_lock fails, the
thread will exit without<br>
>>> clearing the recovery map. Will that leave other
threads still waiting for<br>
>>> recovery in ocfs2_wait_for_recovery?<br>
>>><br>
>><br>
>><br>
> But when an error occurs, the code goes to bail and the
restart logic will not<br>
> run. The code looks like this:<br>
> ...<br>
> status = ocfs2_wait_on_mount(osb);<br>
> if (status < 0) {<br>
> goto bail;<br>
> }<br>
><br>
> rm_quota = kzalloc(osb->max_slots *
sizeof(int), GFP_NOFS);<br>
> if (!rm_quota) {<br>
> status = -ENOMEM;<br>
> goto bail;<br>
> }<br>
> restart:<br>
> status = ocfs2_super_lock(osb, 1);<br>
> if (status < 0) {<br>
> mlog_errno(status);<br>
> goto bail;<br>
> }<br>
> ...<br>
> if (!status &&
!ocfs2_recovery_completed(osb)) {<br>
>
mutex_unlock(&osb->recovery_lock);<br>
> goto restart;<br>
> }<br>
><br>
><br>
</div>
</div>
> _______________________________________________<br>
> Ocfs2-devel mailing list<br>
> <a moz-do-not-send="true"
href="mailto:Ocfs2-devel@oss.oracle.com">Ocfs2-devel@oss.oracle.com</a><br>
> <a moz-do-not-send="true"
href="https://oss.oracle.com/mailman/listinfo/ocfs2-devel"
target="_blank">https://oss.oracle.com/mailman/listinfo/ocfs2-devel</a><br>
><br>
><br>
One more question: do we make sure that dlm_recovery_thread
always runs prior to<br>
ocfs2_recovery_thread?<br>
<br>
</blockquote>
</div>
<br>
</div>
<br>
</blockquote>
<br>
</body>
</html>