[Ocfs2-devel] [PATCH] ocfs2/dlm: wait for dlm recovery done when migrating all lockres
Changwei Ge
ge.changwei at h3c.com
Wed Nov 1 01:11:04 PDT 2017
Hi Jun,
Thanks for reviewing.
I don't think we have to worry about misusing *dlm_domain_lock* and
*dlm::spinlock*. I admit my change may look a little different from most
of other code snippets where using these two spin locks. But our purpose
is to close the race between __dlm_hb_node_down and
dlm_unregister_domain, right? And my change meets that. :-)
I suppose we can do it in a flexible way.
Thanks,
Changwei
On 2017/11/1 15:57, piaojun wrote:
> Hi Changwei,
>
> thanks for reviewing, and I think waiting for recoverying done before
> migrating seems another solution, but I wonder if new problems will be
> invoked as following comments.
>
> thanks,
> Jun
>
> On 2017/11/1 15:13, Changwei Ge wrote:
>> Hi Jun,
>>
>> I probably get your point.
>>
>> You mean that dlm finds no lock resource to be migrated and no more lock
>> resource is managed by its hash table. After that a node dies all of a
>> sudden and the dead node is put into dlm's recovery map, right?
> that is it.
>> Furthermore, a lock resource is migrated from other nodes and local node
>> has already asserted master to them.
>>
>> If so, I want to suggest a easier way to solve it.
>> We don't have to add a new flag to dlm structure, just leverage existed
>> dlm status and bitmap.
>> It will bring a bonus - no lock resource will be marked with RECOVERING,
>> it's a safer way, I suppose.
>>
>> Please take a review.
>>
>> Thanks,
>> Changwei
>>
>>
>> Subject: [PATCH] ocfs2/dlm: a node can't be involved in recovery if it
>> is being shutdown
>>
>> Signed-off-by: Changwei Ge <ge.changwei at h3c.com>
>> ---
>> fs/ocfs2/dlm/dlmdomain.c | 4 ++++
>> fs/ocfs2/dlm/dlmrecovery.c | 3 +++
>> 2 files changed, 7 insertions(+)
>>
>> diff --git a/fs/ocfs2/dlm/dlmdomain.c b/fs/ocfs2/dlm/dlmdomain.c
>> index a2b19fbdcf46..5e9283e509a4 100644
>> --- a/fs/ocfs2/dlm/dlmdomain.c
>> +++ b/fs/ocfs2/dlm/dlmdomain.c
>> @@ -707,11 +707,15 @@ void dlm_unregister_domain(struct dlm_ctxt *dlm)
>> * want new domain joins to communicate with us at
>> * least until we've completed migration of our
>> * resources. */
>> + spin_lock(&dlm->spinlock);
>> dlm->dlm_state = DLM_CTXT_IN_SHUTDOWN;
>> + spin_unlock(&dlm->spinlock);
> I guess there will be misuse of 'dlm->spinlock' and dlm_domain_lock.
>> leave = 1;
>> }
>> spin_unlock(&dlm_domain_lock);
>>
>> + dlm_wait_for_recovery(dlm);
>> +
>> if (leave) {
>> mlog(0, "shutting down domain %s\n", dlm->name);
>> dlm_begin_exit_domain(dlm);
>> diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
>> index 74407c6dd592..764c95b2b35c 100644
>> --- a/fs/ocfs2/dlm/dlmrecovery.c
>> +++ b/fs/ocfs2/dlm/dlmrecovery.c
>> @@ -2441,6 +2441,9 @@ static void __dlm_hb_node_down(struct dlm_ctxt
>> *dlm, int idx)
>> {
>> assert_spin_locked(&dlm->spinlock);
>>
>> + if (dlm->dlm_state == DLM_CTXT_IN_SHUTDOWN)
>> + return;
>> +
> 'dlm->dlm_state' probably need be to protected by 'dlm_domain_lock'.
> and I wander if there is more work to be done when in
> 'DLM_CTXT_IN_SHUTDOWN'?
>> if (dlm->reco.new_master == idx) {
>> mlog(0, "%s: recovery master %d just died\n",
>> dlm->name, idx);
>>
>
More information about the Ocfs2-devel
mailing list