[Ocfs2-devel] [PATCH v2] ocfs2: fix a potential deadlock in dlm_reset_mleres_owner()

Changwei Ge ge.changwei at h3c.com
Thu Dec 21 18:34:13 PST 2017


On 2017/12/21 14:36, alex chen wrote:
> Hi Joseph,
> 
> On 2017/12/21 9:30, Joseph Qi wrote:
>> Hi Alex,
>>
>> On 17/12/21 08:55, alex chen wrote:
>>> In dlm_reset_mleres_owner(), we will lock
>>> dlm_lock_resource->spinlock after locking dlm_ctxt->master_lock,
>>> which breaks the spinlock lock ordering:
>>>   dlm_domain_lock
>>>   struct dlm_ctxt->spinlock
>>>   struct dlm_lock_resource->spinlock
>>>   struct dlm_ctxt->master_lock
>>>
>>> Fix it by unlocking dlm_ctxt->master_lock before locking
>>> dlm_lock_resource->spinlock and restarting to clean master list.
>>>
>>> Signed-off-by: Alex Chen <alex.chen at huawei.com>
>>> Reviewed-by: Jun Piao <piaojun at huawei.com>
>>> ---
>>>   fs/ocfs2/dlm/dlmmaster.c | 14 ++++++++++----
>>>   1 file changed, 10 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
>>> index 3e04279..d83ccdc 100644
>>> --- a/fs/ocfs2/dlm/dlmmaster.c
>>> +++ b/fs/ocfs2/dlm/dlmmaster.c
>>> @@ -3287,16 +3287,22 @@ static struct dlm_lock_resource *dlm_reset_mleres_owner(struct dlm_ctxt *dlm,
>>>   {
>>>   	struct dlm_lock_resource *res;
>>>
>>> +	assert_spin_locked(&dlm->spinlock);
>>> +	assert_spin_locked(&dlm->master_lock);
>>> +
>>>   	/* Find the lockres associated to the mle and set its owner to UNK */
>>> -	res = __dlm_lookup_lockres(dlm, mle->mname, mle->mnamelen,
>>> +	res = __dlm_lookup_lockres_full(dlm, mle->mname, mle->mnamelen,
>>>   				   mle->mnamehash);
>>>   	if (res) {
>>>   		spin_unlock(&dlm->master_lock);
>>>
>>> -		/* move lockres onto recovery list */
>>>   		spin_lock(&res->spinlock);
>>> -		dlm_set_lockres_owner(dlm, res, DLM_LOCK_RES_OWNER_UNKNOWN);
>>> -		dlm_move_lockres_to_recovery_list(dlm, res);
>>> +		if (!(res->state & DLM_LOCK_RES_DROPPING_REF)) {
>>> +			/* move lockres onto recovery list */
>>> +			dlm_set_lockres_owner(dlm, res, DLM_LOCK_RES_OWNER_UNKNOWN);
>>> +			dlm_move_lockres_to_recovery_list(dlm, res);
>>> +		}
>>> +
>> I don't think this change is lock re-ordering *only*. It definitely
>> changes the logic of resetting mle resource owner.
>> Why do you detach mle from heartbeat if lock resource is in the process
>> of dropping its mastery reference? And why we have to restart in this
>> case?
> I think if the lock resource is being purge we don't need to set its owner to UNKNOWN and
> it is the same as the original logic. We should drop the master lock if we want to judge
> if the state of the lock resource is DLM_LOCK_RES_DROPPING_REF. Once we drop the master lock
> we should restart to clean master list.
> Here the mle is not useful and will be released, so we detach it from heartbeat.
> In fact, the mle has been detached in dlm_clean_migration_mle().
Hi Alex,

Perhaps, you can just judge if lock resource is marked  DLM_LOCK_RES_DROPPING_REF and if so directly
return NULL with ::master_lock locked :)

Thanks,
Changwei

> 
> Thanks,
> Alex
>>
>> Thanks,
>> Joseph
>>
>>>   		spin_unlock(&res->spinlock);
>>>   		dlm_lockres_put(res);
>>>
>>
>> .
>>
> 
> 
> _______________________________________________
> Ocfs2-devel mailing list
> Ocfs2-devel at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
> 




More information about the Ocfs2-devel mailing list