[Ocfs2-devel] [PATCH v2] ocfs2: fix a potential deadlock in dlm_reset_mleres_owner()
Changwei Ge
ge.changwei at h3c.com
Thu Dec 21 18:34:13 PST 2017
On 2017/12/21 14:36, alex chen wrote:
> Hi Joseph,
>
> On 2017/12/21 9:30, Joseph Qi wrote:
>> Hi Alex,
>>
>> On 17/12/21 08:55, alex chen wrote:
>>> In dlm_reset_mleres_owner(), we will lock
>>> dlm_lock_resource->spinlock after locking dlm_ctxt->master_lock,
>>> which breaks the spinlock lock ordering:
>>> dlm_domain_lock
>>> struct dlm_ctxt->spinlock
>>> struct dlm_lock_resource->spinlock
>>> struct dlm_ctxt->master_lock
>>>
>>> Fix it by unlocking dlm_ctxt->master_lock before locking
>>> dlm_lock_resource->spinlock and restarting to clean master list.
>>>
>>> Signed-off-by: Alex Chen <alex.chen at huawei.com>
>>> Reviewed-by: Jun Piao <piaojun at huawei.com>
>>> ---
>>> fs/ocfs2/dlm/dlmmaster.c | 14 ++++++++++----
>>> 1 file changed, 10 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
>>> index 3e04279..d83ccdc 100644
>>> --- a/fs/ocfs2/dlm/dlmmaster.c
>>> +++ b/fs/ocfs2/dlm/dlmmaster.c
>>> @@ -3287,16 +3287,22 @@ static struct dlm_lock_resource *dlm_reset_mleres_owner(struct dlm_ctxt *dlm,
>>> {
>>> struct dlm_lock_resource *res;
>>>
>>> + assert_spin_locked(&dlm->spinlock);
>>> + assert_spin_locked(&dlm->master_lock);
>>> +
>>> /* Find the lockres associated to the mle and set its owner to UNK */
>>> - res = __dlm_lookup_lockres(dlm, mle->mname, mle->mnamelen,
>>> + res = __dlm_lookup_lockres_full(dlm, mle->mname, mle->mnamelen,
>>> mle->mnamehash);
>>> if (res) {
>>> spin_unlock(&dlm->master_lock);
>>>
>>> - /* move lockres onto recovery list */
>>> spin_lock(&res->spinlock);
>>> - dlm_set_lockres_owner(dlm, res, DLM_LOCK_RES_OWNER_UNKNOWN);
>>> - dlm_move_lockres_to_recovery_list(dlm, res);
>>> + if (!(res->state & DLM_LOCK_RES_DROPPING_REF)) {
>>> + /* move lockres onto recovery list */
>>> + dlm_set_lockres_owner(dlm, res, DLM_LOCK_RES_OWNER_UNKNOWN);
>>> + dlm_move_lockres_to_recovery_list(dlm, res);
>>> + }
>>> +
>> I don't think this change is lock re-ordering *only*. It definitely
>> changes the logic of resetting mle resource owner.
>> Why do you detach mle from heartbeat if lock resource is in the process
>> of dropping its mastery reference? And why we have to restart in this
>> case?
> I think if the lock resource is being purge we don't need to set its owner to UNKNOWN and
> it is the same as the original logic. We should drop the master lock if we want to judge
> if the state of the lock resource is DLM_LOCK_RES_DROPPING_REF. Once we drop the master lock
> we should restart to clean master list.
> Here the mle is not useful and will be released, so we detach it from heartbeat.
> In fact, the mle has been detached in dlm_clean_migration_mle().
Hi Alex,
Perhaps, you can just judge if lock resource is marked DLM_LOCK_RES_DROPPING_REF and if so directly
return NULL with ::master_lock locked :)
Thanks,
Changwei
>
> Thanks,
> Alex
>>
>> Thanks,
>> Joseph
>>
>>> spin_unlock(&res->spinlock);
>>> dlm_lockres_put(res);
>>>
>>
>> .
>>
>
>
> _______________________________________________
> Ocfs2-devel mailing list
> Ocfs2-devel at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>
More information about the Ocfs2-devel
mailing list