[Ocfs2-devel] [PATCH] ocfs2/dlm: cancel the migration or redo deref to recovery master

Srinivas Eeda srinivas.eeda at oracle.com
Sat Jun 5 18:48:55 PDT 2010


On 6/3/2010 10:37 PM, Wengang Wang wrote:
> Srini,
>
> On 10-06-03 19:17, Srinivas Eeda wrote:
>   
>>>> Can you please explain the idea of the new flag
>>>> DLM_LOCK_RES_DE_DROP_REF :)
>>>>
>>>> If the idea of the fix is to address the race between purging and
>>>> recovery, I am wondering whether the DLM_LOCK_RES_DROPPING_REF and
>>>> DLM_LOCK_RES_RECOVERING flags may be enough to fix this problem.
>>>> dlm_move_lockres_to_recovery_list moves the lockres to the resources
>>>> list (which tracks resources that need recovery) and sets the flag
>>>> DLM_LOCK_RES_RECOVERING. If we do not call
>>>> dlm_move_lockres_to_recovery_list for resources that have
>>>> DLM_LOCK_RES_DROPPING_REF set, they will not get migrated. In that
>>>> case DLM_LOCK_RES_RECOVERING will not get set, the recovery
>>>> master won't know about the lockres, and a lockres that is in the
>>>> middle of purging will get purged.
>>>>
>>>> Lockres that got moved to the resources list will get
>>>> migrated. In that case the lockres has the DLM_LOCK_RES_RECOVERING flag
>>>> set, so dlm_purge_list should consider it as being in use and should
>>>> defer purging. The lockres will get recovered, the new owner will
>>>> be set, and the DLM_LOCK_RES_RECOVERING flag will get cleared.
>>>> dlm_purge_list can then go ahead and purge this lockres.
>>>>
>>>>         
>>> I am following your idea. An addition to your idea is that we also notice
>>> that we shouldn't send the DEREF request to the recovery master if we
>>> don't migrate the lockres to the recovery master (otherwise, another
>>> BUG() is triggered). DLM_LOCK_RES_DE_DROP_REF is for that purpose.
>>> When we skip migrating a lockres, we set this state.
>>>
>>>       
>> The only case where we don't migrate the lockres is when it's dropping
>> the reference, right (when DLM_LOCK_RES_DROPPING_REF is set)? In that
>> case we just unhash and free the lockres.
>>     
>
> How do you determine whether it's in "that case" in code? I determine that by
> checking the DLM_LOCK_RES_DE_DROP_REF state.
>   
If DLM_LOCK_RES_DROPPING_REF is not set for the lockres, then it can get
migrated (even if it's on the purge list).
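
As a rough sketch of the two checks discussed above (plain userspace C, not the actual dlm code; the struct, the helper names and the bit values are made up for illustration, and the real definitions live in the ocfs2 dlm headers):

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values only (assumption, not copied from the kernel). */
#define DLM_LOCK_RES_RECOVERING   0x02u
#define DLM_LOCK_RES_DROPPING_REF 0x40u

/* Hypothetical stand-in for the real lock resource structure. */
struct lockres_sketch {
    uint16_t state;
};

/*
 * Recovery side: a lockres whose reference is already being dropped is
 * left alone, so it is never handed to the recovery master.
 */
bool sketch_should_move_to_recovery(const struct lockres_sketch *res)
{
    return !(res->state & DLM_LOCK_RES_DROPPING_REF);
}

/*
 * Purge side: a lockres that is being recovered counts as in use, so
 * purging is deferred until the RECOVERING flag has been cleared.
 */
bool sketch_can_purge_now(const struct lockres_sketch *res)
{
    return !(res->state & DLM_LOCK_RES_RECOVERING);
}
```

With only these two flags, a lockres in the middle of a purge is skipped by recovery and gets purged, while a lockres already queued for recovery is purged only after recovery finishes. Locking around the state field is left out of the sketch.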
> regards,
> wengang.
>   