[Ocfs2-devel] [PATCH] ocfs2: fix __ocfs2_cluster_lock() dead lock

Wengang Wang wen.gang.wang at oracle.com
Tue Jan 12 22:15:09 PST 2010


On 10-01-12 11:14, Sunil Mushran wrote:
> Wengang Wang wrote:
>> Is it a rule that if the lock is not currently taken it will return "go
>> ahead and downconvert"?
>
> In short, yes, because of fairness.
>
> Not having holders is one of the checks. If there are incompatible
> holders then it will requeue the lockres in the blocked queue. The
> lockres will be re-processed when the dc thread is kicked the
> next time. If there are no holders and the journal is checkpointed
> (or check_downconvert succeeds), then yes, it will downconvert.
>

Ok, got it.
>> Looking at ocfs2_check_meta_downconvert(), I think it ensures the cache is
>> checkpointed by JBD(2) before returning "go ahead and downconvert".
>> Yes, as you noticed, we can't have cache if we don't have EX, but I think
>> the main purpose is to ensure that we finish the things we should finish
>> before we release the lock. Further evidence is that it doesn't check
>> ex_holders.
>
> check_downconvert() is an additional check. The holders check tells the fs
> whether it can and whether it needs to down convert or not. If answer to
> both qs is yes, then it sees whether there are any lock type specific
> checks. M and T lock types want the journal checkpointed before a dc.

Yes, understood.
The holders check is done in ocfs2_unblock_lock(), and
check_downconvert() performs the lock-type-specific checks.

Whether the problem is a deadlock or a livelock depends on whether
check_downconvert() (if the lock type has one) refuses to dc under some
condition. If it refuses, it's a deadlock; otherwise, it's a livelock.
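
To make the two-stage decision concrete, here is a minimal user-space sketch
of how I understand it. It is only a model, not the real dlmglue code: the
struct layout, the level constants, decide() and meta_check_downconvert()
are all names invented for illustration. Only the contract of
check_downconvert() (0 means requeue, nonzero means continue) is taken from
the comment quoted in this thread, and the holders rule is my simplified
reading of the check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the real ocfs2 definitions. */
enum level  { LEVEL_NL = 0, LEVEL_PR = 1, LEVEL_EX = 2 };
enum action { REQUEUE = 0, DOWNCONVERT = 1 };

struct lockres;

struct lockres_ops {
    /* Same contract as the quoted comment: 0 = requeue, nonzero = continue. */
    int (*check_downconvert)(struct lockres *res, int new_level);
};

struct lockres {
    unsigned int ro_holders;          /* local PR holders */
    unsigned int ex_holders;          /* local EX holders */
    int journal_checkpointed;         /* stand-in for the journal state */
    const struct lockres_ops *ops;    /* may be NULL: no extra checks */
};

/* Assumed lock-type-specific check: wait until the journal is checkpointed. */
static int meta_check_downconvert(struct lockres *res, int new_level)
{
    (void)new_level;
    return res->journal_checkpointed;
}

static const struct lockres_ops meta_ops = {
    .check_downconvert = meta_check_downconvert,
};

/* Decide what to do when another node blocks us at level `blocking`. */
static enum action decide(struct lockres *res, enum level blocking)
{
    assert(res != NULL);

    /* Stage 1: generic holders check. A remote EX request conflicts with
     * any local holder; a remote PR request conflicts with EX holders. */
    if (blocking == LEVEL_EX && (res->ex_holders || res->ro_holders))
        return REQUEUE;
    if (blocking == LEVEL_PR && res->ex_holders)
        return REQUEUE;

    /* Highest level we could keep without conflicting with the request. */
    int new_level = (blocking == LEVEL_EX) ? LEVEL_NL : LEVEL_PR;

    /* Stage 2: optional lock-type-specific check. */
    if (res->ops && res->ops->check_downconvert &&
        !res->ops->check_downconvert(res, new_level))
        return REQUEUE;

    return DOWNCONVERT;
}
```

In this model a lockres with no holders is still requeued for as long as
journal_checkpointed stays 0. That is the case I mean above: if the
type-specific check never agrees, the requeue never ends.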

>
>> Another question: you can see that ocfs2_check_meta_downconvert()
>> checks whether the lock is taken (not explicitly). Can you conclude
>> that all check_downconvert() implementations do the same? --sorry, I
>> didn't check them and pushed it to you :P --I will check them too later.
>>
>> So, still the question: is it a rule that check_downconvert() should
>> return "go ahead and downconvert" if the lock is not currently taken?
>> Also, what does "currently taken" mean? --at the ocfs2 layer (after
>> ocfs2_cluster_lock() returns) or at the dlm level? I guess you meant
>> the ocfs2 layer.
>
> I am confused by your qs. Have you read the description of  
> check_downconvert()?
>
>
>        /*
>         * Allow a lock type to add checks to determine whether it is
>         * safe to downconvert a lock. Return 0 to re-queue the
>         * downconvert at a later time, nonzero to continue.
>         *
>         * For most locks, the default checks that there are no
>         * incompatible holders are sufficient.
>         *
>         * Called with the lockres spinlock held.
>         */
>        int (*check_downconvert)(struct ocfs2_lock_res *, int);
>
> Sunil
Got it, please ignore the above part.

regards,
wengang.


