[Ocfs2-devel] ocfs2: a dead lock case when running dirop_fileop_racer

Thu Nov 7 18:19:55 PST 2013

On 2013/11/7 21:19, Joel Becker wrote:
> On Thu, Nov 07, 2013 at 08:12:02PM +0800, Joseph Qi wrote:
>> We ran the ocfs2 test program dirop_fileop_racer and found a deadlock case.
>>
>> The case is described below.
>> Two nodes, say Node A and Node B, mount the same ocfs2 volume. Create
>> /race/16/1 in the filesystem, and assume the inode number of dir 16 is
>> less than the inode number of dir race.
>>
>> Node A                            Node B
>> mv /race/16/1 /race/
>>                                   right after Node A has got the
>>                                   EX mode of /race/16/, and tries to
>>                                   get EX mode of /race/
>>                                   ls /race/16/
>>
>> In this case, Node A has got the EX mode of /race/16/, and wants to get
>> EX mode of /race/. Node B has got the PR mode of /race/, and wants to
>> get the PR mode of /race/16/. Since EX and PR are mutually exclusive,
>> neither request can be granted and the two nodes deadlock.
> 
> Interesting.  What DLM are you using, o2dlm or fs/dlm?  I would expect
> that fs/dlm would do deadlock detection, but I could be wrong.
>
We are using the ocfs2 DLM (o2dlm).

> There's no way the PR on /race/ will downconvert, because there is a
> reference.  We really want a signal to that PR waiting on /race/16/, but
> there's no in-progress work happening on node A for that.
> 
Could a timeout resolve this issue?
A quick thought: once the timeout expires, cancel the queued lock request
and let the lock operation fail.
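
To make the timeout idea concrete, here is a rough userspace sketch in
Python. It is not o2dlm code: threading.Lock stands in for a cluster lock
and has no PR/EX modes, and the names lock_pair_with_timeout and simulate
are made up for illustration. It only shows that if one waiter gives up
after a timeout and drops what it holds, the other side can proceed.

```python
import threading

def lock_pair_with_timeout(first, second, timeout, both_hold_first):
    """Take `first`, then wait at most `timeout` seconds for `second`.

    Returns True if both locks were taken, False if the second request
    timed out and was abandoned (hypothetical helper, not a DLM API).
    """
    with first:
        # Wait until the other node also holds its first lock, so the
        # racy interleaving from the report is reproduced every time.
        both_hold_first.wait()
        if not second.acquire(timeout=timeout):
            # Timed out: fail the operation. Leaving the `with` block
            # releases `first`, which unblocks the other node.
            return False
        second.release()
        return True

def simulate(timeout_a, timeout_b):
    """Node A locks /race/16/ then /race/; Node B does the reverse."""
    race, dir16 = threading.Lock(), threading.Lock()
    barrier = threading.Barrier(2)
    results = {}

    def node(name, first, second, timeout):
        results[name] = lock_pair_with_timeout(first, second, timeout, barrier)

    a = threading.Thread(target=node, args=("A", dir16, race, timeout_a))
    b = threading.Thread(target=node, args=("B", race, dir16, timeout_b))
    a.start(); b.start()
    a.join(); b.join()
    return results
```

With a short timeout on Node B and a long one on Node A, e.g.
simulate(5.0, 0.2), the ls side fails and the rename side completes. In
the real filesystem the failed operation would have to return an error or
be retried, which this sketch does not model.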

> I suppose we could hack this to check for ancestors.  That is, rename
> locks should be in ancestor order before trying inode number order.  I'm
> not sure that always works, though, especially if the ancestors are not
> consecutive and might also be affected by in-flight moves...
> 
> Joel
>
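For the ancestor-order idea, a minimal sketch of how I understand it,
again in Python and purely illustrative. The parent and inode maps, the
inode numbers, and the function names are assumptions for the example. It
works on a fixed snapshot of the directory tree, so it does not address
the in-flight move problem you mention.

```python
def is_ancestor(a, b, parent):
    """True if directory `a` is a proper ancestor of directory `b`."""
    while b in parent:
        b = parent[b]
        if b == a:
            return True
    return False

def rename_lock_order(d1, d2, parent, inode):
    """Return the two directories in the order they should be locked.

    An ancestor is locked before its descendant; unrelated directories
    fall back to ascending inode-number order.
    """
    if is_ancestor(d1, d2, parent):
        return [d1, d2]
    if is_ancestor(d2, d1, parent):
        return [d2, d1]
    return sorted([d1, d2], key=lambda d: inode[d])

# Snapshot of the tree from the report (inode numbers are made up;
# only "dir 16 < dir race" matters).
parent = {"/race/16": "/race", "/race": "/"}
inode = {"/race": 200, "/race/16": 100}
order = rename_lock_order("/race/16", "/race", parent, inode)
```

Here order is ["/race", "/race/16"], whereas plain inode order would lock
/race/16 first. Locking /race first matches the parent-then-child order
that ls takes in the scenario above, so the cycle would not form in this
particular case.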