[Ocfs2-devel] ocfs2: a dead lock case when running dirop_fileop_racer
Joseph Qi
joseph.qi at huawei.com
Thu Nov 7 18:19:55 PST 2013
On 2013/11/7 21:19, Joel Becker wrote:
> On Thu, Nov 07, 2013 at 08:12:02PM +0800, Joseph Qi wrote:
>> We ran the ocfs2 test program dirop_fileop_racer and found a deadlock case.
>>
>> The case is described below.
>> Two nodes, say Node A and Node B, mount the same ocfs2 volume. Create
>> /race/16/1 in the filesystem, and suppose the inode number of dir 16 is
>> less than the inode number of dir race.
>>
>> Node A                          Node B
>> mv /race/16/1 /race/
>>                                 right after Node A has got the
>>                                 EX mode of /race/16/, and tries to
>>                                 get EX mode of /race/
>>                                 ls /race/16/
>>
>> In this case, Node A has got the EX mode of /race/16/, and wants to get
>> EX mode of /race/. Node B has got the PR mode of /race/, and wants to
>> get the PR mode of /race/16/. Since EX and PR are mutually exclusive,
>> neither request can be granted and a deadlock occurs.
>
> Interesting. What DLM are you using, o2dlm or fs/dlm? I would expect
> that fs/dlm would do deadlock detection, but I could be wrong.
>
We are using o2dlm (the ocfs2 DLM).
> There's no way the PR on /race/ will downconvert, because there is a
> reference. We really want a signal to that PR waiting on /race/16/, but
> there's no in-progress work happening on node A for that.
>
Could a timeout resolve this issue?
A first thought: once the timeout expires, cancel the queued lock request
and let the lock operation fail.
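
To make the timeout idea concrete, here is a minimal user-space sketch, not
o2dlm code. It assumes plain Python threading.Lock objects standing in for
the cluster locks on /race/ and /race/16/ (so EX and PR are both modelled as
exclusive), and the names run_op and simulate are hypothetical. Node A is
given a short timeout so that it is the one that gives up:

```python
import threading

def run_op(name, first, second, timeout, barrier, results):
    """Hold `first`, then try `second`, giving up after `timeout` seconds."""
    with first:
        barrier.wait()  # both nodes now hold their first lock
        if second.acquire(timeout=timeout):
            second.release()
            results[name] = "ok"
        else:
            # Timed out: abandon the queued request and fail the operation.
            results[name] = "timed out"
    # Leaving the `with` block drops the first lock, so the peer can proceed.

def simulate():
    race, dir16 = threading.Lock(), threading.Lock()
    barrier = threading.Barrier(2)
    results = {}
    # Node A (mv): holds /race/16/, wants /race/; short timeout.
    a = threading.Thread(target=run_op,
                         args=("A", dir16, race, 0.2, barrier, results))
    # Node B (ls): holds /race/, wants /race/16/; long timeout.
    b = threading.Thread(target=run_op,
                         args=("B", race, dir16, 5.0, barrier, results))
    a.start(); b.start()
    a.join(); b.join()
    return results

if __name__ == "__main__":
    print(simulate())
```

In this sketch the rename on Node A fails and the ls on Node B then
completes. Whether failing the rename is acceptable to callers, and how a
queued request would actually be cancelled in the DLM, are open questions
the sketch does not answer.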
> I suppose we could hack this to check for ancestors. That is, rename
> locks should be in ancestor order before trying inode number order. I'm
> not sure that always works, though, especially if the ancestors are not
> consecutive and might also be affected by in-flight moves...
>
> Joel
>
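
The ancestor-order idea above can be sketched as follows. This is only an
illustration under stated assumptions: the directory tree is a static
child-to-parent map of inode numbers, the inode numbers (1, 100, 200) are
made up, and is_ancestor and lock_order are hypothetical helpers. It does
not handle the in-flight moves Joel mentions.

```python
def is_ancestor(parent, a, b):
    """Return True if directory inode `a` is a proper ancestor of `b`."""
    node = parent.get(b)
    while node is not None:
        if node == a:
            return True
        node = parent.get(node)
    return False

def lock_order(parent, x, y):
    """Order two directory inodes: ancestor first, else lower inode first."""
    if is_ancestor(parent, x, y):
        return (x, y)
    if is_ancestor(parent, y, x):
        return (y, x)
    return (min(x, y), max(x, y))

# Hypothetical layout: root is inode 1, /race/ is 200, /race/16/ is 100.
parent = {200: 1, 100: 200}

# Pure inode order would lock 100 (/race/16/) before 200 (/race/).
# Ancestor order locks /race/ first, matching the parent-then-child
# order that the ls on Node B uses.
print(lock_order(parent, 100, 200))  # (200, 100)
```

For two unrelated directories the function falls back to inode number
order, so the existing behaviour is kept wherever no ancestor relation
exists.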