[Ocfs2-devel] [patch 04/11] ocfs2: fix a tiny race when running dirop_fileop_racer
Mark Fasheh
mfasheh at suse.de
Wed Feb 5 15:31:06 PST 2014
On Fri, Jan 24, 2014 at 12:47:03PM -0800, akpm at linux-foundation.org wrote:
> From: Yiwen Jiang <jiangyiwen at huawei.com>
> Subject: ocfs2: fix a tiny race when running dirop_fileop_racer
>
> When running dirop_fileop_racer we found a deadlock case.
>
> 2 nodes, say Node A and Node B, mount the same ocfs2 volume. Create
> /race/16/1 in the filesystem, and let the inode number of dir 16 be less
> than the inode number of dir race.
>
> Node A                          Node B
> mv /race/16/1 /race/
>                                 right after Node A has got the
>                                 EX mode of /race/16/, and tries to
>                                 get EX mode of /race
>                                 ls /race/16/
>
> In this case, Node A has got the EX mode of /race/16/, and wants to get EX
> mode of /race/. Node B has got the PR mode of /race/, and wants to get
> the PR mode of /race/16/. Since EX and PR are mutually exclusive, a
> deadlock happens.
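
For reference, the scenario in the quoted description can be modeled as a wait-for graph. This is only a minimal Python sketch under stated assumptions: it is not ocfs2 or DLM code, the names `COMPATIBLE`, `blocked_by` and `has_cycle` are hypothetical, and it takes the claimed lock state (Node B holding PR on /race while requesting PR on /race/16) as given rather than verifying it.

```python
# Hypothetical model of the reported scenario; not ocfs2 code.
# Lock modes: PR (protected read, shared) and EX (exclusive).
COMPATIBLE = {
    ("PR", "PR"): True,
    ("PR", "EX"): False,
    ("EX", "PR"): False,
    ("EX", "EX"): False,
}


def blocked_by(held, wanted):
    """Build wait-for edges: node -> set of nodes currently blocking it.

    held:   {node: {resource: mode}} locks already granted
    wanted: {node: (resource, mode)} the one request each node waits on
    """
    edges = {}
    for node, (res, mode) in wanted.items():
        edges[node] = {
            other
            for other, locks in held.items()
            if other != node
            and res in locks
            and not COMPATIBLE[(locks[res], mode)]
        }
    return edges


def has_cycle(edges):
    """Detect a cycle in the wait-for graph with a depth-first search."""
    visiting, done = set(), set()

    def visit(n):
        if n in done:
            return False
        if n in visiting:
            return True
        visiting.add(n)
        found = any(visit(m) for m in edges.get(n, ()))
        visiting.discard(n)
        done.add(n)
        return found

    return any(visit(n) for n in edges)


# Lock state as claimed in the patch description (an assumption here).
held = {"A": {"/race/16": "EX"}, "B": {"/race": "PR"}}
wanted = {"A": ("/race", "EX"), "B": ("/race/16", "PR")}

edges = blocked_by(held, wanted)
deadlocked = has_cycle(edges)
print(edges, deadlocked)
```

The cycle only exists if Node B really holds PR on /race while it waits for /race/16. If that first lock is dropped before the second is requested, Node A's EX request on /race is no longer blocked.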
I am confused as to how this race happens.
Something like 'ls /race/16' shouldn't hold locks on 'race' and '16' at the
same time. It should look more like:
<userspace does readdir /race/16>
PR race
<kernel looks up '16' in 'race'>
Unlock PR race
PR 16
<get dirents from '16'>
Unlock PR 16
<return dirents to userspace>
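
To make that expectation concrete, here is a minimal Python sketch that replays the sequence above and counts how many cluster locks are held at once. It is only an illustration of the ordering described in this mail, not a trace of the actual ocfs2 code paths, and `readdir_lock_trace` and `max_locks_held` are hypothetical names.

```python
def readdir_lock_trace(parent, child):
    """Lock events for listing `child` inside `parent`, in the expected order."""
    return [
        ("lock", "PR", parent),    # kernel looks up child in parent
        ("unlock", "PR", parent),
        ("lock", "PR", child),     # get dirents from child
        ("unlock", "PR", child),
    ]


def max_locks_held(trace):
    """Return the largest number of locks held simultaneously in a trace."""
    held, peak = set(), 0
    for op, _mode, resource in trace:
        if op == "lock":
            held.add(resource)
        else:
            held.discard(resource)
        peak = max(peak, len(held))
    return peak


trace = readdir_lock_trace("/race", "/race/16")
peak = max_locks_held(trace)
print(peak)
```

If the peak is 1, the reader never waits for one directory lock while holding the other, so it cannot be one half of the hold-and-wait pair the patch description relies on.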
Can you please explain where I may be going wrong? Also an strace of the
locked up 'ls' as well as the output of sysrq-t when it's deadlocked would
help show what's going on.
--Mark
--
Mark Fasheh