[Ocfs2-devel] [patch 04/11] ocfs2: fix a tiny race when running dirop_fileop_racer

Mark Fasheh mfasheh at suse.de
Wed Feb 5 15:31:06 PST 2014


On Fri, Jan 24, 2014 at 12:47:03PM -0800, akpm at linux-foundation.org wrote:
> From: Yiwen Jiang <jiangyiwen at huawei.com>
> Subject: ocfs2: fix a tiny race when running dirop_fileop_racer
> 
> When running dirop_fileop_racer we found a deadlock case.
> 
> Two nodes, say Node A and Node B, mount the same ocfs2 volume.  Create
> /race/16/1 in the filesystem, and let the inode number of dir 16 be less
> than the inode number of dir race.
> 
> Node A                            Node B
> mv /race/16/1 /race/
>                                   right after Node A has got the
>                                   EX mode of /race/16/, and tries to
>                                   get EX mode of /race
>                                   ls /race/16/
> 
> In this case, Node A has got the EX mode of /race/16/, and wants to get EX
> mode of /race/.  Node B has got the PR mode of /race/, and wants to get
> the PR mode of /race/16/.  Since EX and PR are mutually exclusive, a
> deadlock occurs.

I am confused as to how this race happens.

Something like 'ls /race/16' shouldn't hold locks on 'race' and '16' at the
same time. It should look more like:

<userspace does readdir /race/16>
PR race
<kernel looks up '16' in 'race'>
Unlock PR race
PR 16
<get dirents from '16'>
Unlock PR 16
<return dirents to userspace>
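
To make the difference between the two sequences concrete, here is a minimal
toy model in Python. It is not ocfs2 or DLM code: the names COMPATIBLE,
blockers and has_deadlock are made up for illustration, and the only thing
assumed about the lock modes is what the patch description states (PR is
compatible with PR, EX conflicts with everything). It builds a wait-for graph
from "who holds what" and "who is waiting for what" and looks for a cycle.

```python
# Toy model, NOT ocfs2 code: all names here are hypothetical.
# PR (protected read) is shared; EX (exclusive) conflicts with every mode.
COMPATIBLE = {
    ("PR", "PR"): True,
    ("PR", "EX"): False,
    ("EX", "PR"): False,
    ("EX", "EX"): False,
}


def blockers(held, node, resource, mode):
    """Return the other nodes whose held lock on `resource` conflicts with `mode`."""
    return {
        other
        for (other, res), held_mode in held.items()
        if other != node and res == resource and not COMPATIBLE[(held_mode, mode)]
    }


def has_deadlock(held, waiting):
    """held: {(node, resource): mode}; waiting: {node: (resource, mode)}.

    Returns True if the wait-for graph contains a cycle.
    """
    edges = {n: blockers(held, n, res, mode) for n, (res, mode) in waiting.items()}

    def reaches(start, cur, seen):
        # Depth-first search for a path from `cur` back to `start`.
        for nxt in edges.get(cur, ()):
            if nxt == start:
                return True
            if nxt not in seen:
                seen.add(nxt)
                if reaches(start, nxt, seen):
                    return True
        return False

    return any(reaches(n, n, set()) for n in edges)


# Scenario from the patch description: B still holds PR on 'race'
# while it asks for PR on '16'; A holds EX on '16' and wants EX on 'race'.
patch_case = has_deadlock(
    held={("A", "16"): "EX", ("B", "race"): "PR"},
    waiting={"A": ("race", "EX"), "B": ("16", "PR")},
)

# Sequence sketched above: B dropped PR on 'race' before asking for
# PR on '16', so A was able to take EX on 'race' and is not waiting.
expected_case = has_deadlock(
    held={("A", "16"): "EX", ("A", "race"): "EX"},
    waiting={"B": ("16", "PR")},
)

print(patch_case, expected_case)
```

In this model a cycle only appears if Node B is still holding PR on 'race'
at the moment it requests PR on '16'. With the unlock-before-lock ordering,
B simply waits until A finishes the rename.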

Can you please explain where I may be going wrong? Also an strace of the
locked up 'ls' as well as the output of sysrq-t when it's deadlocked would
help show what's going on.
	--Mark

--
Mark Fasheh


