[Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas?
Junxiao Bi
junxiao.bi at oracle.com
Tue Oct 18 23:57:35 PDT 2016
Hi Eric,
> 在 2016年10月19日,下午1:19,Eric Ren <zren at suse.com> 写道:
>
> Hi all!
>
> Commit 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()")
> results in another deadlock as we have discussed in the recent thread:
> https://oss.oracle.com/pipermail/ocfs2-devel/2016-October/012454.html
>
> Before this one, a similiar deadlock has been fixed by Junxiao:
> commit c25a1e0671fb ("ocfs2: fix posix_acl_create deadlock")
> commit 5ee0fbd50fdf ("ocfs2: revert using ocfs2_acl_chmod to avoid inode cluster lock hang")
>
> We are in the situation that we have to avoid recursive cluster locking, but
> there is no way to check if a cluster lock has been taken by a precess already.
>
> Mostly, we can avoid recursive locking by writing code carefully. However, as
> the deadlock issues have proved out, it's very hard to handle the routines
> that are called directly by vfs. For instance:
>
> const struct inode_operations ocfs2_file_iops = {
> .permission = ocfs2_permission,
> .get_acl = ocfs2_iop_get_acl,
> .set_acl = ocfs2_iop_set_acl,
> };
>
>
> ocfs2_permission() and ocfs2_iop_get/set_acl() both call ocfs2_inode_lock().
> The problem is that the call chain of ocfs2_permission() includes *_acl().
>
> Possibly, there are three solutions I can think of. The first one is to
> implement the inode permission routine for ocfs2 itself, replacing the
> existing generic_permission(); this will bring lots of changes and
> involve too many trivial vfs functions into ocfs2 code. Frown on this.
>
> The second one is, what I am trying now, to keep track of the processes who
> lock/unlock a cluster lock by the following draft patches. But, I quickly
> find out that a cluster locking which has been taken by processA can be unlocked
> by processB. For example, systemfiles like journal:0000 is locked during mout, and
> unlocked during umount.
I had ever implemented generic recursive locking support, please check the patch at https://oss.oracle.com/pipermail/ocfs2-devel/2015-December/011408.html <https://oss.oracle.com/pipermail/ocfs2-devel/2015-December/011408.html> , the issue that locking and unlocking in different processes was considered. But it was rejected by Mark as recursive locking is not allowed in ocfs2/kernel .
>
> The thrid one is to revert that problematic commit! It looks like get/set_acl()
> are always been called by other vfs callback like ocfs2_permission(). I think
> we can do this if it's true, right? Anyway, I'll try to work out if it's true;-)
Not sure whether get/set_acl() will be called directly by vfs. Even not now, we can’t make sure that in the future. So revert it may be a little risky. But if refactor is complicated, then this maybe the only way we can do.
Thanks,
Junxiao.
>
> Hope for your input to solve this problem;-)
>
> Thanks,
> Eric
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://oss.oracle.com/pipermail/ocfs2-devel/attachments/20161019/324b727e/attachment.html
More information about the Ocfs2-devel
mailing list