[Ocfs2-devel] Bug in inode deletion code leading to stale inodes
Jan Kara
jack at suse.cz
Mon Jan 12 14:06:35 PST 2009
Hello,
I've hit a bug in the OCFS2 delete code which results in inodes being left on
disk without any links to them. The workload triggering this creates
directories on one node and deletes them on another node in the cluster.
The inode is not deleted because both nodes bail out from
ocfs2_delete_inode() with:
  Skipping delete of 100405 because it is in use on other nodes
The scenario which I think is happening is as follows:
node1                                   node2
                                        rmdir("d");
                                          ocfs2_remote_dentry_delete()
ocfs2_dentry_convert_worker()
                                        finishes ocfs2_unlink()
                                        eventually enters ocfs2_delete_inode()
                                          ocfs2_inode_lock()
                                          ocfs2_query_inode_wipe() -> fail
                                          ocfs2_inode_unlock()
ocfs2_dentry_post_unlock()
  ocfs2_drop_dentry_lock()
    iput()
      ocfs2_delete_inode()
        ocfs2_inode_lock()
        ocfs2_query_inode_wipe() -> fail
        ocfs2_inode_unlock()
      clear_inode()
                                        clear_inode()
The question is how to avoid this. It seems to me that we really have to
do open_lock() and not just a trylock to avoid the race. Is there any reason
why we cannot move the open_lock() before inode_lock() in
ocfs2_delete_inode()?
Honza
--
Jan Kara <jack at suse.cz>
SUSE Labs, CR