[Ocfs2-devel] [PATCH] ocfs2: improve fsync efficiency and fix deadlock between aio_write and sync_file

Darrick J. Wong darrick.wong at oracle.com
Wed Feb 12 17:53:43 PST 2014


On Wed, Feb 12, 2014 at 02:58:18PM -0800, Mark Fasheh wrote:
> On Wed, Jan 29, 2014 at 07:48:48PM -0800, Darrick J. Wong wrote:
> > Currently, ocfs2_sync_file grabs i_mutex and forces the current
> > journal transaction to complete.  This isn't terribly efficient, since
> > sync_file really only needs to wait for the last transaction involving
> > that inode to complete, and this doesn't require i_mutex.
> > 
> > Therefore, implement the necessary bits to track the newest tid
> > associated with an inode, and teach sync_file to wait for that instead
> > of waiting for everything in the journal to commit.  Furthermore, only
> > issue the flush request to the drive if jbd2 hasn't already done so.
> > 
> > This also eliminates the deadlock between ocfs2_file_aio_write() and
> > ocfs2_sync_file().  aio_write takes i_mutex then calls
> > ocfs2_aiodio_wait() to wait for unaligned dio writes to finish.
> > However, if that dio completion involves calling fsync, then we can
> > get into trouble when some ocfs2_sync_file tries to take i_mutex.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong at oracle.com>
> 
> Ok, I can see what the patch is doing, but I have a silly question - where
> exactly are we marking the latest transaction id during a write system call?

The patch should update those tids every time anything touches the inode --
block allocations, truncates, attribute/timestamp updates, etc.  A write that
only rewrites an already-allocated disk block never touches the inode, so it
doesn't update the tid.  In that case ocfs2_sync_file will either wait for the
last transaction involving the inode to commit + flush, or, if that
transaction has long since been committed, issue the flush directly without
needing a recent tid.
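
To make the two fsync paths concrete, here is a rough userspace sketch.
Everything prefixed toy_ is invented for illustration; none of it is code from
the patch or from jbd2.  The model also assumes that a commit always flushes
the drive cache, which the real code checks rather than assumes.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int toy_tid_t;

/* Hypothetical stand-in for the journal state that fsync cares about. */
struct toy_journal {
	toy_tid_t running_tid;    /* transaction currently accepting updates */
	toy_tid_t committed_tid;  /* newest transaction that has committed */
	int commits_forced;       /* commits that fsync had to wait for */
	int flushes_issued;       /* explicit cache flushes sent by fsync */
};

/* Hypothetical stand-in for the per-inode tid tracking. */
struct toy_inode {
	toy_tid_t sync_tid;       /* newest tid that touched this inode */
};

/* Wraparound-tolerant "a is at or after b". */
static bool toy_tid_geq(toy_tid_t a, toy_tid_t b)
{
	return (int)(a - b) >= 0;
}

/* Called wherever the inode joins the running transaction
 * (allocation, truncate, timestamp update, ...). */
static void toy_touch_inode(struct toy_journal *j, struct toy_inode *ino)
{
	ino->sync_tid = j->running_tid;
}

/* Commit the running transaction; assumed to flush the drive as well. */
static void toy_commit_through(struct toy_journal *j, toy_tid_t tid)
{
	assert(tid == j->running_tid);
	j->committed_tid = tid;
	j->running_tid = tid + 1;
}

static void toy_fsync(struct toy_journal *j, const struct toy_inode *ino)
{
	if (toy_tid_geq(j->committed_tid, ino->sync_tid)) {
		/* The inode's transaction committed a while ago, so no
		 * commit will flush for us: issue the flush directly. */
		j->flushes_issued++;
		return;
	}
	/* Otherwise wait for just that transaction, not the whole journal. */
	toy_commit_through(j, ino->sync_tid);
	j->commits_forced++;
}
```

In this model, touching the inode and then calling toy_fsync() forces one
commit; calling toy_fsync() again without touching the inode (the plain block
rewrite case) only issues a flush.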

Does that help?  I /think/ I covered all the cases where I need to update the
tid.

--D
> 	--Mark
> 
> --
> Mark Fasheh


