[Ocfs2-devel] [PATCH] ocfs2: improve fsync efficiency and fix deadlock between aio_write and sync_file
Darrick J. Wong
darrick.wong at oracle.com
Wed Feb 12 17:53:43 PST 2014
On Wed, Feb 12, 2014 at 02:58:18PM -0800, Mark Fasheh wrote:
> On Wed, Jan 29, 2014 at 07:48:48PM -0800, Darrick J. Wong wrote:
> > Currently, ocfs2_sync_file grabs i_mutex and forces the current
> > journal transaction to complete. This isn't terribly efficient, since
> > sync_file really only needs to wait for the last transaction involving
> > that inode to complete, and this doesn't require i_mutex.
> >
> > Therefore, implement the necessary bits to track the newest tid
> > associated with an inode, and teach sync_file to wait for that instead
> > of waiting for everything in the journal to commit. Furthermore, only
> > issue the flush request to the drive if jbd2 hasn't already done so.
> >
> > This also eliminates the deadlock between ocfs2_file_aio_write() and
> > ocfs2_sync_file(). aio_write takes i_mutex then calls
> > ocfs2_aiodio_wait() to wait for unaligned dio writes to finish.
> > However, if that dio completion involves calling fsync, then we can
> > get into trouble when some ocfs2_sync_file tries to take i_mutex.
> >
> > Signed-off-by: Darrick J. Wong <darrick.wong at oracle.com>
>
> Ok, I can see what the patch is doing, but I have a silly question - where
> exactly are we marking the latest transaction id during a write system call?
The patch should update those tids whenever anything touches the inode:
block allocations, truncates, attribute/timestamp updates, etc. When a write
merely rewrites a disk block without touching the inode, ocfs2_sync_file
either waits for the last transaction involving the inode to commit and
flush, or, if that transaction was committed long ago, issues the flush
directly without needing a recent tid.
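To make the two cases concrete, here is a rough user-space model of the idea.
It is only a sketch and not the code from the patch: struct toy_journal,
struct toy_inode and every toy_* function are hypothetical names invented for
illustration, and the "commit" and "flush" are just counters. The only part
borrowed from jbd2 is the idea behind tid_gt(), i.e. comparing transaction ids
in a way that survives wraparound.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int tid_t;

/* Wraparound-safe "x is newer than y" (same idea as jbd2's tid_gt()). */
static bool tid_newer(tid_t x, tid_t y)
{
	return (int)(x - y) > 0;
}

/* Hypothetical stand-in for the journal state. */
struct toy_journal {
	tid_t running_tid;	/* transaction currently accepting updates */
	tid_t committed_tid;	/* newest fully committed transaction */
	int flushes;		/* cache flushes sent to the "drive" */
};

/* Hypothetical stand-in for the per-inode tracking. */
struct toy_inode {
	tid_t sync_tid;		/* newest tid that touched this inode */
};

/* Called on allocation, truncate, timestamp update, etc. */
static void toy_touch_inode(const struct toy_journal *j, struct toy_inode *ino)
{
	ino->sync_tid = j->running_tid;
}

/* Commit everything up to and including tid; the commit flushes once. */
static void toy_commit_through(struct toy_journal *j, tid_t tid)
{
	if (!tid_newer(tid, j->committed_tid))
		return;		/* already on disk, nothing to do */
	j->committed_tid = tid;
	j->flushes++;
	if (!tid_newer(j->running_tid, tid))
		j->running_tid = tid + 1;	/* open a new transaction */
}

/*
 * Model of the fsync decision. Returns true if a commit had to be
 * forced, false if a direct flush was enough.
 */
static bool toy_fsync(struct toy_journal *j, const struct toy_inode *ino)
{
	if (tid_newer(ino->sync_tid, j->committed_tid)) {
		/* Inode's last transaction is still pending: commit it. */
		toy_commit_through(j, ino->sync_tid);
		assert(!tid_newer(ino->sync_tid, j->committed_tid));
		return true;
	}
	/* Data-only rewrite: the tid is old, so just flush the drive. */
	j->flushes++;
	return false;
}
```

In this model, an fsync right after toy_touch_inode() forces a commit of that
one transaction, and the commit's own flush is the only one issued. A second
fsync with no intervening inode change finds the tid already committed and
only issues the flush.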
Does that help? I /think/ I covered all the cases where I need to update the
tid.
--D
> --Mark
>
> --
> Mark Fasheh