[Ocfs2-devel] New update for sync in reflink. [Re: some thoughts about data integrity test for reflink.]
Tao Ma
tao.ma at oracle.com
Mon Aug 10 18:29:45 PDT 2009
Joel Becker wrote:
> On Sat, Aug 08, 2009 at 10:28:34PM +0800, Tao Ma wrote:
>>> I do not. Having to call ocfs2_sync_inode() for every reflink
>>> is a disaster. We need to think about this. Basically, how can the
>>> reflink target notice it has to either sync the source or perhaps copy
>>> the data over?
>>>
>> OK, I agree with you for ocfs2_sync_inode part. I am just a little
>> worried about the above case I mentioned, so if we can
>> resolve it gracefully, I would be very glad about it.
>
> Well, I can't find a good solution. But we're not going to use
> ocfs2_sync_inode(), as that syncs metadata, which is pointless for
> reflink. Also, we need to code up waiting on the sync
> (ocfs2_sync_inode() starts the sync, but does not wait on it). We need
> to start the sync as early as possible.
> As an aside, ocfs2_sync_inode() is a completely useless
> function. filemap_fdatawrite() is called by the vfs fsync code, and we
> don't use the mapping buffers so sync_mapping_buffers() is a noop. A
> future patch should just remove ocfs2_sync_inode().
> So, here's what needs to happen. At the very top of
> __ocfs2_reflink(), right after you have ip_alloc_sem and have locked out
> concurrent writers, you need to call filemap_fdatawrite(). This will
> start the writeback in the background while you go on to building the
> refcount trees. At the very bottom of __ocfs2_reflink(), before you
> drop the locks on the new_inode, you then filemap_fdatawait().
We may call filemap_fdatawait after we drop the locks of new_inode since
it is just about the old inode's data writeback(And actually at the very
top of __ocfs2_reflink where we call filemap_fdatawrite, we don't have
i_mutex of new inode locked, so these 2 functions are both outside of
new_inode's i_mutex lock)? And what' more, now the new inode is still in
orphan dir, so other nodes or process should not be able to touch its
data at that time.
> I think that this means we can drop ocfs2_cow_sync_writeback()
> because the data must have been up to date before the reflink could have
> been CoWd. We can also drop the filemap_fdatawrite_range() in
> ocfs2_refcount_cow(). In ocfs2_duplicate_clusters() PageDirty() is now
> a BUG_ON(). Right?
ocfs2_cow_sync_writeback is used to sync data to the new clusters after
CoW in writeback mode, so it can't be removed.
As for filemap_fdatawrite_range and PageDirty, you are right, we can
remove them and add BUG_ON if we meet with PageDirty.
Regards,
Tao
More information about the Ocfs2-devel
mailing list