[Ocfs2-devel] [PATCH 19/41] ocfs2: Integrate CoW in file write.
Tao Ma
tao.ma at oracle.com
Thu Aug 20 19:12:27 PDT 2009
Hi Joel,
Joel Becker wrote:
> On Tue, Aug 18, 2009 at 02:19:20PM +0800, Tao Ma wrote:
>> @@ -1674,6 +1692,7 @@ int ocfs2_write_begin_nolock(struct address_space *mapping,
>> struct ocfs2_alloc_context *meta_ac = NULL;
>> handle_t *handle;
>> struct ocfs2_extent_tree et;
>> + unsigned int refcounted_cpos, cow_len;
>>
>> ret = ocfs2_alloc_write_ctxt(&wc, osb, pos, len, di_bh);
>> if (ret) {
>> @@ -1701,12 +1720,36 @@ int ocfs2_write_begin_nolock(struct address_space *mapping,
>> }
>>
>> ret = ocfs2_populate_write_desc(inode, wc, &clusters_to_alloc,
>> - &extents_to_split);
>> - if (ret) {
>> + &extents_to_split, &refcounted_cpos);
>> + if (ret && ret != -ETXTBSY) {
>> mlog_errno(ret);
>> goto out;
>> }
>>
>> + if (ret == -ETXTBSY) {
>> + BUG_ON(refcounted_cpos == UINT_MAX);
>> + cow_len = wc->w_clen - (refcounted_cpos - wc->w_cpos);
>> +
>> + ret = ocfs2_refcount_cow(inode, di_bh,
>> + refcounted_cpos, cow_len);
>> + if (ret) {
>> + mlog_errno(ret);
>> + goto out;
>> + }
>> +
>> + /* reinitialize write_desc and populate it again. */
>> + ocfs2_clear_write_desc(wc);
>> + ret = ocfs2_populate_write_desc(inode, wc, &clusters_to_alloc,
>> + &extents_to_split,
>> + &refcounted_cpos);
>> + if (ret) {
>> + mlog_errno(ret);
>> + goto out;
>> + }
>> +
>> + BUG_ON(refcounted_cpos != UINT_MAX);
>> + }
>
> First see my comment on the ocfs2_refcount_cal_clusters() and a
> possible short CoW.
> Basically, ocfs2_refcount_cow() can do a short CoW in two
> situations. First, the one I described about
> ocfs2_refcount_cal_clusters(). The second is if the write wants to
> cover three extents, the middle of which is not refcounted:
>
> |--refcounted--|--not refcounted--]--refcounted--|
>
> ocfs2_refcount_cal_cow_clusters() will be called on the first extent.
> It will set up *cow_len to cover that first extent, but then it will hit
> the second extent and break out. ocfs2_refcount_cow() will then replace
> the first extent. We end up with:
>
> |--not refcounted--|--not refcounted--]--refcounted--|
>
>
> Now we call to ocfs2_prepare_write_desc() again. It will see
> the first and second extents as not refcounted, but then it will hit
> the third extent and return -ETXTBSY. refcounted_cpos will be set to
> the third extent. Then in ocfs2_write_begin_nolock(), we'll hit:
>
> BUG_ON(refcounted_cpos != UINT_MAX);
>
> Am I missing something?
you are right.
ocfs2_write_begin normally only write one page and with my x86_64 box,
we have PAGE_SIZE <= CLUSTER_SIZE, so we never hit this issue before.
So maybe I have to add a new function which will iterate from pos to
pos+len to make sure all the refcounted extents are CoWed. Thanks.
Regards,
Tao
More information about the Ocfs2-devel
mailing list