[Ocfs2-devel] [PATCH] Treat writes as new when holes span across page boundaries

Wed Feb 23 13:35:36 PST 2011

Hi Joel,

On Wed, Feb 23, 2011 at 3:17 PM, Joel Becker <jlbec at evilplan.org> wrote:
> On Wed, Feb 23, 2011 at 02:54:19PM -0600, Goldwyn Rodrigues wrote:
>> On Wed, Feb 23, 2011 at 1:13 PM, Joel Becker <jlbec at evilplan.org> wrote:
>> >        I was about to think we were done, but now I'm not so sure.  I
>> > have a question: with my change to ocfs2_should_read_block(), but
>> > without your patch, what do you see for corruption?  Your pattern
>> > written in the first pass, or complete random garbage?
>>
>> Joel, you are getting confused in the list of parameters to reproduce
>> the bug. I have listed it in the past, but not in a single location so
>> it is difficult to pick up the bits. I will put it together for you.
>
> Goldwyn,
>        I'm just not remembering off the top of my head, so I'm asking.
> I'm actually describing the myriad cases we can encounter in this code,
> because any change we make has to be safe for all possible cases.
>
>> 1. blocksize < clustersize
>
>        Ok.
>
>> 2.  the previous write ended in a page boundary, but not a cluster
>> boundary. eg, clustersize=16K but pos is at the page boundary = 4k
>
>        Must it be a page boundary?  What about a block boundary? ;-)

I am not sure of this, but I would like to stick with page boundary. I
am too poor to afford a machine with a page size bigger than 4k to
test this ;)

>
>> 3. you wrote something at the page boundary
>
>        I think it's more block than page, really.  But that is the same
> for 4K blocks and pages.
>
>> 4. there is a hole created in the page from 4k-8k in *between* say 4128-4640
>
>        There is no hole here.  We have allocation.  Even if the user
> doesn't directly write to this location, it is not a hole, because we
> have a cluster there.  You just mean that the user didn't write to this
> location (sparse writing).  That's just a terminology thing, but it's
> important for other folks understanding you.

Agreed. hole in user-space? ;)

>
>> What I am seeing is pure junk(not 0xbaadfeed), and not what I had
>> written earlier.
>
>        Ok.  That's what I expected.  I wanted to clarify we were not
> reading stale data, but were instead failing to zero memory.
>
>> In retrospect, since we have covered the rest of the bases, we can set
>> new on the condition:
>>
>> i_size == page_offset == pos
>>
>> I interpreted new as new page or new data vs new cluster, and perhaps
>> that is the reason I placed the condition there.
>
>        I don't like this.  We still haven't figured out where we're
> failing.

What I meant with rest of the bases is the zeroing of pages when pos != i_size.

>        The core piece of this, why I'm going back and forth with you
> discussing the issue, is that we never want to paper over a symptom.
> Sure, we know that i_size==page_offset==pos triggers this, but is it the
> only case?  If we figure out *why* we're here, when the code is supposed
> to not have this problem, then we can be sure we've closed out all
> exposure.

I agree absolutely, and this is the reason I am in the discussion. All
I am saying from the start is that the (2nd-) page is never zeroed. We
just have to figure out who and where it has to be zeroed. I am not
keen on pushing my solution, but the correct one.

>        To wit: If we really only have a problem when i_size is on a
> page boundary, doesn't that sound like an off-by-one error?  We've had
> those before.  I'm not sure that it is, but shouldn't we be sure about
> it?
>

Yes it is. It is one of those corner cases which gets missed and we
should be sure of it.

-- 
Goldwyn