[Ocfs2-devel] [PATCH V8 4/8] mm/fs: add hooks to support cleancache

Fri Apr 15 11:53:28 PDT 2011

> From: OGAWA Hirofumi [mailto:hirofumi at mail.parknet.co.jp]
> 
> Andrew Morton <akpm at linux-foundation.org> writes:
> 
> >> > Before I suggested a thing about cleancache_flush_page,
> >> > cleancache_flush_inode.
> >> >
> >> > what's the meaning of flush's semantic?
> >> > I thought it means invalidation.
> >> > AFAIC, how about change flush with invalidate?
> >>
> >> I'm not sure the words "flush" and "invalidate" are defined
> >> precisely or used consistently everywhere in computer
> >> science, but I think that "invalidate" is to destroy
> >> a "pointer" to some data, but not necessarily destroy the
> >> data itself.   And "flush" means to actually remove
> >> the data.  So one would "invalidate a mapping" but one
> >> would "flush a cache".
> >>
> >> Since cleancache_flush_page and cleancache_flush_inode
> >> semantically remove data from cleancache, I think flush
> >> is a better name than invalidate.
> >>
> >> Does that make sense?
> >
> > nope ;)
> >
> > Kernel code freely uses "flush" to refer to both invalidation and to
> > writeback, sometimes in confusing ways.  In this case,
> > cleancache_flush_inode and cleancache_flush_page rather sound like
> > they might write those things to backing store.
> 
> I'd like to mention *_{get,put}_page too. In Linux, get/put does not
> mean read/write. There is {get,put}_page, which is refcount stuff.
> (Yeah, on a quick read I felt those methods did refcounting, but that
> seems to be false. There is no Xen code, so I don't actually know,
> though.)
> 
> And I agree; I also think the thing needed is consistency in the Linux
> code (terms).
> 
> Thanks.
> --
> OGAWA Hirofumi <hirofumi at mail.parknet.co.jp>

Hmmm, yes, that's a point of confusion also.  No, cleancache put/get
do not have any relationship with reference counting.

Andrew, I wonder if you would be so kind as to read the following
and make a "ruling".  If you determine a preferable set of names,
I will abide by your decision and repost (if necessary).

The problem is this: The English language has a limited number
of words that can be used to represent data motion and mapping
and most/all of them are already used in the kernel, often,
to quote Andrew, "in confusing ways."  Complicating this, I
think the semantics of the cleancache operations are different
from the semantics of any other kernel operation... intentionally
so, because the value of cleancache is a direct result of those
differing semantics.  And the cleancache semantics
are fairly complex (again intentionally) so a single function
name can't possibly describe the semantics.

The cleancache operations are:
- put (page)
- get (page)
- flush page
- flush inode
- init fs
- flush fs

I think these names are reasonable representations of the
semantics of the operations performed... but I'm not a kernel
expert so there is certainly room for disagreement.  Though I
absolutely recognize the importance of a "name", I am primarily
interested in merging the semantics of the operations and
would happily accept any name that kernel developers could
agree on.  However, I fear that there will be NO name that
will satisfy all, so would prefer to keep the existing names.
If some renaming is eventually agreed upon, this could be done
post-merge.

Here's a brief description of the semantics:

The cleancache operation currently known as "put" has the
following semantics:  If *possible*, please take the data
contained in the pageframe referred to by this struct page
into cleancache and associate it with the filesystem-determined
"handle" derived from the struct page.

The cleancache operation currently known as "get" has the
following semantics:  Derive the filesystem-determined handle
from this struct page.  If cleancache contains a page matching
that handle, recreate the page of data from cleancache and
place the results in the pageframe referred to by the
struct page.  Then delete in cleancache any record of the
handle and any data associated with it, so that a
subsequent "get" will no longer find a match for the handle;
any space used for the data can also be freed.

(Note that "take the data" and "recreate the page of data" are
similar in semantics to "copy to" and "copy from", but since
the cleancache operation may perform an "inflight" transformation
on the data, and "copy" usually means a byte-for-byte replication,
the word "copy" is also misleading.)
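
As a rough illustration of the "put" and "get" semantics above, here is
a userspace toy model in C.  It is only a sketch, not the code in the
patch: every name in it (toy_handle, toy_put, toy_get, the fixed-size
table) is hypothetical, the "page" is 64 bytes, and a plain memcpy
stands in for whatever transformation a real backend might apply.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 64  /* tiny stand-in for a page of data */
#define TOY_SLOTS     4   /* small capacity so that a put can be refused */

/* Hypothetical handle: pool id (filesystem), filekey (inode), page index. */
struct toy_handle {
    int pool_id;
    unsigned long filekey;
    unsigned long index;
};

struct toy_slot {
    bool used;
    struct toy_handle handle;
    unsigned char data[TOY_PAGE_SIZE];
};

static struct toy_slot toy_cache[TOY_SLOTS];

/* Return the slot holding handle h, or NULL if there is no match. */
static struct toy_slot *toy_find(const struct toy_handle *h)
{
    for (size_t i = 0; i < TOY_SLOTS; i++) {
        struct toy_slot *s = &toy_cache[i];
        if (s->used && s->handle.pool_id == h->pool_id &&
            s->handle.filekey == h->filekey && s->handle.index == h->index)
            return s;
    }
    return NULL;
}

/* "put": take the data if possible.  Returns false if the cache declines. */
bool toy_put(const struct toy_handle *h, const unsigned char *page)
{
    assert(h != NULL && page != NULL);
    struct toy_slot *s = toy_find(h);   /* reuse an existing record, if any */
    for (size_t i = 0; s == NULL && i < TOY_SLOTS; i++)
        if (!toy_cache[i].used)
            s = &toy_cache[i];          /* otherwise take the first free slot */
    if (s == NULL)
        return false;                   /* full: the put is simply refused */
    s->used = true;
    s->handle = *h;
    memcpy(s->data, page, TOY_PAGE_SIZE);
    return true;
}

/* "get": on a match, recreate the page, then delete the record. */
bool toy_get(const struct toy_handle *h, unsigned char *page)
{
    assert(h != NULL && page != NULL);
    struct toy_slot *s = toy_find(h);
    if (s == NULL)
        return false;                   /* miss: the page buffer is untouched */
    memcpy(page, s->data, TOY_PAGE_SIZE);
    s->used = false;                    /* a later get will not match */
    return true;
}
```

The point of the sketch is that a put may fail without that being an
error, and that a successful get removes the record, so a second get
for the same handle is a miss.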

The cleancache operation currently known as "flush page" has the
following semantics:  Derive the filesystem-determined handle
from this struct page and struct mapping.  If cleancache
contains a page matching that handle, delete in cleancache any
record of the handle and any data associated with it, so that a
subsequent "get" will no longer find a match for the handle;
any space used for the data can also be freed.

The cleancache operation currently known as "flush inode" has
the following semantics: Derive the filesystem-determined filekey
from this struct mapping.  If cleancache contains ANY handles
matching that filekey, delete in cleancache any record of
any matching handle and any data associated with those handles;
any space used for the data can also be freed.
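
To make the difference between the two flush operations concrete, here
is a small userspace sketch in C.  All names (toy_record, toy_insert,
toy_flush_page, toy_flush_inode) are hypothetical and the records hold
no page data.  The sketch also assumes that a filekey is matched only
within one pool, which the description above does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_SLOTS 8

/* Hypothetical handle: pool id, filekey (inode), page index. */
struct toy_handle {
    int pool_id;
    unsigned long filekey;
    unsigned long index;
};

struct toy_record {
    bool used;
    struct toy_handle handle;
};

static struct toy_record toy_records[TOY_SLOTS];

static bool toy_same(const struct toy_handle *a, const struct toy_handle *b)
{
    return a->pool_id == b->pool_id && a->filekey == b->filekey &&
           a->index == b->index;
}

/* Helper: note that the toy cache holds data for handle h. */
bool toy_insert(const struct toy_handle *h)
{
    assert(h != NULL);
    for (size_t i = 0; i < TOY_SLOTS; i++) {
        if (!toy_records[i].used) {
            toy_records[i].used = true;
            toy_records[i].handle = *h;
            return true;
        }
    }
    return false;                       /* no room */
}

/* Helper: would a "get" for handle h find a match? */
bool toy_contains(const struct toy_handle *h)
{
    assert(h != NULL);
    for (size_t i = 0; i < TOY_SLOTS; i++)
        if (toy_records[i].used && toy_same(&toy_records[i].handle, h))
            return true;
    return false;
}

/* "flush page": delete the record matching the full handle, if present. */
void toy_flush_page(const struct toy_handle *h)
{
    assert(h != NULL);
    for (size_t i = 0; i < TOY_SLOTS; i++)
        if (toy_records[i].used && toy_same(&toy_records[i].handle, h))
            toy_records[i].used = false;
}

/* "flush inode": delete every record with this filekey, whatever its index.
 * Returns the number of records dropped (for illustration only). */
size_t toy_flush_inode(int pool_id, unsigned long filekey)
{
    size_t dropped = 0;
    for (size_t i = 0; i < TOY_SLOTS; i++) {
        struct toy_record *r = &toy_records[i];
        if (r->used && r->handle.pool_id == pool_id &&
            r->handle.filekey == filekey) {
            r->used = false;
            dropped++;
        }
    }
    return dropped;
}
```

Neither operation writes anything back anywhere; both only make sure a
later "get" can no longer find the affected handles.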

The cleancache operation currently known as "init fs" has
the following semantics: Create a unique poolid to refer
to this filesystem and save it in the superblock's
cleancache_poolid field.

The cleancache operation currently known as "flush fs" has
the following semantics: Get the cleancache_poolid field
from this superblock.  If cleancache contains ANY handles
associated with that poolid, delete in cleancache any
record of any matching handles and any data associated with
those handles; any space used for the data can also be freed.
Also, set the superblock's cleancache_poolid to be invalid
and, in cleancache, recycle the poolid so a subsequent init_fs
operation can reuse it.
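
The poolid lifecycle described for "init fs" and "flush fs" can be
sketched the same way.  This is a userspace toy under stated
assumptions, not the patch itself: toy_super_block stands in for the
superblock, poolids are small integers handed out lowest-first, -1
marks an invalid poolid, and running out of pools leaves the poolid
invalid.  None of those details are specified above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_POOLS    4
#define TOY_SLOTS        8
#define TOY_POOL_INVALID (-1)   /* assumed marker for "no pool" */

/* Stand-in for the superblock; only the poolid field is modelled. */
struct toy_super_block {
    int cleancache_poolid;
};

struct toy_record {
    bool used;
    int pool_id;
    unsigned long filekey;
    unsigned long index;
};

static bool toy_pool_in_use[TOY_MAX_POOLS];
static struct toy_record toy_records[TOY_SLOTS];

/* "init fs": pick an unused poolid and save it in the superblock. */
void toy_init_fs(struct toy_super_block *sb)
{
    assert(sb != NULL);
    sb->cleancache_poolid = TOY_POOL_INVALID;
    for (int id = 0; id < TOY_MAX_POOLS; id++) {
        if (!toy_pool_in_use[id]) {
            toy_pool_in_use[id] = true;
            sb->cleancache_poolid = id;
            return;
        }
    }
}

/* Helper: record a handle belonging to this filesystem's pool. */
bool toy_insert(const struct toy_super_block *sb, unsigned long filekey,
                unsigned long index)
{
    assert(sb != NULL);
    if (sb->cleancache_poolid == TOY_POOL_INVALID)
        return false;
    for (size_t i = 0; i < TOY_SLOTS; i++) {
        if (!toy_records[i].used) {
            toy_records[i].used = true;
            toy_records[i].pool_id = sb->cleancache_poolid;
            toy_records[i].filekey = filekey;
            toy_records[i].index = index;
            return true;
        }
    }
    return false;
}

/* Helper: number of records currently held for a pool. */
size_t toy_count(int pool_id)
{
    size_t n = 0;
    for (size_t i = 0; i < TOY_SLOTS; i++)
        if (toy_records[i].used && toy_records[i].pool_id == pool_id)
            n++;
    return n;
}

/* "flush fs": drop every record in the pool, invalidate the superblock's
 * poolid, and recycle the poolid for a later init fs. */
void toy_flush_fs(struct toy_super_block *sb)
{
    assert(sb != NULL);
    int id = sb->cleancache_poolid;
    if (id < 0 || id >= TOY_MAX_POOLS)
        return;                         /* nothing to flush */
    for (size_t i = 0; i < TOY_SLOTS; i++)
        if (toy_records[i].used && toy_records[i].pool_id == id)
            toy_records[i].used = false;
    toy_pool_in_use[id] = false;
    sb->cleancache_poolid = TOY_POOL_INVALID;
}
```

In this toy, a filesystem initialized after a flush fs gets the
recycled poolid back, and none of the old filesystem's handles are
visible through it.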

That's all!

Thanks,
Dan