[Ocfs2-devel] [PATCH 1/3] fs: Document the reflink(2) system call.
Jamie Lokier
jamie at shareable.org
Tue May 5 15:30:16 PDT 2009
Joel Becker wrote:
> I think I'm confusing you. ocfs2 creates a new inode, with a
> new tree of extent blocks, pointing to the same data extents as the
> source. You can do *anything* POSIX to that new inode. You can chown
> it, chmod it, truncate it, futimes it, whatever. The only thing at
> issue is what the state of the inode is at the return of the reflink()
> call.
Ok, but does chown/chmod/futimes trigger a COW copy, unsharing the data?
This is still not clear. :-)
Behaviourally, whether a massive copy is triggered by chmod is quite a
significant thing. It dictates whether programs and scripts should be
careful to avoid chmod on reflinked files because it may be very
expensive (think chmod triggering a 200GB copy), or can do so cheaply.
> I'm not defining reflink() as "creates a new inode" because I
> can see something like btrfs using the same storage inode with a new
> inode number until it needs to CoW. But from the user-visible
> perspective, that's exactly what happens.
I'm still not clear from the above explanation whether full data
unsharing (i.e. it's all copied, takes a long time, can trigger
ENOSPC) happens on chown/chmod etc.
But assuming it stays shared until you modify the actual data, could
the documentation make this important fact a bit more prominent:

    reflink() creates a new file which initially shares the same
    underlying data storage as the source file, and has all the same
    attributes including security context and extended attributes.

    After creating the new file, you can do *anything* POSIX to that
    new file. You can chown it, chmod it, futimes it, truncate it,
    write to it, whatever. When the data is modified, that will
    trigger a copy-on-write operation so that the underlying data is
    not completely shared any more.

    The amount and timing of copying is filesystem-dependent, but only
    happens when a data write or extended attribute change takes place.
    Opening a file, reading it, read-only or private mappings, and
    simple attribute updates (chown, chmod, futimes, as well as
    automatic atime updates) will not trigger copy-on-write and will
    not return ENOSPC errors.

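
For anyone who wants to poke at these semantics from userspace, here is a minimal sketch. It rests on assumptions that are not part of this thread: reflink() is only a proposal here, so the sketch uses the Linux FICLONE ioctl (available since 4.5) as a stand-in for sharing data extents, and clone_file() is a hypothetical helper name. FICLONE shares data only, so the attribute copy is done separately, and the sketch falls back to a plain copy where the filesystem cannot share extents.

```python
import fcntl
import shutil

# FICLONE from <linux/fs.h>, i.e. _IOW(0x94, 9, int). Assumption: this
# numeric value is for the common ioctl encoding (x86, arm); a few
# architectures encode ioctl numbers differently.
FICLONE = 0x40049409


def clone_file(src, dst):
    """Make dst share src's data extents if possible, else copy the bytes.

    Hypothetical helper. Returns True if the data was shared via FICLONE,
    False if it fell back to an ordinary copy.
    """
    with open(src, "rb") as s, open(dst, "wb") as d:
        try:
            # The ioctl is issued on the destination, with the source
            # file descriptor passed as the integer argument.
            fcntl.ioctl(d.fileno(), FICLONE, s.fileno())
            shared = True
        except OSError:
            # Typically EOPNOTSUPP, EXDEV or EINVAL: no extent sharing
            # available here, so do a full data copy instead.
            shutil.copyfileobj(s, d)
            shared = False
    # FICLONE does not carry attributes over, unlike the proposed
    # reflink(); copy mode, times and (where supported) xattrs by hand.
    shutil.copystat(src, dst)
    return shared
```

With this, clone_file("big.img", "big-clone.img") followed by os.chmod("big-clone.img", 0o600) exercises exactly the case in question: a simple attribute update on a file whose data is still shared. Whether that chmod stays cheap is up to the filesystem, which is the point the documentation should spell out.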
Thanks,
-- Jamie