[Ocfs2-devel] [RFC] The reflink(2) system call v4.

Sage Weil sage at newdream.net
Tue May 12 21:30:22 PDT 2009


On Tue, 12 May 2009, Joel Becker wrote:
> 	I'm not against two syscalls, but I'm not writing copyfile()
> here, just reflink().  Someone clearly could write copyfile() later and
> link into some of the same underlying mechanisms.

Ok, good.

> 	It's important to distinguish the semantics, though, and that's
> why I'm doing one thing.  For example, reflink() is a snapshot (a
> "reference-counted link") and has behaviors based on that.  libc should
> never fake it, because the callers expect those behaviors.  Whereas
> copyfile() would be fakeable in libc with a read/write cycle on
> filesystems that don't support it.  Things like that.
> 	Heck, I think you could use reflink() to create a copyfile() in
> libc that uses no additional syscall.  But you couldn't use copyfile()
> to create reflink().

Right, except that you _could_ implement the degraded (no CAP_CHOWN) 
reflink() behavior with a hypothetical copyfile().

I just think you should be sure that reflink() has _exactly_ the snapshot 
semantics that make sense, without compromises that try to capture some or 
all of copyfile() as well.  Assuming that a copyfile()-type syscall also 
existed, would you really want reflink() to silently degrade to something 
that can be implemented via copyfile() when you lack CAP_CHOWN?

With the proposed reflink(), we might end up with a final API that looks 
something like:

 cowfile(src, dst, flags) - COW data and/or xattrs from src to dst
 reflink(src, dst) - snapshot src to dst, or if !CAP_CHOWN, cowfile() instead

A simpler reflink() would make that degradation non-mandatory, and 
trivially implemented in userspace by those who want it.
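
A rough sketch of what that userspace fallback might look like, under some
loud assumptions: neither syscall exists in this form, so sim_reflink() and
sim_cowfile() below are hypothetical stubs that only simulate the behavior,
and the choice of EPERM as the error a strict reflink() would return without
CAP_CHOWN is my guess, not something the proposal specifies.

```c
#include <errno.h>

/*
 * Hypothetical stand-ins for the two calls discussed above.  Neither is a
 * real kernel interface; the stubs exist only so the wrapper compiles and
 * can be exercised.
 */
static int caller_has_cap_chown;	/* simulated capability state */

/* Strict snapshot: fails (EPERM assumed) instead of silently degrading. */
static int sim_reflink(const char *src, const char *dst)
{
	(void)src;
	(void)dst;
	if (!caller_has_cap_chown) {
		errno = EPERM;
		return -1;
	}
	return 0;
}

/* COW data and/or xattrs from src to dst; always succeeds in this stub. */
static int sim_cowfile(const char *src, const char *dst, int flags)
{
	(void)src;
	(void)dst;
	(void)flags;
	return 0;
}

/*
 * Userspace policy: try a real snapshot first, and degrade to a COW copy
 * only when the failure was a permission problem.
 * Returns 1 for a snapshot, 0 for a degraded copy, -1 on any other error.
 */
static int reflink_or_cow(const char *src, const char *dst)
{
	if (sim_reflink(src, dst) == 0)
		return 1;
	if (errno != EPERM)
		return -1;
	return sim_cowfile(src, dst, 0) == 0 ? 0 : -1;
}
```

The point is that the fallback becomes a policy choice made by the caller: a
tool that needs true snapshot semantics calls reflink() directly and sees the
failure, while a cp-like tool can use a wrapper along these lines.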

sage


