[Ocfs2-devel] [RFC] The reflink(2) system call v4.
Sage Weil
sage at newdream.net
Tue May 12 21:30:22 PDT 2009
On Tue, 12 May 2009, Joel Becker wrote:
> I'm not against two syscalls, but I'm not writing copyfile()
> here, just reflink(). Someone clearly could write copyfile() later and
> link into some of the same underlying mechanisms.
Ok, good.
> It's important to distinguish the semantics, though, and that's
> why I'm doing one thing. For example, reflink() is a snapshot (a
> "reference-counted link") and has behaviors based on that. libc should
> never fake it, because the callers expect those behaviors. Whereas
> copyfile() would be fakeable in libc with a read/write cycle on
> filesystems that don't support it. Things like that.
> Heck, I think you could use reflink() to create a copyfile() in
> libc that uses no additional syscall. But you couldn't use copyfile()
> to create reflink().
Right, except that you _could_ implement the degraded (no CAP_CHOWN)
reflink() behavior with a hypothetical copyfile().
I just think you should be sure that reflink() has _exactly_ the snapshot
semantics that make sense, without compromises that try to capture some or
all of copyfile() as well. Assuming that a copyfile() type syscall also
existed, would you really want reflink() to silently degrade to something
that can be implemented via copyfile() when you lack CAP_CHOWN?
With the proposed reflink(), we might end up with a final API that looks
something like:
    cowfile(src, dst, flags) - cow data and/or xattrs from src to dst
    reflink(src, dst)        - snapshot src to dst, or if !CAP_CHOWN, cowfile() instead
A simpler reflink() would make that degradation non-mandatory, and
trivially implemented in userspace by those who want it.
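To make "trivially implemented in userspace" concrete, here is a rough sketch in C of what such a wrapper might look like. Everything in it is hypothetical: reflink_strict() stands in for a reflink() that either takes a full snapshot or fails, cowfile() and its COWFILE_* flags are the made-up call from the list above, and the EPERM error for a caller without CAP_CHOWN is an assumption, not something the proposal specifies. The two calls are stubbed so the policy logic can be compiled and exercised on its own.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical test knob: nonzero simulates a caller without CAP_CHOWN. */
static int fake_lacks_cap_chown = 0;

/* Records which stub ran last; for illustration only. */
static const char *last_call = NULL;

/* Hypothetical flags for the imagined cowfile() call. */
#define COWFILE_DATA  0x1
#define COWFILE_XATTR 0x2

/*
 * Stub for a strict reflink(): a full snapshot or an error, never a
 * silent degrade. Failing with EPERM is an assumption of this sketch.
 */
static int reflink_strict(const char *src, const char *dst)
{
    (void)src;
    (void)dst;
    if (fake_lacks_cap_chown) {
        errno = EPERM;
        return -1;
    }
    last_call = "reflink";
    return 0;
}

/* Stub for the hypothetical cowfile(src, dst, flags). */
static int cowfile(const char *src, const char *dst, int flags)
{
    (void)src;
    (void)dst;
    (void)flags;
    last_call = "cowfile";
    return 0;
}

/*
 * Userspace policy: try a real snapshot first, and fall back to
 * cowfile() only when the caller explicitly allows degradation.
 */
static int reflink_or_cow(const char *src, const char *dst, int allow_degrade)
{
    if (reflink_strict(src, dst) == 0)
        return 0;
    if (errno == EPERM && allow_degrade)
        return cowfile(src, dst, COWFILE_DATA | COWFILE_XATTR);
    return -1; /* errno from reflink_strict() is left for the caller */
}
```

A caller that needs real snapshot semantics passes allow_degrade = 0 and sees the failure; one that just wants a cheap copy passes 1. The choice stays with the application instead of being baked into the syscall.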
sage