[Ocfs2-devel] [GIT PULL] ocfs2 changes for 2.6.32

Linus Torvalds torvalds at linux-foundation.org
Tue Sep 15 09:30:54 PDT 2009



On Mon, 14 Sep 2009, Joel Becker wrote:
> > 
> > If you're talking about falling back to manually just copying the data, 
> > then nobody is interested in that. User space can do that better with a 
> > simple read-write loop or with splice, or whatever. There's no reason 
> > whatsoever to do that.
> 
> 	I'm talking about any facility for copying that isn't just a
> userspace loop.  Like your discussion of network filesystems.

HOW?

We need to have a per-filesystem interface to that. 

Having a '->copyfile()' function would be great.

But don't you see how _idiotic_ it is to then also have a '->reflink()' 
function that does _conceptually_ the exact same thing, except it does it 
by incrementing a usage count instead?

Do you see why I'm so unhappy to add a ->reflink() function? 

> 	Hence I brought this to the filesystem summit and then fsdevel
> rather than just implementing it in ocfs2.  I know NFS folks were in the
> room in April, and they said the call definition was workable.  Can't
> remember if CIFS folks were there, but I think so.

It's not workable if you define the 'reflink()' function to not use any 
disk space on the filesystem. Because SMB _will_ do a copy (and I presume 
the NFS thing will too). So it would not in general be what you call 
reflink, it will not be a "snapshot".

So if you _define_ the semantics of "reflink" to be that it's atomic and 
doesn't use any new disk space (apart from the new inode/directory entry, 
of course), then it will be almost totally useless to other filesystems.

In fact, it's entirely possible to have filesystems that can avoid copying 
the _data_ blocks, but would need to copy the indirect blocks - maybe the 
data blocks are ref-counted, but the metadata needs to be per-file (I can 
see many reasons to do it that way, even if it's organized as a tree - 
it's how we do page table COW, for example, and it makes some things much 
simpler).
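
To make that hypothetical layout concrete, here is a minimal user-space 
sketch, not taken from any real filesystem. All names (data_block, 
indirect_block, clone_indirect, NPTRS) are invented for illustration. The 
clone allocates a fresh indirect block, so it does use new space for 
metadata, while the data blocks are shared by bumping their reference 
counts.

```c
#include <assert.h>
#include <stdlib.h>

#define NPTRS 4   /* hypothetical: pointers per indirect block */

/* Hypothetical ref-counted data block; shared between files. */
struct data_block {
    int refs;
    char payload[16];
};

/* Hypothetical per-file indirect block; never shared. */
struct indirect_block {
    struct data_block *ptr[NPTRS];
};

/*
 * Clone a file's block map: the metadata (indirect block) is copied,
 * the data blocks are only referenced again. Returns NULL if the
 * allocation fails.
 */
struct indirect_block *clone_indirect(const struct indirect_block *src)
{
    assert(src != NULL);

    struct indirect_block *dst = malloc(sizeof(*dst));
    if (dst == NULL)
        return NULL;

    for (int i = 0; i < NPTRS; i++) {
        dst->ptr[i] = src->ptr[i];
        if (dst->ptr[i] != NULL)
            dst->ptr[i]->refs++;   /* share the data, don't copy it */
    }
    return dst;
}
```

Such a clone is cheap, but it is neither "zero new disk space" nor a plain 
byte copy.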

Would that be a 'reflink()' or not? I have no way of knowing, because you 
have decided on reflink on a purely ocfs2-specific implementation basis. 
But I do know that such a filesystem would be perfectly happy to have a 
'copyfile' function.

This is why I want the VFS pointers to be about _semantics_, not about 
some random implementation detail.
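
As a rough illustration of what "about semantics" could look like, here is 
a user-space sketch. It is not kernel code, and every name in it (fs_ops, 
blob, file, cow_copyfile, plain_copyfile, do_copyfile) is made up for the 
example. The single 'copyfile' pointer only promises that the destination 
ends up with the same contents as the source; one filesystem satisfies 
that by incrementing a usage count, the other by copying the bytes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical file contents, possibly shared between files. */
struct blob {
    int refs;
    size_t len;
    char *data;
};

struct file {
    struct blob *blob;
};

/* Hypothetical per-filesystem operations table. */
struct fs_ops {
    /* Contract: on success (0), dst has the same contents as src. */
    int (*copyfile)(const struct file *src, struct file *dst);
};

/* A COW-capable filesystem: share the data and bump the count. */
int cow_copyfile(const struct file *src, struct file *dst)
{
    assert(src != NULL && src->blob != NULL && dst != NULL);
    src->blob->refs++;
    dst->blob = src->blob;
    return 0;
}

/* A filesystem that can only copy (think of a server-side copy). */
int plain_copyfile(const struct file *src, struct file *dst)
{
    assert(src != NULL && src->blob != NULL && dst != NULL);

    struct blob *b = malloc(sizeof(*b));
    if (b == NULL)
        return -1;
    b->len = src->blob->len;
    b->data = malloc(b->len ? b->len : 1);
    if (b->data == NULL) {
        free(b);
        return -1;
    }
    memcpy(b->data, src->blob->data, b->len);
    b->refs = 1;
    dst->blob = b;
    return 0;
}

const struct fs_ops cow_fs = { .copyfile = cow_copyfile };
const struct fs_ops plain_fs = { .copyfile = plain_copyfile };

/* Generic caller: knows the semantics, not the implementation. */
int do_copyfile(const struct fs_ops *ops, const struct file *src,
                struct file *dst)
{
    if (ops == NULL || ops->copyfile == NULL)
        return -1;   /* filesystem offers no such operation */
    return ops->copyfile(src, dst);
}
```

The caller never needs to know which of the two happened, which is the 
whole point of having one entry point instead of two.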

			Linus


