[Btrfs-users] Thoughts about filesystem undo
Phil Endecott
spam_from_btrfs at chezphil.org
Wed Sep 26 14:55:00 PDT 2007
Hi Chris, thanks for replying.
Chris Mason wrote:
> On Tue, 2007-09-25 at 23:45 +0100, Phil Endecott wrote:
>> Dear Btrfs people,
>>
>> I saw Chris' Btrfs talk at LinuxConf.EU a few weeks ago and have since
>> been thinking about how I would like to use this great code once you
>> have done all the hard work :-)
>>
>> Fine-grain filesystem undo, thanks to cheap snapshots, is what I'm
>> thinking about. The more I consider it the more useful I believe it
>> will be.
[snip]
>> So I was wondering if you have thought about how this could be made to
>> work, from the user's (or application developer's) viewpoint rather
>> than in terms of the filesystem implementation. Certainly, more than
>> just "snapshot create" and "snapshot delete" commands are needed.
>>
>> One idea is to automatically take a snapshot when each process
>> starts, and to keep it until its parent process terminates. This means
>> that from the command line I can roll back to a point between any two commands in
>> that shell's history. Perhaps applications that suffer an error could
>> choose to revert all their changes on termination.
>
> There are a lot of different factors in play here. First, once a new
> snapshot is created, additional COW runs are required for any tree
> metadata related to the snapshot.
Sure. Like many things, the user will want to weigh the benefits
against the costs. But I think that the costs are now tractable, so
it's worth considering what the benefits could be.
> Picture a directory where process A and process B are both writing.
Hmmm, the complex case. I'm not even sure what should happen in the
easy case yet. But anyway -
> Process A decides it is time to rollback some changes, but what do we do
> with process B?
Was it actually accessing the same files, or just files in the same
directory? Were both processes writing, or just one of them? In a lot
of cases, if two processes are writing to the same file, the user has
made a mistake and something bad is going to happen; so any behaviour
would be better than the current situation.
But let's back off to the simpler case without conflicts. A process
can have at least four possible kinds of isolation from other
processes. It (and its child processes) always sees data that it (or
one of its child processes) has written, but it may or may not see data
that other processes have written since it started. And its writes may
be visible to other processes as they occur, or they may become visible
atomically when it terminates. At least two combinations are certainly
useful:
- Most of the shell scripts that I write implicitly assume that the
files that they read don't change under their feet, and that nothing
tries to read their output until it is entirely written. They also
assume that they won't be interrupted.
- An application like a word processor is long-running, and the user
expects that it will see files written by other applications, and that
saves will appear in the filesystem.
Since we get the second behaviour by default, I imagine a wrapper
program - let's call it 'atomic' - that implements the first behaviour:
atomic(prog,args) {
    cp / snapshot
    chroot snapshot {
        status = exec(prog,args)
    }
    if (status!=ok) {
        rm snapshot
        exit(status)
    }
    / = merge(/,snapshot);
}
Maybe something like a setuid bit could indicate that a particular
executable wants this behaviour. Or maybe it would be best added to a
shell (a bit like 'set -e'). Or something.
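For what it's worth, the wrapper can be roughly approximated in userspace today, without cheap snapshots. The following is only a sketch under stated assumptions: it works on a single directory rather than /, a full copy (shutil.copytree) stands in for the snapshot, running the command with its working directory inside the copy stands in for chroot, and a whole-directory swap stands in for merge(), so the copy always wins. The two renames at the end are not atomic as a pair, and the function signature is made up for illustration.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def atomic(argv, workdir):
    """Run argv against a private copy of workdir; publish the copy only on success.

    Assumptions (not from the original pseudocode): copytree replaces the
    snapshot, cwd= replaces chroot, and a directory swap replaces merge().
    """
    workdir = Path(workdir).resolve()
    # Stage next to workdir so the final renames stay on one filesystem.
    staging = Path(tempfile.mkdtemp(prefix=".atomic-", dir=workdir.parent))
    copy = staging / "copy"
    shutil.copytree(workdir, copy, symlinks=True)
    try:
        status = subprocess.run(argv, cwd=copy).returncode
        if status == 0:
            old = staging / "old"
            workdir.rename(old)    # move the live tree aside
            copy.rename(workdir)   # publish the private copy
        return status
    finally:
        # Removes the failed copy, or the old tree after a successful swap.
        shutil.rmtree(staging, ignore_errors=True)
```

Anything written to the live directory by other processes while the command runs is silently lost by the swap, which is exactly the gap that a real merge() would have to fill.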
merge() is the thing that doesn't exist, and the difficulty is what
should happen if it finds a conflict. Of course lots of different
behaviours can be justified in different situations. Sometimes, the
snapshot should be abandoned; the user will retry the command.
Sometimes, it might be best to save it where the user can access it,
e.g. if I wget some huge file, but accidentally do something under its
feet, it might be good to get an error message and find the file in
/tmp. In other cases, the version from the snapshot should replace the
conflicted version. But how can we specify the required behaviour in
each case?
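One way to make the question concrete is to pass the conflict policy to merge() as a parameter. The sketch below is hypothetical throughout: it models each tree as a dict from path to contents, treats a path as conflicted when both the snapshot and the live tree changed it differently relative to the state at snapshot time, and offers the three behaviours described above. The names OnConflict, MergeConflict and the (merged, aside) return value are invented for illustration.

```python
from enum import Enum


class OnConflict(Enum):
    ABANDON = "abandon"         # drop the snapshot; the user retries
    SAVE_ASIDE = "save_aside"   # keep the live file, park the snapshot's copy
    SNAPSHOT_WINS = "snapshot"  # snapshot's version replaces the live one


class MergeConflict(Exception):
    pass


def merge(base, live, snapshot, policy):
    """Three-way merge of {path: content} maps; a missing key means no such file.

    base is the tree when the snapshot was taken, live is / now, and snapshot
    is the process's private tree. Returns (merged_tree, saved_aside).
    """
    merged, aside = dict(live), {}
    for path in set(base) | set(snapshot):
        ours, theirs, orig = snapshot.get(path), live.get(path), base.get(path)
        if ours == orig:
            continue  # the process did not touch this path
        if theirs != orig and theirs != ours:
            # Both sides changed the path, in different ways.
            if policy is OnConflict.ABANDON:
                raise MergeConflict(path)
            if policy is OnConflict.SAVE_ASIDE:
                if ours is not None:
                    aside[path] = ours  # e.g. to be written under /tmp
                continue
        if ours is None:
            merged.pop(path, None)  # the process deleted the file
        else:
            merged[path] = ours
    return merged, aside
```

The wget case would use SAVE_ASIDE: the live file is left alone and the download ends up in the aside map. This still leaves open who chooses the policy (the user, the shell, or the executable itself).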
Regards,
Phil.