[Btrfs-devel] Btrfs v0.6 out (btree defragging)

Chris Mason chris.mason at oracle.com
Tue Aug 7 13:56:16 PDT 2007


Hello everyone,

I've pushed out v0.6 of the kernel module and utilities.  This release
has two major changes.  First, snapshots (and old transactions) are now
deleted in chunks instead of as one big unit.  This reduces lock
contention and, most importantly, makes it possible to do the deletion
using a bounded amount of free space.

It's basically the first part of ENOSPC handling, since you can't
reliably return disk full if snapshot deletion itself takes an unbounded
amount of disk space.  I've done a bunch of crash testing here, but I'd
appreciate any bug reports about corruptions after a crash/machine reset.
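
To illustrate why chunking bounds the work per transaction, here is a
toy sketch in Python.  It is not the Btrfs code: delete_snapshot, its
parameters, and the idea of counting "blocks pending in one commit" are
all assumptions made for the illustration.

```python
def delete_snapshot(blocks, chunk_size):
    """Toy model of chunked deletion (hypothetical, not the Btrfs code).

    Frees `blocks` in batches of at most `chunk_size`, with one commit
    per batch.  Returns (commits, peak), where `peak` is the largest
    number of blocks any single commit had to handle.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")

    commits = 0
    peak = 0
    for start in range(0, len(blocks), chunk_size):
        batch = blocks[start:start + chunk_size]
        # In a real filesystem the batch would be freed and the
        # transaction committed here; the model only counts.
        peak = max(peak, len(batch))
        commits += 1
    return commits, peak
```

With ten blocks, a chunk size of four gives three commits that each
touch at most four blocks, while a chunk size of ten (deletion as one
unit) gives a single commit that has to handle all ten at once.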

The second change came out of benchmarking, where it became clear that
lots of seeks were coming from directory items.  debug-tree output would
show a big set of contiguous leaves on disk, but the directory items
were always in random locations.  This is basically the cost of
indexing: things get inserted in a strange order.

The ideal fix for this is to avoid allocating btree blocks until just
before the commit, allowing you to allocate them in big chunks that go
in tree order (delayed allocation for tree blocks).  v0.6 takes a step
in that direction: it walks the tree and reallocates btree blocks in
big chunks that go in tree order.
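
As a rough sketch of the idea (not the actual implementation), the
Python below walks a small in-memory tree and gives each block a new,
consecutive block number in the order the walk visits it.  The Node
class, the depth-first pre-order walk, and the first_free parameter are
assumptions for illustration; the real allocator and the exact meaning
of "tree order" in the kernel module may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a btree block and its on-disk location."""
    blocknr: int
    children: list = field(default_factory=list)


def tree_order(node):
    """Yield nodes depth-first, parent before children, left to right."""
    yield node
    for child in node.children:
        yield from tree_order(child)


def defrag(root, first_free):
    """Reassign block numbers so the walk order is contiguous on disk.

    `first_free` is assumed to be the start of a free region large
    enough for the whole tree.  Returns the next unused block number.
    """
    next_block = first_free
    for node in tree_order(root):
        node.blocknr = next_block
        next_block += 1
    return next_block


# Leaves scattered across the disk before the defrag.
root = Node(900, [Node(17), Node(512), Node(40)])
end = defrag(root, 1000)
layout = [n.blocknr for n in tree_order(root)]
```

After the call, layout is [1000, 1001, 1002, 1003] and end is 1004, so
reading the tree in order no longer needs a seek between blocks.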

The defrag can be triggered in two ways.  The first is via the same
periodic daemon that forces an occasional commit.  It happens
automatically and only defrags blocks that are in RAM.  Most of the
time this is very effective.

The second way is to use btrfsctl -d somedir (where somedir corresponds
to a subvolume or snapshot).  This forces a defrag of all the tree
blocks in that subvolume, even if they are not currently in RAM.

This defrag doesn't cover file data yet, but that's the long-term goal.

-chris


