[Ocfs2-devel] [PATCH 0/4] Ocfs2 allocation reservations

Mark Fasheh mfasheh at suse.com
Tue Mar 9 18:29:04 PST 2010


The following patches comprise my latest work on enabling larger contiguous
allocations in Ocfs2 in the presence of multiple threads. The patches have
been through much more testing since my last round. At this point, I'd say
they're ready for wider consumption and, of course, more review :)

As in the last series, reservations operate only on the local alloc bitmap.
The reservation code knows nothing of inodes or allocators, however, so we
can extend it to the global bitmap (should the need arise) at a future date.

Changes from the last series are numerous. The biggest one, however, is that
reservations (when enabled) are no longer 'advisory'. Each one now represents
an actual region of free bits in the local alloc file, and the local alloc
code obeys reservations unconditionally.

I made this change because I saw a breakdown in allocation (back to
worst-case) on longer running tests, or those with many threads. Those
tests, it turned out, were exposing "corner cases" in the code where
reservations could no longer be honored because the bits had already been
set in the local alloc bitmap. A better window replacement (and tracking)
policy became quite convoluted when the state of the local alloc bitmap
wasn't fully known. It is far simpler to just consult the bitmap for
windows, and my testing results showed that it worked better too.

This differs from file systems like ext4 (which I used for inspiration), but
our allocation strategy differs greatly. Whereas ext4 may have many
different block groups in play during a multi-threaded write, we only have
the single (and relatively smaller) local alloc window. Reservations can
afford to be advisory for ext4. In Ocfs2, however, we need them to be
honored.
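
To illustrate why honored reservations matter when several writers share one
small window, here is a toy model in Python. It is only a sketch and not the
Ocfs2 code: the names simulate() and count_extents() are made up for this
example, round-robin turns stand in for thread interleaving, and the bitmap
is assumed to be unbounded and initially empty.

```python
def count_extents(clusters):
    """Count runs of physically contiguous cluster numbers."""
    extents = 0
    prev = None
    for c in clusters:
        if prev is None or c != prev + 1:
            extents += 1
        prev = c
    return extents


def simulate(num_writers, clusters_each, window_len):
    """Hypothetical model: writers take turns allocating one cluster each.

    window_len is the reservation size in clusters; 0 disables
    reservations, so every allocation takes the next free bit.
    Returns the extent count for each writer's file.
    """
    next_unreserved = 0  # first bit that is neither used nor reserved
    windows = {w: (0, 0) for w in range(num_writers)}  # (next bit, end)
    owned = {w: [] for w in range(num_writers)}
    step = max(window_len, 1)

    for _ in range(clusters_each):
        for w in range(num_writers):
            cur, end = windows[w]
            if cur == end:
                # Window exhausted: reserve a fresh run of free bits.
                # Nobody else may allocate from it (reservations are honored).
                cur, end = next_unreserved, next_unreserved + step
                next_unreserved = end
            owned[w].append(cur)
            windows[w] = (cur + 1, end)

    return [count_extents(owned[w]) for w in range(num_writers)]


if __name__ == "__main__":
    print("no reservations:   ", simulate(3, 10000, 0))
    print("128-cluster windows:", simulate(3, 10000, 128))
```

In this idealized model, three interleaved writers end up with 10000 extents
each without reservations and 79 each with 128-cluster windows. Real runs are
less extreme in both directions, since the real allocator works within a
bounded local alloc window and with timing that the model ignores.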


As for results, I provide one of my recent test runs on a 4k/4k file
system:

dd if=/dev/urandom of=/ocfs2/1 bs=4096 count=10000 &
dd if=/dev/urandom of=/ocfs2/2 bs=4096 count=10000 &
dd if=/dev/urandom of=/ocfs2/3 bs=4096 count=10000 &


resv_level=0:
Inode: 16920    % fragmented: 93.48     clusters: 10000 extents: 9348 score: 23931
Inode: 16921    % fragmented: 84.75     clusters: 10000 extents: 8475 score: 21696
Inode: 16922    % fragmented: 95.50     clusters: 10000 extents: 9550 score: 24448

resv_level=5 (defaults changed a bit, this means '128 blocks per reservation'):
Inode: 16916    % fragmented: 1.66      clusters: 10000 extents: 166 score: 425
Inode: 16917    % fragmented: 1.71      clusters: 10000 extents: 171 score: 438
Inode: 16918    % fragmented: 1.58      clusters: 10000 extents: 158 score: 404
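
For reading the numbers: the "% fragmented" column appears to be simply
extents divided by clusters, which matches all six rows. The short Python
sketch below redoes that arithmetic and also derives the mean extent length.
The function names are made up for this example, and the "score" column is
left alone because its formula is not given here.

```python
# Figures copied from the test run above:
# (resv_level, inode, clusters, extents, reported % fragmented)
RUNS = [
    (0, 16920, 10000, 9348, 93.48),
    (0, 16921, 10000, 8475, 84.75),
    (0, 16922, 10000, 9550, 95.50),
    (5, 16916, 10000, 166, 1.66),
    (5, 16917, 10000, 171, 1.71),
    (5, 16918, 10000, 158, 1.58),
]


def percent_fragmented(clusters, extents):
    """Assumed relation: extents as a percentage of clusters."""
    return 100.0 * extents / clusters


def mean_extent_len(clusters, extents):
    """Average number of clusters per extent."""
    return clusters / extents


if __name__ == "__main__":
    for level, inode, clusters, extents, reported in RUNS:
        print(
            f"resv_level={level} inode={inode} "
            f"computed={percent_fragmented(clusters, extents):.2f}% "
            f"reported={reported:.2f}% "
            f"mean extent={mean_extent_len(clusters, extents):.1f} clusters"
        )
```

By this measure the average extent grows from roughly one cluster at
resv_level=0 to around 60 clusters at resv_level=5.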

Thanks in advance,
	--Mark


