[Ocfs2-devel] [PATCH 0/8] Ocfs2: Online defragmentation V1.
Wengang Wang
wen.gang.wang at oracle.com
Tue Dec 28 04:05:20 PST 2010
Hi Guy,
I like it. Having it is definitely better than not having it.
I will play with it when I am free :)
thanks,
wengang.
On 10-12-28 19:40, Tristan Ye wrote:
> Hi All,
>
> This is a fairly rough v1 patch series for online defragmentation on OCFS2. It
> works, though it may look ugly ;) The essence of online file defragmentation is
> moving extents, as btrfs and ext4 do. Adding an 'OCFS2_IOC_MOVE_EXT' ioctl
> to ocfs2 allows two defragmentation strategies:
>
> 1. Simple defragmentation in the kernel: the kernel is responsible for
> claiming new clusters and packing the defragmented extents according to a
> user-specified threshold.
>
> 2. Simple extent moving: here userspace plays a much more important role
> in defragmentation. It must specify the new physical blk_offset
> to which the extents will be moved; the kernel does nothing more than
> move the extents as requested, although it may also need to
> probe/validate the new_blkoffset to guarantee enough free space around it.
>
> Both operations use the same OCFS2_IOC_MOVE_EXT ioctl:
> -------------------------------------------------------------------------------
> /* Kernel claims new clusters as the goal location for the moved extents */
> #define OCFS2_MOVE_EXT_FL_AUTO_DEFRAG	(0x00000001)
> /* Move or defragmentation has completed */
> #define OCFS2_MOVE_EXT_FL_COMPLETE	(0x00000002)
>
> struct ocfs2_move_extents {
> 	/* All values are in bytes */
> 	/* in */
> 	__u64 me_start;		/* Virtual start in the file to move */
> 	__u64 me_len;		/* Length of the extents to be moved */
> 	__u64 me_goal;		/* Physical offset of the goal */
> 	__u64 me_thresh;	/* Maximum distance from goal or threshold
> 				 * for auto defragmentation */
> 	__u64 me_flags;		/* Flags for the operation:
> 				 * - auto defragmentation.
> 				 * - refcount, xattr cases.
> 				 */
>
> 	/* out */
> 	__u64 me_moved_len;	/* Moved length, are we completely done? */
> 	__u64 me_new_offset;	/* Resulting physical location */
> 	__u32 me_reserved[3];	/* Reserved for future use */
> };
> -------------------------------------------------------------------------------
>
> The current v1 patch set focuses mostly on strategy #1, since strategy #2
> is still under discussion.
>
> Here is some interesting data gathered from simple tests:
>
> 1. Performance improvement gained on I/O reads:
> -------------------------------------------------------------------------------
> * Before defragmentation *
>
> [root@ocfs2-box4 ~]# sync
> [root@ocfs2-box4 ~]# echo 3>/proc/sys/vm/drop_caches
> [root@ocfs2-box4 ~]# time dd if=/storage/testfile-1 of=/dev/null
> 640000+0 records in
> 640000+0 records out
> 327680000 bytes (328 MB) copied, 19.9351 s, 16.4 MB/s
>
> real 0m19.954s
> user 0m0.246s
> sys 0m1.111s
>
> * Do defragmentation *
>
> [root@ocfs2-box4 defrag]# ./defrag -s 0 -l 293601280 -t 3145728 /storage/testfile-1
>
> * After defragmentation *
>
> [root@ocfs2-box4 ~]# sync
> [root@ocfs2-box4 ~]# echo 3>/proc/sys/vm/drop_caches
> [root@ocfs2-box4 ~]# time dd if=/storage/testfile-1 of=/dev/null
> 640000+0 records in
> 640000+0 records out
> 327680000 bytes (328 MB) copied, 6.79885 s, 48.2 MB/s
>
> real 0m6.969s
> user 0m0.209s
> sys 0m1.063s
> -------------------------------------------------------------------------------
>
>
> 2. Extent tree layout via debugfs.ocfs2:
> -------------------------------------------------------------------------------
> * Before defragmentation *
>
> Tree Depth: 1 Count: 243 Next Free Rec: 8
> ## Offset Clusters Block#
> 0 0 1173 86561
> 1 1173 1173 84527
> 2 2346 1151 81468
> 3 3497 1173 76362
> 4 4670 1173 74328
> 5 5843 1172 66150
> 6 7015 1460 70260
> 7 8475 662 87680
> SubAlloc Bit: 1 SubAlloc Slot: 0
> Blknum: 86561 Next Leaf: 84527
> CRC32: abf06a6b ECC: 44bc
> Tree Depth: 0 Count: 252 Next Free Rec: 252
> ## Offset Clusters Block# Flags
> 0 1 16 516104 0x0
> 1 17 1 554632 0x0
> 2 18 7 560144 0x0
> 3 25 1 565960 0x0
> 4 26 1 572632 0x
> ...
> /* around 1700 extent records omitted here */
> ...
> 138 9131 1 258968 0x0
> 139 9132 1 259568 0x0
> 140 9133 1 260168 0x0
> 141 9134 1 260768 0x0
> 142 9135 1 261368 0x0
> 143 9136 1 261968 0x0
>
> * After defragmentation *
>
> Tree Depth: 1 Count: 243 Next Free Rec: 1
> ## Offset Clusters Block#
> 0 0 9137 66081
> SubAlloc Bit: 1 SubAlloc Slot: 0
> Blknum: 66081 Next Leaf: 0
> CRC32: 22897d34 ECC: 0619
> Tree Depth: 0 Count: 252 Next Free Rec: 6
> ## Offset Clusters Block# Flags
> 0 1 1600 4412936 0x0
> 1 1601 1595 20669448 0x0
> 2 3196 1600 9358856 0x0
> 3 4796 1404 14516232 0x0
> 4 6200 1600 21627400 0x0
> 5 7800 1337 7483400 0x0
> -------------------------------------------------------------------------------
>
>
> TO-DO:
>
> 1. Complete strategy #2
> 2. Add refcount/xattr/unwritten_extents support.
> 3. Free space defragmentation.
>
>
> Go to http://oss.oracle.com/osswiki/OCFS2/DesignDocs/OnlineDefrag for more details.
>
>
> Tristan.