[Ocfs2-devel] [PATCH 0/8] Ocfs2: Online defragmentation V1.

Tristan Ye tristan.ye at oracle.com
Tue Dec 28 07:44:40 PST 2010


On 12/28/2010 08:05 PM, Wengang Wang wrote:
> Hi Guy,
>
> I like it. Having it is definitely better than not having it.
> I will play with it when I am free :)

Hi wengang,

     Very cool to hear you're interested in it ;-)

Actually, the initial motivation came from your original discussion on
the wiki page.

Thanks a lot;)

Tristan



>
> thanks,
> wengang.
> On 10-12-28 19:40, Tristan Ye wrote:
>> Hi All,
>>
>> 	This is a rather rough v1 patch series for online defragmentation on OCFS2. It
>> works, though it may still look ugly ;) The essence of online file defragmentation is
>> moving extents, as btrfs and ext4 do. Adding an 'OCFS2_IOC_MOVE_EXT' ioctl
>> to ocfs2 allows two defragmentation strategies:
>>
>> 1. Simple defragmentation in the kernel: the kernel is responsible for
>>     claiming new clusters and packing the defragmented extents according to a
>>     user-specified threshold.
>>
>> 2. Simple extent moving: here userspace plays a much larger role in
>>     defragmentation. It must specify the new physical blk_offset
>>     where the extents will be moved; the kernel does nothing more than
>>     move the extents as requested, though it may also need to
>>     probe/validate the new_blkoffset to guarantee enough free space around it.
>>
>> Both operations use the same OCFS2_IOC_MOVE_EXT ioctl:
>> -------------------------------------------------------------------------------
>> #define OCFS2_MOVE_EXT_FL_AUTO_DEFRAG   (0x00000001)    /* Kernel claims new
>>                                                             clusters as the goal
>>                                                             place for moving
>>                                                             extents */
>> #define OCFS2_MOVE_EXT_FL_COMPLETE      (0x00000002)    /* Move or defragmentation
>>                                                             has completely
>>                                                             finished */
>> struct ocfs2_move_extents {
>> /* All values are in bytes */
>>          /* in */
>>          __u64 me_start;         /* Virtual start in the file to move */
>>          __u64 me_len;           /* Length of the extents to be moved */
>>          __u64 me_goal;          /* Physical offset of the goal */
>>          __u64 me_thresh;        /* Maximum distance from goal or threshold
>>                                     for auto defragmentation */
>>          __u64 me_flags;         /* flags for the operation:
>>                                   * - auto defragmentation.
>>                                   * - refcount,xattr cases.
>>                                   */
>>
>>          /* out */
>>          __u64 me_moved_len;     /* moved length, are we completely done? */
>>          __u64 me_new_offset;    /* Resulting physical location */
>>          __u32 me_reserved[3];   /* reserved for future use */
>> };
>> -------------------------------------------------------------------------------
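For readers who want to see how the two strategies map onto this structure from userspace, here is a minimal sketch. Several things in it are assumptions rather than part of the posted patches: the request number behind EXAMPLE_IOC_MOVE_EXT is a placeholder (the real one comes from the patched ocfs2_ioctl.h), the helper names are hypothetical, <stdint.h> types stand in for __u64/__u32, and it guesses that the kernel reports OCFS2_MOVE_EXT_FL_COMPLETE back through me_flags.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Flag values as posted in the cover letter. */
#define OCFS2_MOVE_EXT_FL_AUTO_DEFRAG 0x00000001
#define OCFS2_MOVE_EXT_FL_COMPLETE    0x00000002

/* Mirror of the posted structure; all values are in bytes. */
struct ocfs2_move_extents {
	/* in */
	uint64_t me_start;       /* virtual start in the file to move */
	uint64_t me_len;         /* length of the extents to be moved */
	uint64_t me_goal;        /* physical offset of the goal */
	uint64_t me_thresh;      /* max distance from goal, or defrag threshold */
	uint64_t me_flags;       /* flags for the operation */
	/* out */
	uint64_t me_moved_len;   /* moved length */
	uint64_t me_new_offset;  /* resulting physical location */
	uint32_t me_reserved[3];
};

/* ASSUMPTION: placeholder request number, not taken from the patches. */
#define EXAMPLE_IOC_MOVE_EXT _IOW('o', 6, struct ocfs2_move_extents)

/* Strategy #1: the kernel picks the new clusters itself. */
void fill_auto_defrag(struct ocfs2_move_extents *me, uint64_t start,
		      uint64_t len, uint64_t thresh)
{
	assert(me != NULL);
	memset(me, 0, sizeof(*me));
	me->me_start = start;
	me->me_len = len;
	me->me_thresh = thresh;
	me->me_flags = OCFS2_MOVE_EXT_FL_AUTO_DEFRAG;
}

/* Strategy #2: userspace names the physical goal for the move. */
void fill_move_to_goal(struct ocfs2_move_extents *me, uint64_t start,
		       uint64_t len, uint64_t goal, uint64_t max_distance)
{
	assert(me != NULL);
	memset(me, 0, sizeof(*me));
	me->me_start = start;
	me->me_len = len;
	me->me_goal = goal;
	me->me_thresh = max_distance;
}

/*
 * Issue the request on an open file descriptor.
 * Returns -1 if ioctl() fails, 0 if the COMPLETE flag came back set
 * (assumed reporting convention), and 1 otherwise.
 */
int submit_move(int fd, struct ocfs2_move_extents *me)
{
	if (ioctl(fd, EXAMPLE_IOC_MOVE_EXT, me) < 0)
		return -1;
	return (me->me_flags & OCFS2_MOVE_EXT_FL_COMPLETE) ? 0 : 1;
}
```

A caller would open the file on an OCFS2 mount, fill the request with one of the two helpers, pass it to submit_move(), and then read me_moved_len and me_new_offset to see how far the kernel got.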
>>
>> 	The current V1 patch set focuses mostly on strategy #1, since strategy #2
>> is still under discussion.
>>
>> 	The following are some interesting data gathered from simple tests:
>>
>> 1. Performance improvement gained on I/O reads:
>> -------------------------------------------------------------------------------
>> * Before defragmentation *
>>
>> [root@ocfs2-box4 ~]# sync
>> [root@ocfs2-box4 ~]# echo 3 > /proc/sys/vm/drop_caches
>> [root@ocfs2-box4 ~]# time dd if=/storage/testfile-1 of=/dev/null
>> 640000+0 records in
>> 640000+0 records out
>> 327680000 bytes (328 MB) copied, 19.9351 s, 16.4 MB/s
>>
>> real	0m19.954s
>> user	0m0.246s
>> sys	0m1.111s
>>
>> * Do defragmentation *
>>
>> [root@ocfs2-box4 defrag]# ./defrag -s 0 -l 293601280 -t 3145728 /storage/testfile-1
>>
>> * After defragmentation *
>>
>> [root@ocfs2-box4 ~]# sync
>> [root@ocfs2-box4 ~]# echo 3 > /proc/sys/vm/drop_caches
>> [root@ocfs2-box4 ~]# time dd if=/storage/testfile-1 of=/dev/null
>> 640000+0 records in
>> 640000+0 records out
>> 327680000 bytes (328 MB) copied, 6.79885 s, 48.2 MB/s
>>
>> real	0m6.969s
>> user	0m0.209s
>> sys	0m1.063s
>> -------------------------------------------------------------------------------
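To put the two dd runs side by side, the arithmetic can be sketched as below. The function names are hypothetical; the only inputs are the record count, the elapsed times from the output above, and dd's default 512-byte block size.

```c
#include <assert.h>
#include <stdio.h>

/* Throughput in decimal MB/s, the unit dd prints. */
double throughput_mb_s(double bytes, double seconds)
{
	assert(seconds > 0.0);
	return bytes / seconds / 1e6;
}

/* Ratio of elapsed times: values above 1.0 mean the second run was faster. */
double speedup(double before_s, double after_s)
{
	assert(before_s > 0.0 && after_s > 0.0);
	return before_s / after_s;
}

/* Print the comparison for the two dd runs quoted above. */
void print_read_comparison(void)
{
	const double bytes = 640000.0 * 512.0; /* records x 512-byte blocks */
	const double before_s = 19.9351;       /* elapsed time before defrag */
	const double after_s = 6.79885;        /* elapsed time after defrag */

	printf("before:  %.1f MB/s\n", throughput_mb_s(bytes, before_s));
	printf("after:   %.1f MB/s\n", throughput_mb_s(bytes, after_s));
	printf("speedup: %.2fx\n", speedup(before_s, after_s));
}
```

With these inputs the sequential read comes out roughly 2.9 times faster after defragmentation, in line with the 16.4 MB/s and 48.2 MB/s figures that dd reports.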
>>
>>
>> 2. Extent tree layout via debugfs.ocfs2:
>> -------------------------------------------------------------------------------
>> * Before defragmentation *
>>
>>          Tree Depth: 1   Count: 243   Next Free Rec: 8
>>          ## Offset        Clusters       Block#
>>          0  0             1173           86561
>>          1  1173          1173           84527
>>          2  2346          1151           81468
>>          3  3497          1173           76362
>>          4  4670          1173           74328
>>          5  5843          1172           66150
>>          6  7015          1460           70260
>>          7  8475          662            87680
>>          SubAlloc Bit: 1   SubAlloc Slot: 0
>>          Blknum: 86561   Next Leaf: 84527
>>          CRC32: abf06a6b   ECC: 44bc
>>          Tree Depth: 0   Count: 252   Next Free Rec: 252
>>          ## Offset        Clusters       Block#          Flags
>>          0  1             16             516104          0x0
>>          1  17            1              554632          0x0
>>          2  18            7              560144          0x0
>>          3  25            1              565960          0x0
>>          4  26            1              572632          0x
>>          ...
>>          /* around 1700 extent records omitted here */
>>          ...
>>          138 9131          1              258968          0x0
>>          139 9132          1              259568          0x0
>>          140 9133          1              260168          0x0
>>          141 9134          1              260768          0x0
>>          142 9135          1              261368          0x0
>>          143 9136          1              261968          0x0
>>
>> * After defragmentation *
>>
>>          Tree Depth: 1   Count: 243   Next Free Rec: 1
>>          ## Offset        Clusters       Block#
>>          0  0             9137           66081
>>          SubAlloc Bit: 1   SubAlloc Slot: 0
>>          Blknum: 66081   Next Leaf: 0
>>          CRC32: 22897d34   ECC: 0619
>>          Tree Depth: 0   Count: 252   Next Free Rec: 6
>>          ## Offset        Clusters       Block#          Flags
>>          0  1             1600           4412936         0x0
>>          1  1601          1595           20669448        0x0
>>          2  3196          1600           9358856         0x0
>>          3  4796          1404           14516232        0x0
>>          4  6200          1600           21627400        0x0
>>          5  7800          1337           7483400         0x0
>> -------------------------------------------------------------------------------
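As a quick sanity check on the post-defragmentation leaf, the sketch below re-adds the six records and verifies that each one starts where the previous one ends. The struct and function names are hypothetical (this is not the on-disk OCFS2 extent record format); the numbers are copied from the listing above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory row: one line of the debugfs.ocfs2 leaf listing. */
struct extent_row {
	uint64_t offset;   /* logical offset, in clusters */
	uint64_t clusters; /* extent length, in clusters */
	uint64_t blkno;    /* physical start block */
};

/* Leaf records after defragmentation, copied from the listing above. */
const struct extent_row after_defrag[] = {
	{ 1,    1600, 4412936  },
	{ 1601, 1595, 20669448 },
	{ 3196, 1600, 9358856  },
	{ 4796, 1404, 14516232 },
	{ 6200, 1600, 21627400 },
	{ 7800, 1337, 7483400  },
};
const size_t after_defrag_count =
	sizeof(after_defrag) / sizeof(after_defrag[0]);

/* Sum of the extent lengths, in clusters. */
uint64_t total_clusters(const struct extent_row *rows, size_t n)
{
	uint64_t sum = 0;
	size_t i;

	assert(rows != NULL || n == 0);
	for (i = 0; i < n; i++)
		sum += rows[i].clusters;
	return sum;
}

/* Returns 1 if every row starts exactly where the previous row ends. */
int rows_are_contiguous(const struct extent_row *rows, size_t n)
{
	size_t i;

	assert(rows != NULL || n == 0);
	for (i = 1; i < n; i++)
		if (rows[i].offset != rows[i - 1].offset + rows[i - 1].clusters)
			return 0;
	return 1;
}
```

The six records add up to 9136 clusters and end at logical offset 9137, so the whole range is now described by six large extents with no gaps between them.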
>>
>>
>> TO-DO:
>>
>> 1. Complete strategy #2
>> 2. Add refcount/xattr/unwritten_extents support.
>> 3. Free space defragmentation.
>>
>>
>> Go to http://oss.oracle.com/osswiki/OCFS2/DesignDocs/OnlineDefrag for more details.
>>
>>
>> Tristan.
>>
>>
>>
>> _______________________________________________
>> Ocfs2-devel mailing list
>> Ocfs2-devel at oss.oracle.com
>> http://oss.oracle.com/mailman/listinfo/ocfs2-devel



