[Ocfs2-devel] [PATCH 0/8] Ocfs2: Online defragmentation V1.

Wengang Wang wen.gang.wang at oracle.com
Tue Dec 28 04:05:20 PST 2010


Hi Guy,

I like it. Having it is definitely better than not having it.
I will play with it when I am free :)

thanks,
wengang.
On 10-12-28 19:40, Tristan Ye wrote:
> Hi All,
> 
> 	This is a fairly rough v1 patch series for online defragmentation on OCFS2.
> It works, though it may look ugly ;) The essence of online file defragmentation is
> moving extents, as btrfs and ext4 do. Adding an 'OCFS2_IOC_MOVE_EXT' ioctl
> to ocfs2 allows two defragmentation strategies:
> 
> 1. simple-defragmentation-in-kernel, which means the kernel is responsible for
>    claiming new clusters, and packing the defragmented extents according to a
>    user-specified threshold.
> 
> 2. simple-extents moving, in which userspace plays a much more important role
>    in defragmentation: it specifies the new physical blk_offset to which the
>    extents will be moved, and the kernel does nothing more than move the
>    extents as requested. The kernel may also need to probe/validate the
>    new_blkoffset to guarantee there is enough free space around it.
> 
> Both operations use the same OCFS2_IOC_MOVE_EXT ioctl:
> -------------------------------------------------------------------------------
> #define OCFS2_MOVE_EXT_FL_AUTO_DEFRAG   (0x00000001)    /* Kernel manages to
>                                                            claim new clusters
>                                                            as the goal place
>                                                            for extents moving */
> #define OCFS2_MOVE_EXT_FL_COMPLETE      (0x00000002)    /* Move or defragmentation
>                                                            completely gets done.
>                                                          */
> struct ocfs2_move_extents {
> /* All values are in bytes */
>         /* in */
>         __u64 me_start;         /* Virtual start in the file to move */
>         __u64 me_len;           /* Length of the extents to be moved */
>         __u64 me_goal;          /* Physical offset of the goal */
>         __u64 me_thresh;        /* Maximum distance from goal or threshold
>                                    for auto defragmentation */
>         __u64 me_flags;         /* flags for the operation:
>                                  * - auto defragmentation.
>                                  * - refcount,xattr cases.
>                                  */
> 
>         /* out */
>         __u64 me_moved_len;     /* moved length, are we completely done? */
>         __u64 me_new_offset;    /* Resulting physical location */
>         __u32 me_reserved[3];   /* reserved for future use */
> };
> -------------------------------------------------------------------------------
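
For anyone who wants to try strategy #1 from userspace, here is a minimal sketch of how a request might be prepared. It is only an illustration under stated assumptions: the struct and flag values are copied from the header quoted above, uint64_t/uint32_t stand in for __u64/__u32 so it builds without kernel headers, and init_auto_defrag() and move_completed() are hypothetical helpers that are not part of the patches. It also assumes the kernel reports completion by setting OCFS2_MOVE_EXT_FL_COMPLETE in me_flags on return. The ioctl number for OCFS2_IOC_MOVE_EXT is not given in this mail, so the ioctl() call itself is left out.

```c
#include <stdint.h>
#include <string.h>

/* Flag values copied from the quoted V1 header. */
#define OCFS2_MOVE_EXT_FL_AUTO_DEFRAG   (0x00000001)
#define OCFS2_MOVE_EXT_FL_COMPLETE      (0x00000002)

/*
 * Userspace mirror of the quoted struct. uint64_t/uint32_t replace the
 * kernel's __u64/__u32 (an assumption made so this builds standalone).
 * All values are in bytes.
 */
struct ocfs2_move_extents {
        /* in */
        uint64_t me_start;        /* virtual start in the file to move */
        uint64_t me_len;          /* length of the extents to be moved */
        uint64_t me_goal;         /* physical offset of the goal */
        uint64_t me_thresh;       /* threshold for auto defragmentation */
        uint64_t me_flags;        /* flags for the operation */
        /* out */
        uint64_t me_moved_len;    /* moved length */
        uint64_t me_new_offset;   /* resulting physical location */
        uint32_t me_reserved[3];  /* reserved */
};

/*
 * Hypothetical helper: prepare a strategy #1 (auto defrag) request.
 * me_goal stays 0 because the kernel picks the new clusters itself.
 */
void init_auto_defrag(struct ocfs2_move_extents *me,
                      uint64_t start, uint64_t len, uint64_t thresh)
{
        memset(me, 0, sizeof(*me));
        me->me_start = start;
        me->me_len = len;
        me->me_thresh = thresh;
        me->me_flags = OCFS2_MOVE_EXT_FL_AUTO_DEFRAG;
}

/*
 * Hypothetical helper: returns 1 if the COMPLETE flag is set, assuming
 * the kernel reports completion through me_flags.
 */
int move_completed(const struct ocfs2_move_extents *me)
{
        return (me->me_flags & OCFS2_MOVE_EXT_FL_COMPLETE) != 0;
}
```

A caller would fill the request with something like init_auto_defrag(&me, 0, 293601280, 3145728), pass &me to the ioctl on an open file descriptor, and then look at me_moved_len and move_completed(&me) to decide whether another pass is needed.
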
> 
> 	The current V1 patch set focuses mostly on strategy #1, since strategy #2
> is still under discussion.
> 
> 	Here is some interesting data gathered from simple tests:
> 
> 1. Performance improvement gained on I/O reads:
> -------------------------------------------------------------------------------
> * Before defragmentation *
> 
> [root@ocfs2-box4 ~]# sync
> [root@ocfs2-box4 ~]# echo 3 > /proc/sys/vm/drop_caches
> [root@ocfs2-box4 ~]# time dd if=/storage/testfile-1 of=/dev/null
> 640000+0 records in
> 640000+0 records out
> 327680000 bytes (328 MB) copied, 19.9351 s, 16.4 MB/s
> 
> real	0m19.954s
> user	0m0.246s
> sys	0m1.111s
> 
> * Do defragmentation *
> 
> [root@ocfs2-box4 defrag]# ./defrag -s 0 -l 293601280 -t 3145728 /storage/testfile-1
> 
> * After defragmentation *
> 
> [root@ocfs2-box4 ~]# sync
> [root@ocfs2-box4 ~]# echo 3 > /proc/sys/vm/drop_caches
> [root@ocfs2-box4 ~]# time dd if=/storage/testfile-1 of=/dev/null
> 640000+0 records in
> 640000+0 records out
> 327680000 bytes (328 MB) copied, 6.79885 s, 48.2 MB/s
> 
> real	0m6.969s
> user	0m0.209s
> sys	0m1.063s
> -------------------------------------------------------------------------------
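
To put the two dd runs side by side, here is a small sketch that recomputes the throughput and the speedup from the figures quoted above (327680000 bytes, 19.9351 s before, 6.79885 s after). The function names are only illustrative, and MB here means 10^6 bytes, which is the unit dd reports.

```c
#include <stdio.h>

/* Throughput in decimal MB/s (10^6 bytes per second), as dd reports it. */
double throughput_mb_s(double bytes, double seconds)
{
        return bytes / seconds / 1e6;
}

/* Ratio of elapsed times; a value above 1 means the second run was faster. */
double speedup(double before_s, double after_s)
{
        return before_s / after_s;
}

/* Print a summary using the numbers from the two dd runs quoted above. */
void print_summary(void)
{
        const double bytes = 327680000.0;
        const double before_s = 19.9351;
        const double after_s = 6.79885;

        printf("before:  %.1f MB/s\n", throughput_mb_s(bytes, before_s));
        printf("after:   %.1f MB/s\n", throughput_mb_s(bytes, after_s));
        printf("speedup: %.2fx\n", speedup(before_s, after_s));
}
```

With these inputs the sequential read comes out a little under three times faster after defragmentation, which matches the 16.4 MB/s and 48.2 MB/s that dd printed.
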
> 
> 
> 2. Extent tree layout via debugfs.ocfs2:
> -------------------------------------------------------------------------------
> * Before defragmentation *
> 
>         Tree Depth: 1   Count: 243   Next Free Rec: 8
>         ## Offset        Clusters       Block#
>         0  0             1173           86561
>         1  1173          1173           84527
>         2  2346          1151           81468
>         3  3497          1173           76362
>         4  4670          1173           74328
>         5  5843          1172           66150
>         6  7015          1460           70260
>         7  8475          662            87680
>         SubAlloc Bit: 1   SubAlloc Slot: 0
>         Blknum: 86561   Next Leaf: 84527
>         CRC32: abf06a6b   ECC: 44bc
>         Tree Depth: 0   Count: 252   Next Free Rec: 252
>         ## Offset        Clusters       Block#          Flags
>         0  1             16             516104          0x0
>         1  17            1              554632          0x0
>         2  18            7              560144          0x0
>         3  25            1              565960          0x0
>         4  26            1              572632          0x
>         ...
>         /* around 1700 extent records are omitted here */
>         ...
>         138 9131          1              258968          0x0
>         139 9132          1              259568          0x0
>         140 9133          1              260168          0x0
>         141 9134          1              260768          0x0
>         142 9135          1              261368          0x0
>         143 9136          1              261968          0x0
> 
> * After defragmentation *
> 
>       Tree Depth: 1   Count: 243   Next Free Rec: 1
> 	## Offset        Clusters       Block#
> 	0  0             9137           66081
> 	SubAlloc Bit: 1   SubAlloc Slot: 0
> 	Blknum: 66081   Next Leaf: 0
> 	CRC32: 22897d34   ECC: 0619
> 	Tree Depth: 0   Count: 252   Next Free Rec: 6
> 	## Offset        Clusters       Block#          Flags
> 	0  1             1600           4412936         0x0 
> 	1  1601          1595           20669448        0x0 
> 	2  3196          1600           9358856         0x0 
> 	3  4796          1404           14516232        0x0 
> 	4  6200          1600           21627400        0x0 
> 	5  7800          1337           7483400         0x0 
> -------------------------------------------------------------------------------
> 
> 
> TO-DO:
> 
> 1. Complete strategy #2
> 2. Add refcount/xattr/unwritten_extents support.
> 3. Free space defragmentation.
> 
> 
> Go to http://oss.oracle.com/osswiki/OCFS2/DesignDocs/OnlineDefrag for more details.
> 
> 
> Tristan.
> 
> 
> 
> _______________________________________________
> Ocfs2-devel mailing list
> Ocfs2-devel at oss.oracle.com
> http://oss.oracle.com/mailman/listinfo/ocfs2-devel


