OCFS2 Reflink Test
Introduction
This document lays out the test plan for the refcount support currently being implemented on OCFS2, and records the testing results so that status can be tracked during testing and bugs can be located easily. In addition, it explains how the testcases are organized and describes the workflow of the corresponding testing tools.
Test Limitation
The limits defined by the design roadmap for the refcount implementation on OCFS2 are stated here, to make clear what the testing tools are expected to cover.
MAX_DIRENTS_NUMS defined by ulimit
OCFS2_MAX_FILENAME_LEN 255
OCFS2_MAX_SLOTS 255
VFS_PATH_MAX 65536
OCFS2_MAX_BLOCKSIZE 4k
OCFS2_MAX_CLUSTERSIZE 1M
OCFS2_MAX_INLINE_SIZE BLOCKSIZE - 200
OCFS2_MIN_REFLINK_FILE_SIZE OCFS2_MAX_INLINE_SIZE + 1
OCFS2_REFCOUNT_COW_HUN 1M
OCFS2_MAX_REFCOUNT_NUM 2^32
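The derived entries in the list above are simple arithmetic, and they suggest which file sizes a boundary test should try. The following is only a sketch: the function names `derived_limits` and `boundary_file_sizes` are hypothetical and not part of the testing tool, and the `BLOCKSIZE - 200` rule is taken as-is from the list above rather than from the on-disk format.

```python
KB = 1 << 10
MB = 1 << 20


def derived_limits(blocksize=4 * KB, clustersize=1 * MB, cow_hunk=1 * MB):
    """Compute the derived limits from the list above (assumed formulas)."""
    max_inline = blocksize - 200           # OCFS2_MAX_INLINE_SIZE as listed
    return {
        "max_inline_size": max_inline,
        "min_reflink_file_size": max_inline + 1,
        "clustersize": clustersize,
        "cow_hunk": cow_hunk,
    }


def boundary_file_sizes(limits):
    """Return file sizes just below, at, and just above each boundary."""
    sizes = {0}                            # empty files are a boundary case too
    for edge in (limits["max_inline_size"],
                 limits["clustersize"],
                 limits["cow_hunk"]):
        sizes.update((edge - 1, edge, edge + 1))
    return sorted(sizes)


if __name__ == "__main__":
    limits = derived_limits()
    print(limits)
    print(boundary_file_sizes(limits))
```

With the maximum block size of 4k, the smallest file that can be reflinked under these assumptions is one byte larger than the inline limit.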
Testcases
We concentrate on feature, stress and boundary tests for the kernel fs on both single and multiple nodes, and then formalize these into a testing tool package. Combination tests with EA (extended attributes) also deserve our attention.
1. Single-node Tests
All testcases are scheduled by reflink_tests_run.sh. Most of these testing scenarios are provided by reflink_tests.c, while some other cases are built in shell.
1. Basic functional test. It includes:
   1) Add a refcount tree to inodes
   2) Set an inode's refcount tree
   3) Remove an inode's refcount tree
   4) Increment and decrement the refcount
   5) Cause CoW
   The number of reflinks and the file size are tunable here to specify the workload. write(), ftruncate(), unlink(), append etc. are used to cause CoW in reflinked files. We also want to make sure that mmap() works fine with reflinks.
   How to verify: the original and the reflinked files are compared to assure the sanity of the reflink operation. In addition, the original (unchanged) file is checked after CoW has happened to the reflinks, to see that it still has its original content. We also borrow the verifying method from fill_verify_holes.c to verify random writes against reflinked inodes.
2. Random tests. Here we randomize almost all of the factors during the tests to find bugs, such as random original extent sizes and numbers, and random write sizes and offsets to cause CoW. Random reads among the reflinks are also performed.
3. Concurrent tests. A fixed number of processes concurrently manipulate the reflinked inodes:
   1) 1/4 of the child processes do reflinks to increment the refcount.
   2) 1/4 of the child processes do CoW to decrement the refcount.
   3) 1/4 of the child processes do truncate and append.
   4) 1/4 of the child processes verify the original file to see that its content is intact.
   5) The parent process does unlinks and verifies the original file too.
4. Boundary test:
   1) Files with size 0 (for xattrs, dx, data) should not have a refcount tree. When they are truncated, the code will remove the refcount tree.
   2) Files with inline data and no external xattr tree should also have no refcount tree.
   3) Extents less than 1MB should be CoWd in their entirety.
   4) Extents larger than 1MB should be CoWd in 1MB hunks. So if you have a 1GB extent and write a byte, the surrounding 1MB should be CoWd. You'll want to test this in the first 1MB of the 1GB extent, the last 1MB of the extent, and somewhere in the middle of the extent.
5. Combination tests with xattr.
6. Combination tests with fill_verify_holes tests incorporated.
7. Stress tests:
   1) Enormous refcount trees in the fs, meaning we have lots of reflink pairs.
   2) Lots of inodes sharing one refcount tree.
   3) Reflinks on a HUGE file (like in OracleVM), with writes/reads performed to cause CoW from the original and the reflinks.
   4) Reflinks on inodes with HUGE (size) extents.
   5) Reflinks on inodes with MANY (number) extents, in 1MB hunks or other sizes.
8. Bash & tools utility tests: use dd and the reflink command to test reflinks with a combination of extent sizes.
9. Combination test with OracleVM: here I personally think we just need to simulate some kind of HUGE file (like an image file on OracleVM) and then schedule tests on it. Is there anything specific that we need to be concerned about?
10. Destructive test: fill up the volume with an enormous number of shared reflink inodes and refcount trees.
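The two CoW boundary rules (small extents copied whole, large extents copied in 1MB hunks) can be written down as a small calculation that a test could use to predict which byte range should become private after a write. This is a sketch only: `cow_range` is a hypothetical helper, and it assumes hunks are aligned to 1MB boundaries counted from the start of the extent, which the rules above do not state explicitly.

```python
MB = 1 << 20
GB = 1 << 30
COW_HUNK = 1 * MB  # hunk size taken from the limits list


def cow_range(extent_start, extent_len, write_off, write_len, hunk=COW_HUNK):
    """Return the (start, end) byte range expected to be CoWd for one write.

    Assumption: hunks are aligned relative to the start of the extent.
    """
    extent_end = extent_start + extent_len
    write_end = write_off + write_len
    if write_len <= 0 or write_off < extent_start or write_end > extent_end:
        raise ValueError("write must be non-empty and lie inside the extent")

    # Small extents are CoWd in their entirety.
    if extent_len <= hunk:
        return extent_start, extent_end

    # Large extents: round the write outwards to hunk boundaries.
    rel_start = (write_off - extent_start) // hunk * hunk
    rel_end = -(-(write_end - extent_start) // hunk) * hunk  # ceiling division
    return extent_start + rel_start, min(extent_start + rel_end, extent_end)


if __name__ == "__main__":
    # One-byte writes at the first, middle and last 1MB of a 1GB extent.
    for off in (0, GB // 2, GB - 1):
        print(off, cow_range(0, GB, off, 1))
```

A test would then compare the predicted range against the extents that are actually no longer shared after the write; how those are read back is left out here.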
2. Multi-nodes Tests
NOTE: To launch the multi-nodes tests, openmpi must be installed first, and passwordless ssh/rsh access must be configured in advance.
1. Test lock contention for a shared refcount tree among nodes:
For example, one node decrements the refcount while another node increments it. Each node keeps a series of shared inodes, and the nodes then do CoW/read concurrently.
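The invariant behind this test can be modelled in a few lines: however the increments and decrements interleave, the final refcount must equal the initial value plus the increments minus the decrements. The sketch below is a model only. Threads stand in for nodes, a `threading.Lock` stands in for the cluster lock, and `RefcountModel` and `run_contention` are hypothetical names that do not exist in the testing tool.

```python
import threading


class RefcountModel:
    """Toy model of one shared refcount protected by a lock."""

    def __init__(self, initial):
        self.count = initial
        self._lock = threading.Lock()  # stands in for the cluster lock

    def increment(self):
        with self._lock:
            self.count += 1

    def decrement(self):
        with self._lock:
            self.count -= 1


def run_contention(initial, n_inc, n_dec):
    """Run one incrementing and one decrementing 'node' concurrently."""
    if n_dec > initial:
        raise ValueError("model requires initial >= n_dec to avoid underflow")
    model = RefcountModel(initial)

    def worker(op, times):
        for _ in range(times):
            op()

    threads = [
        threading.Thread(target=worker, args=(model.increment, n_inc)),
        threading.Thread(target=worker, args=(model.decrement, n_dec)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return model.count


if __name__ == "__main__":
    print(run_contention(initial=1000, n_inc=500, n_dec=800))
```

The real multi-node test has to check the same arithmetic against on-disk refcounts across nodes, which this model does not attempt.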
2. Destructive & Recovery Test:
The rule should be that a reflinked inode does not appear in the filesystem until it's completely constructed. So we need a couple node-kill tests. What I'd really like is two tests, but I don't think we can do them only from userspace:
1). Kill a node while reflink is busy refcounting the source inode. When the other nodes recover, the file should still be valid and in some state of refcounting.
2). Kill a node while reflink is building the target inode in the orphan dir. The other nodes should remove it from the orphan dir, as if it never existed.
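The expected post-recovery state for case 2) can be illustrated with a userspace analogy: the target is built somewhere invisible, published in one step, and any leftover from an interrupted build is removed on recovery. This is an analogy under stated assumptions, not a way to run the kernel test. A hidden `.staging` directory stands in for the orphan dir, `os.rename` stands in for the final link step, and all function names are hypothetical.

```python
import os
import tempfile

STAGING = ".staging"  # stands in for the orphan dir (assumption)


def staged_create(root, name, data, crash_before_publish=False):
    """Build a file out of sight, then publish it with a single rename."""
    staging = os.path.join(root, STAGING)
    os.makedirs(staging, exist_ok=True)
    tmp = os.path.join(staging, name)
    with open(tmp, "wb") as f:
        f.write(data)
    if crash_before_publish:
        return None                      # simulate a node killed mid-build
    final = os.path.join(root, name)
    os.rename(tmp, final)                # the file becomes visible here
    return final


def recover(root):
    """Remove half-built files, as if they never existed."""
    staging = os.path.join(root, STAGING)
    removed = []
    if os.path.isdir(staging):
        for entry in sorted(os.listdir(staging)):
            os.remove(os.path.join(staging, entry))
            removed.append(entry)
    return removed


def visible_files(root):
    """List what an ordinary user of the directory would see."""
    return sorted(e for e in os.listdir(root) if e != STAGING)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        staged_create(root, "target", b"data", crash_before_publish=True)
        print(visible_files(root), recover(root))
```

The check a node-kill test would make is the same as in the sketch: after recovery, either the complete target is visible or nothing is.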
3. Storage assisted snapshot test
This is currently on the TO-DO list.
Testing Status
Kernel/Patches | Arch/Nodes | Ocfs2 Options | Testing Report | Ocfs2-Tools | Testing Tool | Date | Coverage
Joel's git tree (git://oss.oracle.com/git/jlbec/linux-2.6.git cacheme) | x86_64/2 Nodes | /dev/sdb1 | Basic refcount tests: | Latest Src tree from refcount branch | | 3/12/2009 |
Joel's git tree (git://oss.oracle.com/git/jlbec/linux-2.6.git cacheme) | x86_64/2 Nodes | /dev/sdb1 | Basic refcount tests: | Latest Src tree from refcount branch | | 3/13/2009 |
Joel's git tree (git://oss.oracle.com/git/jlbec/linux-2.6.git cacheme) | x86_64/2 Nodes | /dev/sdb1 | Multi-nodes refcount tests: | Latest Src tree from refcount branch | | 3/16/2009 |
Joel's git tree (git://oss.oracle.com/git/jlbec/linux-2.6.git cacheme) | x86_64/2 Nodes | /dev/sdb1 | Basic xattr refcount tests: | Latest Src tree from refcount branch | | 3/17/2009 |
Joel's git tree (git://oss.oracle.com/git/jlbec/linux-2.6.git cacheme) | x86_64/2 Nodes | /dev/sdb1 | Verify Bugs: 1084,1085,1086,1087 | Latest Src tree from refcount branch | | 3/18/2009 |
Joel's git tree (git://oss.oracle.com/git/jlbec/linux-2.6.git cacheme) | x86_64/2 Nodes | /dev/sdb1 | Multi-nodes Testcases Development: A V1 single-node testing tool completed | Latest Src tree from refcount branch | | 3/19/2009 |
Joel's git tree (git://oss.oracle.com/git/jlbec/linux-2.6.git cacheme) | x86_64/2 Nodes | /dev/sdf1 | Multi-nodes Testcases Development: A V2 testing tool completed | Latest Src tree from refcount branch | | 3/31/2009 |
Testing Tools
Name | Script / Program | Description | Technology | #nodes | Useful? | What is tested
inline_data_test | reflink_test.c | The script and C program together coordinate the reflink test on a single node, including feature, boundary and stress tests. | Shell, C | 1 | Yes | Checks basic refcount functionality on ocfs2 on a single node, stresses the test by tuning the workload, and also emulates a race by performing concurrent operations on refcount between multiple nodes.
 | multi_reflink_test.c | Propagates processes on multiple nodes that race on r/w operations against one refcount tree, to test whether refcount support on ocfs2 behaves well under concurrent events. | C, Shell, Openmpi | 2 | Yes |