OCFS2/ReflinkTest

OCFS2 Reflink Test

Introduction

This document lays out the test plan for the refcount (reflink) feature currently being implemented on OCFS2, and records the test results so that status can be tracked during testing and bugs can be located easily. In addition, it explains how the testcases are organized and describes the workflow of the corresponding testing tools.

Test Limitation

The limits set by the design roadmap for the refcount implementation on OCFS2 are stated here, to make clear what the testing tools are expected to cover at most.

MAX_DIRENTS_NUMS               defined by ulimit
OCFS2_MAX_FILENAME_LEN         255
OCFS2_MAX_SLOTS                255
VFS_PATH_MAX                   65536
OCFS2_MAX_BLOCKSIZE            4k
OCFS2_MAX_CLUSTERSIZE          1M
OCFS2_MAX_INLINE_SIZE          BLOCKSIZE - 200
OCFS2_MIN_REFLINK_FILE_SIZE    OCFS2_MAX_INLINE_SIZE + 1
OCFS2_REFCOUNT_COW_HUN         1M
OCFS2_MAX_REFCOUNT_NUM         2^32
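
The two derived limits in the table depend on the block size, so a small worked example may help. The sketch below only restates the arithmetic from the table; the function names and the 200-byte constant name are hypothetical, and the block sizes are the ones used in the Testing Status entries.

```python
# Illustrative only: names are hypothetical, values come from the limits table.
INLINE_OVERHEAD = 200  # bytes subtracted from the block size in the table


def max_inline_size(blocksize: int) -> int:
    """OCFS2_MAX_INLINE_SIZE = BLOCKSIZE - 200."""
    return blocksize - INLINE_OVERHEAD


def min_reflink_file_size(blocksize: int) -> int:
    """OCFS2_MIN_REFLINK_FILE_SIZE = OCFS2_MAX_INLINE_SIZE + 1."""
    return max_inline_size(blocksize) + 1


for bs in (512, 1024, 4096):
    print(bs, max_inline_size(bs), min_reflink_file_size(bs))
```

A test file must therefore be at least min_reflink_file_size(blocksize) bytes long to fall outside the inline-data case for a given block size.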

Testcases

We concentrate on feature, stress and boundary tests for the kernel filesystem, on both single and multiple nodes, and then formalize them into a testing tool package. Combination tests with EA (extended attributes) also deserve attention.

1. Single-node Tests

All testcases are scheduled in reflink_test_run.sh. Most testing scenarios are provided by reflink_test.c, while some other cases are built in shell.

1. Basic functional test. It includes:

    1) Add a refcount tree to inodes
    2) Set an inode's refcount tree
    3) Remove an inode's refcount tree
    4) Increment and decrement refcount
    5) Cause CoW

    The number of reflinks and the file size are tunable here to specify the workload. Use write(), ftruncate(), unlink(), append etc. to cause CoW in reflinked files. We also want to make sure that mmap() works correctly with reflinks.

    How to verify: the original and the reflinked files are compared to confirm the sanity of the reflink operation. In addition, the original (unchanged) file is checked again after CoW has happened on the reflinks, to confirm that its content is still intact. We also borrow the verification method from fill_verify_holes.c to verify random writes against reflinked inodes.
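
The comparison step can be sketched as a chunked byte-for-byte check that reports where two files first differ. This is a minimal sketch and not the code in reflink_test.c; the function name and chunk size are hypothetical, and creating the reflink itself is left out.

```python
# Illustrative sketch of the "compare original and reflink" verification step.
CHUNK = 1024 * 1024  # read size per comparison step (arbitrary choice)


def first_mismatch(path_a, path_b, chunk=CHUNK):
    """Return the byte offset of the first difference, or None if identical."""
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk)
            b = fb.read(chunk)
            if a != b:
                # Locate the exact differing byte inside this chunk.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # Common prefix matches, so one file is simply shorter.
                return offset + min(len(a), len(b))
            if not a:
                return None  # both files ended without a difference
            offset += len(a)
```

Right after a reflink is created, first_mismatch(original, reflink) should return None. After a CoW write to the reflink, it should return the offset of that write, while comparing the original against a saved pristine copy should still return None.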

2. Random tests:

    Here we randomize almost all of the factors during the tests to find bugs, such as the original extent size and count, and the write size and offset used to cause CoW. The test also performs random reads among the reflinks.
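
One way to keep randomized runs reproducible is to derive every random offset and size from a recorded seed, so a failing run can be replayed. The sketch below assumes writes stay inside the current file size; the function name and parameters are hypothetical and not taken from the test tool.

```python
import random


def random_write_plan(file_size, num_writes, max_write_size, seed):
    """Return a list of (offset, size) writes that all fall inside the file."""
    if file_size <= 0 or max_write_size <= 0:
        raise ValueError("file_size and max_write_size must be positive")
    rng = random.Random(seed)  # private generator, so the plan depends only on seed
    plan = []
    for _ in range(num_writes):
        offset = rng.randrange(file_size)
        # Clamp the size so the write never runs past the end of the file.
        size = rng.randint(1, min(max_write_size, file_size - offset))
        plan.append((offset, size))
    return plan
```

Logging the seed together with the plan parameters is enough to regenerate the same sequence of CoW-triggering writes when a bug needs to be reproduced.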

3. Concurrent tests. A fixed number of processes concurrently manipulate the reflinked inodes:

   1) 1/4 of the child processes create reflinks to increment the refcount.
   2) 1/4 of the child processes do CoW to decrement the refcount.
   3) 1/4 of the child processes do truncate and append.
   4) 1/4 of the child processes verify the original file to check that its content is still correct.
   5) The parent process does unlinks and verifies the original file too.
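
The quarter-based split of child processes can be sketched as a simple role assignment by child index. The role names and the requirement that the child count be a multiple of four are assumptions made for this illustration only.

```python
# Hypothetical role names for the four groups of child processes.
ROLES = ("reflink", "cow", "truncate_append", "verify")


def assign_roles(num_children):
    """Give each child index one of the four roles, a quarter of the children each."""
    if num_children <= 0 or num_children % len(ROLES) != 0:
        raise ValueError("num_children must be a positive multiple of 4")
    per_role = num_children // len(ROLES)
    return [ROLES[i // per_role] for i in range(num_children)]
```

With 8 children, indexes 0-1 create reflinks, 2-3 cause CoW, 4-5 truncate and append, and 6-7 verify the original file, while the parent process handles the unlinks.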

4. Boundary test:

  1) Files with size 0 (for xattrs, dx, data) should not have a refcount tree. When they are truncated, the code will remove the refcount tree. 
  2) Files with inline data and no external xattr tree should also have no refcount tree. 
  3) Extents less than 1MB should be CoWd in their entirety.  
  4) Extents larger than 1MB should be CoWd in 1MB hunks.  So if you have a 1GB extent and write a byte, the surrounding 1MB should be CoWd.
      You'll want to test this in the first 1MB of the 1GB extent, the last 1MB of the extent, and somewhere in the middle of the extent.
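
The expected CoW range for rules 3) and 4) can be computed ahead of time and compared with what the filesystem actually did. The sketch below assumes that hunks are aligned to 1MB boundaries counted from the start of the extent; the real kernel alignment may differ, and the function name is hypothetical.

```python
HUNK = 1024 * 1024  # 1MB CoW hunk, from the limits table


def cow_range(extent_start, extent_len, write_offset, write_len=1, hunk=HUNK):
    """Return the (start, end) byte range expected to be CoWed; end is exclusive."""
    extent_end = extent_start + extent_len
    if write_len < 1 or write_offset < extent_start or write_offset + write_len > extent_end:
        raise ValueError("write must lie inside the extent")
    if extent_len <= hunk:
        # Small extents are CoWed in their entirety.
        return extent_start, extent_end
    # ASSUMPTION: hunk boundaries are counted from the start of the extent.
    rel_start = (write_offset - extent_start) // hunk * hunk
    rel_write_end = write_offset + write_len - extent_start
    rel_end = (rel_write_end + hunk - 1) // hunk * hunk
    return extent_start + rel_start, min(extent_start + rel_end, extent_end)
```

For a 1GB extent, calling cow_range with a one-byte write at offset 0, at the last byte, and at the midpoint covers the three positions named above; under the stated alignment assumption each call returns a range exactly 1MB long.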

5. Combination tests with xattr.

6. Combination tests with fill_verify_holes tests incorporated.

7. Stress tests:

    1) An enormous number of refcount trees in the fs, meaning lots of reflink pairs.
    2) Lots of inodes sharing one refcount tree.
    3) Reflinks on a HUGE file (like in OracleVM), with writes/reads performed to cause CoW from the original and the reflinks.
    4) Reflinks on inodes with HUGE (size) extents.
    5) Reflinks on inodes with MANY (number) extents, in 1MB hunks or other sizes.

8. Bash & tools utility tests: use dd and the reflink command to test reflinks with a combination of extent sizes.

9. Combination test with OracleVM:

  Here I personally think we just need to simulate some kind of HUGE file (like an image file on OracleVM), then schedule tests on it. Is there anything specific that we need to be concerned about?

10. Destructive test: fill up the volume with enormous numbers of shared reflink inodes and refcount trees.

 

2. Multi-node Tests

NOTE: To launch the multi-node tests, openmpi must be installed first, and passwordless ssh/rsh access must be configured in advance.

1. Test lock contention for a shared refcount tree among nodes:

    For example, one node decrements the refcount while another increases it. Each node keeps a series of shared inodes, then does CoW/read concurrently with the other nodes.

2. Destructive & Recovery Test:

    The rule should be that a reflinked inode does not appear in the filesystem until it is completely constructed. So we need a couple of node-kill tests. What I'd really like is two tests, but I don't think we can do them only from userspace:

   1) Kill a node while reflink is busy refcounting the source inode. When the other nodes recover, the file should still be valid and in some state of refcounting.

   2) Kill a node while reflink is building the target inode in the orphan dir. The other nodes should remove it from the orphan dir, as if it never existed.


3. Storage-assisted snapshot test

This is currently on the TO-DO list.

Testing Status

Kernel/Patches:  Joel's git tree (git://oss.oracle.com/git/jlbec/linux-2.6.git cacheme)
                 last commitid: 196a0744f75e0b804aa523acda248c7e4a04f090: Pass ocfs2_caching_info into ocfs_init_*_extent_tree().
                 reflink.patch.3.12
Arch/Nodes:      x86_64/2 Nodes
Ocfs2 Options:   /dev/sdb1, Blocksize=512/1k/4k, Clustersize=4k/32k/1M, Slots=4, Nodes=2
Testing Report:  Basic refcount tests:
                 Found Bug 1084
Ocfs2-Tools:     Latest Src tree from refcount branch
Testing Tool:    reflink_tests_0312.tgz
Date:            3/12/2009
Coverage:

Kernel/Patches:  Joel's git tree (git://oss.oracle.com/git/jlbec/linux-2.6.git cacheme)
                 last commitid: 196a0744f75e0b804aa523acda248c7e4a04f090: Pass ocfs2_caching_info into ocfs_init_*_extent_tree().
                 reflink.patch.3.12
Arch/Nodes:      x86_64/2 Nodes
Ocfs2 Options:   /dev/sdb1, Blocksize=512/1k/4k, Clustersize=4k/32k/1M, Slots=4, Nodes=2
Testing Report:  Basic refcount tests:
                 Found Bug 1085
Ocfs2-Tools:     Latest Src tree from refcount branch
Testing Tool:    reflink_tests_0313.tgz
Date:            3/13/2009
Coverage:

Kernel/Patches:  Joel's git tree (git://oss.oracle.com/git/jlbec/linux-2.6.git cacheme)
                 last commitid: 196a0744f75e0b804aa523acda248c7e4a04f090: Pass ocfs2_caching_info into ocfs_init_*_extent_tree().
                 reflink.patch.3.12
Arch/Nodes:      x86_64/2 Nodes
Ocfs2 Options:   /dev/sdb1, Blocksize=512/1k/4k, Clustersize=4k/32k/1M, Slots=4, Nodes=2
Testing Report:  Multi-nodes refcount tests:
                 Found Bug 1086
Ocfs2-Tools:     Latest Src tree from refcount branch
Testing Tool:    reflink_tests_0316.tgz
Date:            3/16/2009
Coverage:

Kernel/Patches:  Joel's git tree (git://oss.oracle.com/git/jlbec/linux-2.6.git cacheme)
                 last commitid: 196a0744f75e0b804aa523acda248c7e4a04f090: Pass ocfs2_caching_info into ocfs_init_*_extent_tree().
                 reflink.patch.3.12
Arch/Nodes:      x86_64/2 Nodes
Ocfs2 Options:   /dev/sdb1, Blocksize=512/1k/4k, Clustersize=4k/32k/1M, Slots=4, Nodes=2
Testing Report:  Basic xattr refcount tests:
                 Found Bug 1087
Ocfs2-Tools:     Latest Src tree from refcount branch
Testing Tool:    reflink_tests_0316.tgz
Date:            3/17/2009
Coverage:

Kernel/Patches:  Joel's git tree (git://oss.oracle.com/git/jlbec/linux-2.6.git cacheme)
                 last commitid: 196a0744f75e0b804aa523acda248c7e4a04f090: Pass ocfs2_caching_info into ocfs_init_*_extent_tree().
                 reflink.patch.3.12
Arch/Nodes:      x86_64/2 Nodes
Ocfs2 Options:   /dev/sdb1, Blocksize=512/1k/4k, Clustersize=4k/32k/1M, Slots=4, Nodes=2
Testing Report:  Verify Bugs: 1084,1085,1086,1087
                 Single-node Testcases Development: A V1 single-node testing tool completed
                 Single-node Testing: 1088 1089 1091
Ocfs2-Tools:     Latest Src tree from refcount branch
Testing Tool:    reflink_tests_0318.tgz
Date:            3/18/2009
Coverage:

Kernel/Patches:  Joel's git tree (git://oss.oracle.com/git/jlbec/linux-2.6.git cacheme)
                 last commitid: 196a0744f75e0b804aa523acda248c7e4a04f090: Pass ocfs2_caching_info into ocfs_init_*_extent_tree().
                 reflink.patch.3.12
Arch/Nodes:      x86_64/2 Nodes
Ocfs2 Options:   /dev/sdb1, Blocksize=512/1k/4k, Clustersize=4k/32k/1M, Slots=4, Nodes=2
Testing Report:  Multi-nodes Testcases Development: A V1 single-node testing tool completed
                 Single-node Testing: 1092
Ocfs2-Tools:     Latest Src tree from refcount branch
Testing Tool:    reflink_tests_0319.tgz
Date:            3/19/2009
Coverage:

Kernel/Patches:  Joel's git tree (git://oss.oracle.com/git/jlbec/linux-2.6.git cacheme)
                 last commitid: 196a0744f75e0b804aa523acda248c7e4a04f090: Pass ocfs2_caching_info into ocfs_init_*_extent_tree().
                 reflink.patch.3.27
Arch/Nodes:      x86_64/2 Nodes
Ocfs2 Options:   /dev/sdf1, Blocksize=512/1k/4k, Clustersize=4k/32k/1M, Slots=4, Nodes=2
Testing Report:  Multi-nodes Testcases Development: A V2 testing tool completed
                 Single-node Testing: 1096
Ocfs2-Tools:     Latest Src tree from refcount branch
Testing Tool:    reflink_tests_0331.tgz
Date:            3/31/2009
Coverage:

Testing Tools

Name:              inline_data_test
Script / Program:  reflink_test.c
                   reflink_test_run.sh
Description:       This script and C program together coordinate the reflink tests on a single node, including feature, boundary and stress tests.
Technology:        Shell, C
#nodes:            1
Useful?:           Yes
What is tested:    Checks basic refcount functionality on ocfs2 on a single node, stresses the test by tuning the workload, and also emulates a race by performing concurrent operations on refcount between multiple nodes.

Script / Program:  multi_reflink_test.c
                   multi_reflink_test_run.sh
Description:       Spawns processes on multiple nodes that race on r/w operations against one refcount tree, to test whether refcount support on ocfs2 behaves well under concurrent events.
Technology:        C, Shell, Openmpi
#nodes:            2
Useful?:           Yes

2011-12-23 01:01