TUNEFS UPDATE FOR SPARSE FILES

Owner: TaoMa

OVERVIEW

The ocfs2 on-disk layout changes in 1.4 to support sparse files, so machines that have ocfs2-1.2.* installed cannot access volumes in the new format. Conversely, the feature can be turned on for a volume in the old format so that future cluster allocation can use it to save space.

Another new use for tunefs.ocfs2 is to list all the sparse files in a volume. This is useful before a user disables "sparse_files": it gives a quick way to see all such files, so the user may decide that some of them, such as core dumps, are not worth keeping and remove them to save space.

INTERFACE OVERVIEW

    tunefs.ocfs2  --fs-features=[no]sparse,... device

This option is not limited to sparse file support; it can also be used for future feature modifications.

If "nosparse" is used in the feature string, tunefs.ocfs2 fills all the holes in all the sparse files. When the work is done, the sparse_files flag is removed from the device, which should then be accessible to old machines with 1.2.* installed. Note also that, because unwritten extents are built on sparse file support, the unwritten extent flag is removed as well and all unwritten contents are zeroed.

If "sparse" is used in the feature string, tunefs.ocfs2 prepares the volume and adds the sparse_files flag to it. When the volume is then used with an ocfs2-1.4 or later kernel, sparse files are created from then on.
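To illustrate how a feature string of the form "[no]sparse,..." might be split into features to set and features to clear, here is a minimal sketch. It is not the tunefs.ocfs2 implementation: the bit values, the "unwritten" feature name, the struct, and the function names are all assumptions made for illustration. The only behavior taken from this page is that clearing sparse also clears unwritten extents.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical feature bits, for illustration only; these are not the
 * on-disk values used by ocfs2. */
#define FEAT_SPARSE    0x1u
#define FEAT_UNWRITTEN 0x2u

struct feature_request {
    unsigned set;    /* features to turn on  */
    unsigned clear;  /* features to turn off */
};

/* Map a feature name of the given length to its (hypothetical) bit. */
static unsigned feature_bit(const char *name, size_t len)
{
    if (len == 6 && strncmp(name, "sparse", 6) == 0)
        return FEAT_SPARSE;
    if (len == 9 && strncmp(name, "unwritten", 9) == 0)
        return FEAT_UNWRITTEN;
    return 0;
}

/* Parse a comma-separated list such as "sparse" or "nosparse".
 * Returns 0 on success, -1 on an unknown name or when the same
 * feature ends up both set and cleared. */
int parse_features(const char *str, struct feature_request *req)
{
    assert(str != NULL && req != NULL);
    req->set = req->clear = 0;

    while (*str) {
        size_t len = strcspn(str, ",");   /* length of this token */
        const char *name = str;
        size_t nlen = len;
        int negate = 0;
        unsigned bit;

        if (nlen > 2 && strncmp(name, "no", 2) == 0) {
            negate = 1;
            name += 2;
            nlen -= 2;
        }
        bit = feature_bit(name, nlen);
        if (bit == 0)
            return -1;
        if (negate)
            req->clear |= bit;
        else
            req->set |= bit;

        str += len;
        if (*str == ',')
            str++;
    }

    /* Unwritten extents depend on sparse files, so clearing sparse
     * implies clearing unwritten as well. */
    if (req->clear & FEAT_SPARSE)
        req->clear |= FEAT_UNWRITTEN;

    return (req->set & req->clear) ? -1 : 0;
}
```

With this sketch, "nosparse" yields a clear mask containing both bits, while "nosparse,unwritten" is rejected as contradictory.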

A long option, "--list-sparse", will also be added to tunefs.ocfs2 to list all the sparse files in a volume.

    tunefs.ocfs2  --list-sparse device

WORK FLOW

"--fs-features=nosparse"

  1. Check the volume flags to see whether sparse_files and unwritten extents are supported.
  2. Iterate over all the inodes, calculate the number of clusters needed, and record the holes in a list for later use (if unwritten extents are supported, record all the unwritten extents as well). New extent blocks may be needed, so they must also be counted in the total.
  3. Get the total number of free clusters from "//global_bitmap" to check whether the volume has enough space.
  4. Report the calculated result and ask the user whether to continue.
  5. If yes, zero the unwritten extents and mark them all as written, using the information stored in step 2.
  6. Fill the holes by inserting new clusters, using the information stored in step 2.
  7. Remove the "sparse_files" flag from the super block and update the backup ones.

"--fs-features=sparse"

  1. Check the flags of the volume to see whether this volume has sparse_files supported.
  2. Iterate over all inodes and zero the area between i_size and the end of the clusters counted by i_clusters. This is needed because, with sparse file support, extending a file no longer zeroes the new space.
  3. Set the "sparse_files" flag on the super block and update the backup ones.

"--list-sparse"

  1. Iterate over all the directories, starting from "/".
  2. For each sparse file, iterate over it and count its missing clusters.
  3. Iterate all the orphan directories and output all the holes in the orphan files.
  4. Output the total free counts for the volume.
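The per-file count in step 2 amounts to walking the file's extents in logical order and summing the gaps. A minimal sketch, assuming a simplified extent record (not the on-disk ocfs2 structure) and records that are sorted by cpos and do not overlap:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical extent record: a run of allocated clusters. */
struct extent {
    uint32_t cpos;      /* first logical cluster covered by the run */
    uint32_t clusters;  /* number of clusters in the run */
};

/* Count the logical clusters in [0, total_clusters) that no extent
 * covers. total_clusters is the number of clusters needed to cover
 * the file size. */
uint32_t missing_clusters(const struct extent *recs, size_t n,
                          uint32_t total_clusters)
{
    uint32_t next = 0;     /* first cluster not yet accounted for */
    uint32_t missing = 0;
    size_t i;

    for (i = 0; i < n && next < total_clusters; i++) {
        uint32_t start = recs[i].cpos;

        assert(start >= next);          /* sorted, non-overlapping */
        if (start > total_clusters)
            start = total_clusters;     /* ignore space past the end */
        missing += start - next;        /* gap before this extent */
        next = recs[i].cpos + recs[i].clusters;
    }
    if (next < total_clusters)
        missing += total_clusters - next;   /* trailing hole */
    return missing;
}
```

A file is sparse in this model exactly when missing_clusters returns a non-zero value.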

TECHNICAL OVERVIEW

Iteration of all the inodes

iterate_all_inode

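ret = ocfs2_open_inode_scan(fs, &scan);
for (;;) {
        ret = ocfs2_get_next_inode(scan, &blkno, buf);
        if (ret || blkno == 0)  /* error, or no more inodes */
                break;
        di = (struct ocfs2_dinode *)buf;
        ocfs2_swap_inode_to_cpu(di);
        iterate_inode(blkno);
}
ocfs2_close_inode_scan(scan);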

The implementation can be modeled on the function o2fsck_pass1.

Iteration of all the Directories

iterate_dir(uint64_t ino)

for each dir_entry in directory ino {
    if dir_entry is "." or ".."
        continue;
    if dir_entry is a directory
        iterate_dir(entry_ino);
    else if dir_entry is a file
        iterate_inode(entry_ino);
}

The implementation can be modeled on the function pass2_dir_block_iterate, which also iterates over a directory. The iteration over all the orphan directories can be modeled on replay_orphan_dir.

Iteration of a specific file

iterate_inode(uint64_t ino)

for (i = 0; i < next_free_rec; i++) {
     if rec points to an extent block
         iterate_extent_block;
     else if there is a hole before rec
         record the hole's start_pos and length;
}
calculate the new extent blocks if needed;
/* To calculate the worst-case scenario, we assume each
 * cluster is a separate extent record and figure out
 * how many extent blocks would be needed to fill the holes.
 */

This iteration can be modeled on the function duplicate_extent_block, which also iterates over all the extent blocks in an inode.
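The worst-case estimate described in the comment above might be computed along these lines. This is a rough sketch under stated assumptions, not the tool's actual calculation: every new cluster becomes its own extent record, every extent block (leaf or interior) holds recs_per_eb records, the inode's own extent list holds recs_in_inode records, and records already present in the tree are ignored. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Upper-bound estimate of the extent blocks needed to index
 * new_records extent records in a tree of uniform depth. */
uint64_t worst_case_extent_blocks(uint64_t new_records,
                                  unsigned recs_per_eb,
                                  unsigned recs_in_inode)
{
    uint64_t total = 0;
    uint64_t level = new_records;   /* entries to index at this level */

    assert(recs_per_eb > 1 && recs_in_inode > 0);

    /* Add one level of extent blocks at a time until the entries
     * fit in the inode's own extent list. */
    while (level > recs_in_inode) {
        level = (level + recs_per_eb - 1) / recs_per_eb;
        total += level;
    }
    return total;
}
```

For example, 1000 single-cluster records with 100 records per extent block and 10 records in the inode need 10 leaf blocks, which the inode list can reference directly.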

Insertion of clusters

This is simple if we store all the hole information during the iteration: call ocfs2_new_clusters, zero the contents of the new clusters, and use ocfs2_insert_extent to fill the holes.


2011-12-23 01:01