BACKUP SUPERBLOCK
Owner: TaoMa (Completed as of r1284)
Introduction
The idea behind this feature is to store a copy of the superblock in a known location so that it can be used if someone accidentally overwrites the real superblock. This is important because the superblock stores a few pieces of static or "mostly static" information that can be critical in recovering data. Examples include the block size, cluster size, sysdir location, rootdir location, file system generation and number of slots.
Before implementing this feature, we need to decide on the "known" location or locations for the backup superblock. The location needs to satisfy the following conditions:
- It must be a fixed offset from the start of the disk.
- It must not depend on the size of the disk, though we could specify a minimum disk size for this feature (very small file systems don't have to have backup superblocks).
- Its location should preferably not clash with a journal file, so that tunefs.ocfs2 can add this feature retroactively to existing volumes. Assuming 16 slots with a 60MB journal each gives us around 1G of journal space, meaning we should look for holes after 1G.
- The super block locations should be fixed and easily computable given a small set of values.
- We want the user to have to remember as few items of information as possible. For example, we could just have fixed locations for each supported block size (this is how ext3 handles it).
- Its location must not clash with the locations of main bitmap group descriptors.
File systems which have a set of backup super blocks should be marked with a compat flag. This way the mounted file system will know whether to write out new versions of the backup super block at the end of an online resize.
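As a rough sketch of how a tool might test and set this flag: the struct `sb_sketch`, its `feature_compat` field and both helper names are hypothetical stand-ins for illustration, not the on-disk ocfs2 layout. The flag value 0x0001 is the one proposed in the prototypes at the end of this document.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag value as proposed in this document. */
#define OCFS2_FEATURE_COMPAT_BACKUP_SB 0x0001

/* Hypothetical, simplified stand-in for the superblock (illustration only). */
struct sb_sketch {
    uint32_t feature_compat;
};

/* True if the volume claims to carry backup superblocks. */
static bool sb_has_backup_super(const struct sb_sketch *sb)
{
    return (sb->feature_compat & OCFS2_FEATURE_COMPAT_BACKUP_SB) != 0;
}

/* Mark the volume as carrying backup superblocks. */
static void sb_set_backup_super(struct sb_sketch *sb)
{
    sb->feature_compat |= OCFS2_FEATURE_COMPAT_BACKUP_SB;
}
```

An online resize would call something like `sb_has_backup_super()` before deciding whether to rewrite the backup copies.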
Backup Superblock Location
The backup superblock's location should not clash with a group descriptor, as that is one of the two file system metadata blocks that have a fixed location based upon the cluster/block sizes. The other fixed block is the superblock.
After computing the group descriptor offsets for all block/cluster size combinations, we notice that there is no clash at the 1G, 4G, 16G, 64G, 256G and 1T byte offsets. Listed below is a table that shows the closest group descriptor offsets.

Byte Offset      Cluster  Block
1071644672       c=4K     b=512
1073741824       1G
1086324736       c=4K     b=512
...
4290772992       c=8K     b=2K
4294967296       4G
4301258752       c=4K     b=512
...
17175674880      c=8K     b=512
17179869184      16G
17190354944      c=4K     b=512
...
68717379584      c=4K     b=512
68719476736      64G
68732059648      c=4K     b=512
...
274873712640     c=8K     b=1K
274877906944     256G
274884198400     c=4K     b=512
...
1099507433472    c=8K     b=512
1099511627776    1T
1099522113536    c=4K     b=512
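The table above can be reproduced with a short check. This is only a sketch under stated assumptions: it assumes every cluster group begins with its descriptor block and covers (blocksize - 64) * 8 clusters, that is, one bitmap bit per cluster after a 64-byte descriptor header. The 64-byte header is an assumption, chosen because it reproduces the offsets listed above. The supported ranges (block sizes 512 to 4K, cluster sizes 4K to 1M) are likewise assumed, and all function names are illustrative.

```c
#include <stdint.h>

#define GD_HEADER_BYTES 64u /* assumed group descriptor header size */

/* Bytes covered by one cluster group (assumption: one bit per cluster). */
static uint64_t group_bytes(uint32_t blocksize, uint32_t clustersize)
{
    return (uint64_t)(blocksize - GD_HEADER_BYTES) * 8u * clustersize;
}

/* Byte offset of the closest group descriptor at or below `offset`. */
static uint64_t gd_at_or_below(uint64_t offset, uint32_t blocksize,
                               uint32_t clustersize)
{
    uint64_t g = group_bytes(blocksize, clustersize);

    return (offset / g) * g;
}

/*
 * Count the block/cluster size combinations for which a candidate backup
 * offset (1G, 4G, ..., 1T) falls inside a group descriptor block.
 */
static int count_clashes(void)
{
    int clashes = 0;

    for (int i = 0; i < 6; i++) {
        uint64_t backup = (uint64_t)1 << (30 + 2 * i);

        for (uint32_t b = 512; b <= 4096; b <<= 1)
            for (uint32_t c = 4096; c <= (1u << 20); c <<= 1)
                if (backup - gd_at_or_below(backup, b, c) < b)
                    clashes++;
    }
    return clashes;
}
```

For example, `gd_at_or_below(1073741824, 512, 4096)` gives 1071644672, the first row of the table, and adding `group_bytes(512, 4096)` gives 1086324736, the row after 1G.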
These locations have several benefits:
- mkfs.ocfs2 can find them easily.
- They will never occupy a group descriptor block.
- They are easy to search for later since they are hard-coded, and it doesn't take much time to go through the whole device.
- Since the locations are unrelated to the number of slots, we don't need to recalculate them when the slot count is changed with tunefs.ocfs2.
- There are multiple copies of the backup superblock, so we can verify them easily.
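The search over the hard-coded locations can be sketched roughly as follows. Everything here is illustrative: `read_fn`, `find_backup_super`, `toy_dev` and `toy_read` are hypothetical names, and the sketch assumes that a superblock block starts with the signature "OCFSV2". A real tool would validate much more than the signature before trusting a copy.

```c
#include <stdint.h>
#include <string.h>

#define BACKUP_SB_START ((uint64_t)1 << 30) /* 1G */
#define MAX_BACKUP_SB   6
#define SB_SIGNATURE    "OCFSV2"            /* assumed superblock signature */

/* Hypothetical reader: copy `len` bytes at byte offset `off`; 0 on success. */
typedef int (*read_fn)(void *dev, uint64_t off, void *buf, size_t len);

/*
 * Probe 1G, 4G, ..., 1T and return the index (0..5) of the first location
 * that carries the signature, or -1 if none does.
 */
static int find_backup_super(void *dev, read_fn rd, uint64_t dev_size)
{
    char sig[sizeof(SB_SIGNATURE) - 1];

    for (int i = 0; i < MAX_BACKUP_SB; i++) {
        uint64_t off = BACKUP_SB_START << (2 * i);

        if (off + sizeof(sig) > dev_size)
            break;                      /* device ends before this copy */
        if (rd(dev, off, sig, sizeof(sig)) != 0)
            continue;                   /* unreadable, try the next one */
        if (memcmp(sig, SB_SIGNATURE, sizeof(sig)) == 0)
            return i;
    }
    return -1;
}

/* Toy device for demonstration: only `good_off` holds a signature. */
struct toy_dev {
    uint64_t good_off;
};

static int toy_read(void *dev, uint64_t off, void *buf, size_t len)
{
    const struct toy_dev *d = dev;

    memset(buf, 0, len);
    if (off == d->good_off)
        memcpy(buf, SB_SIGNATURE, len < 6 ? len : 6);
    return 0;
}
```

At most six small reads are needed regardless of device size, which is why the scan stays cheap.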
TODO List
Here is the list of tools that need modifications.
- mkfs.ocfs2: This tool will write out copies of the backup superblocks.
- The backup superblock locations will be reserved first and written once the superblock information is ready.
- The compat flag indicating the existence of backup superblocks will be set on the superblock.
- A '--no-backup-super' option will be added that skips allocating the backup superblocks in the volume.
Note: It would be better to add the backup superblocks after the initial superblock has been flushed to disk. That way, the code to stamp the backup superblocks can be added to libocfs2 and then shared with tunefs. For more, refer to the mkfs code that adds the journals. The true superblock with the appropriate compat flag will be flushed only after all the backups have been flushed at their locations.
- tunefs.ocfs2: This tool will allow users to retroactively add this feature on existing volumes.
- When superblock information is changed, the backup superblocks need to be changed correspondingly.
- Add a "--backup-super" option. It will add backup superblocks to an old ocfs2 volume that doesn't have them. If the required clusters are in use, it will fail. We can then use debugfs.ocfs2's "icheck block#" to identify the owner and ask the user to delete the files (after backing them up) before attempting tunefs.ocfs2 again.
- tunefs also needs to be able to add backup superblocks during resize.
- fsck.ocfs2: This tool will allow users to point to a backup superblock to perform recovery.
- Add a '-r' option for the user to recover the superblock. The user can give a specific offset or just let fsck search the 'known locations' in the volume and find the right backup superblock.
- As the backup blocks also reside in the global_bitmap and don't belong to any inode, fsck.ocfs2 must acknowledge their existence and take them into consideration during bitmap checking, to avoid false errors and unnecessary bitmap recovery.
- debugfs.ocfs2: This tool will allow users to point to a backup superblock and show the information stored in it.
- Add a '-s' option for the user to open a file system using a backup superblock. The -s option should accept 1, 2, 3, 4, 5 or 6 as its argument: 1 means 1G, 2 means 4G, 3 means 16G, and so on.
- The "open" command needs to be enhanced to support the same function.
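The mapping from the '-s' argument to a byte offset is small enough to sketch directly. The function name `backup_arg_to_offset` is hypothetical, and returning 0 for an out-of-range argument is an assumption about how the tool might signal an error.

```c
#include <stdint.h>

#define BACKUP_SB_START ((uint64_t)1 << 30) /* 1G */
#define MAX_BACKUP_SB   6

/*
 * Map a '-s' argument (1..6) to the byte offset of that backup
 * superblock: 1 -> 1G, 2 -> 4G, 3 -> 16G, ... 6 -> 1T.
 * Returns 0 for an out-of-range argument.
 */
static uint64_t backup_arg_to_offset(int arg)
{
    if (arg < 1 || arg > MAX_BACKUP_SB)
        return 0;
    return BACKUP_SB_START << (2 * (arg - 1));
}
```

Each step multiplies the offset by four, which is why the shift grows by two bits per argument.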
Some consts and function prototypes
/*
 * The backup superblock flag indicates that this volume
 * has backup superblocks.
 */
#define OCFS2_FEATURE_COMPAT_BACKUP_SB  0x0001

/*
 * The byte offset of the first backup block is 1G.
 * The following ones are at 4G, 16G, 64G, 256G and 1T.
 * Parenthesized so the macro expands safely inside larger expressions.
 */
#define OCFS2_BACKUP_SB_START           (1 << 30)

/* The maximum number of backup superblocks. */
#define OCFS2_MAX_BACKUP_SUPERBLOCKS    6

/* Return the block number of backup superblock `index`, or 0 if out of range. */
static inline uint64_t ocfs2_backup_super_blkno(int blocksize, int index)
{
    uint64_t offset = OCFS2_BACKUP_SB_START;

    if (index >= 0 && index < OCFS2_MAX_BACKUP_SUPERBLOCKS) {
        offset <<= (2 * index);
        offset /= blocksize;
        return offset;
    }

    return 0;
}

/* Write the superblock at the specified block. */
errcode_t ocfs2_write_backup_super(ocfs2_filesys *fs, uint64_t blkno);

/*
 * Get the block numbers according to the file system info.
 * The unused ones, depending on the volume size, are zeroed.
 * Return the length of the block array.
 */
int ocfs2_get_backup_super_offset(ocfs2_filesys *fs,
                                  uint64_t *blocks, size_t len);

/*
 * This function will get the superblock pointed to by fs and copy it to
 * the blocks. But first it will ensure all the appropriate clusters are free.
 * If not, it will error out with ENOSPC. If free, it will set bits for all
 * the clusters, zero the clusters and write the backup sb.
 * In case of updating, it will overwrite the backup blocks with the newest
 * superblock information.
 */
errcode_t ocfs2_set_backup_super(ocfs2_filesys *fs,
                                 uint64_t *blocks, size_t len);

/* Refresh the backup superblock information. */
errcode_t ocfs2_refresh_backup_super(ocfs2_filesys *fs,
                                     uint64_t *blocks, size_t len);

errcode_t ocfs2_read_backup_super(ocfs2_filesys *fs, int backup, char *sbbuf);
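To make the intended flow concrete, here is a minimal sketch of computing the backup block numbers for a volume and then reserving their clusters. It reuses ocfs2_backup_super_blkno() from the prototypes above. The helpers `backup_blocks_for_volume` and `reserve_backup_clusters` are hypothetical stand-ins for ocfs2_get_backup_super_offset() and the allocation half of ocfs2_set_backup_super(): they take plain sizes and a flat in-memory bitmap (one bit per cluster) instead of an ocfs2_filesys, which is a simplification and not how libocfs2 stores its bitmaps.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define OCFS2_BACKUP_SB_START        (1 << 30)
#define OCFS2_MAX_BACKUP_SUPERBLOCKS 6

/* Same logic as the inline helper listed in this document. */
static inline uint64_t ocfs2_backup_super_blkno(int blocksize, int index)
{
    uint64_t offset = OCFS2_BACKUP_SB_START;

    if (index >= 0 && index < OCFS2_MAX_BACKUP_SUPERBLOCKS) {
        offset <<= (2 * index);
        return offset / blocksize;
    }
    return 0;
}

/*
 * Hypothetical helper: fill `blocks` with the backup block numbers that
 * fit in a volume of `total_blocks` blocks, zero the rest, and return
 * how many entries are in use.
 */
static int backup_blocks_for_volume(int blocksize, uint64_t total_blocks,
                                    uint64_t *blocks, size_t len)
{
    int n = 0;

    for (size_t i = 0; i < len; i++)
        blocks[i] = 0;
    for (int i = 0; i < OCFS2_MAX_BACKUP_SUPERBLOCKS && (size_t)i < len; i++) {
        uint64_t blkno = ocfs2_backup_super_blkno(blocksize, i);

        if (blkno >= total_blocks)
            break;                      /* volume too small for this copy */
        blocks[n++] = blkno;
    }
    return n;
}

/*
 * Hypothetical helper: reserve the clusters holding `blocks[0..len)`.
 * Returns -ENOSPC, leaving the bitmap untouched, if any of them is
 * already in use; otherwise sets their bits and returns 0.
 */
static int reserve_backup_clusters(uint8_t *bitmap, int blocksize,
                                   int clustersize,
                                   const uint64_t *blocks, size_t len)
{
    uint64_t bpc = (uint64_t)(clustersize / blocksize); /* blocks per cluster */

    for (size_t i = 0; i < len; i++) {
        uint64_t c = blocks[i] / bpc;

        if (bitmap[c / 8] & (1u << (c % 8)))
            return -ENOSPC;
    }
    for (size_t i = 0; i < len; i++) {
        uint64_t c = blocks[i] / bpc;

        bitmap[c / 8] |= (uint8_t)(1u << (c % 8));
    }
    return 0;
}
```

With 4K blocks, a 20G volume has 5242880 blocks, so only the 1G, 4G and 16G copies fit (block numbers 262144, 1048576 and 4194304). Checking every cluster before setting any bit keeps the failure case free of side effects, which matches the "fail if the clusters are in use" behavior described for tunefs.ocfs2.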