[Ocfs2-tools-devel] [PATCH 3/3] Fix superblock ECC

Sunil Mushran sunil.mushran at oracle.com
Mon Jun 20 11:20:49 PDT 2011


On 06/20/2011 10:32 AM, Goldwyn Rodrigues wrote:
> Fix the ECC of the superblock by re-calculating if incorrect and writing
> it out.
>
> Signed-off-by: Goldwyn Rodrigues <rgoldwyn at suse.de>
> ---
>   fsck.ocfs2/fsck.c                 |   27 +++++++++++++++++++++++++--
>   fsck.ocfs2/fsck.ocfs2.checks.8.in |    9 +++++++++
>   2 files changed, 34 insertions(+), 2 deletions(-)
>
> diff --git a/fsck.ocfs2/fsck.c b/fsck.ocfs2/fsck.c
> index ea072c6..6568e27 100644
> --- a/fsck.ocfs2/fsck.c
> +++ b/fsck.ocfs2/fsck.c
> @@ -79,6 +79,7 @@ static o2fsck_state _ost;
>   static int cluster_locked = 0;
>
>   static void mark_magical_clusters(o2fsck_state *ost);
> +static errcode_t write_out_superblock(o2fsck_state *ost);
>
>   static void handle_signal(int sig)
>   {
> @@ -231,19 +232,41 @@ errcode_t o2fsck_state_reinit(ocfs2_filesys *fs, o2fsck_state *ost)
>
>   static errcode_t check_superblock(o2fsck_state *ost)
>   {
> -	struct ocfs2_dinode *di = ost->ost_fs->fs_super;
> -	struct ocfs2_super_block *sb = OCFS2_RAW_SB(di);
> +	char *blk;
> +	struct ocfs2_dinode *di;
> +	ocfs2_filesys *fs = ost->ost_fs;
> +	struct ocfs2_super_block *sb = OCFS2_RAW_SB(fs->fs_super);
>   	errcode_t ret = 0;
>
> +	ret = ocfs2_malloc_block(fs->fs_io, &blk);
> +	if (ret)
> +		return ret;
> +	memcpy(blk, (char *)fs->fs_super, fs->fs_blocksize);
> +	di = (struct ocfs2_dinode *)blk;
> +
>   	if (sb->s_max_slots == 0) {
>   		printf("The superblock max_slots field is set to 0.\n");
>   		ret = OCFS2_ET_CORRUPT_SUPERBLOCK;
>   	}
>
> +	if (ocfs2_meta_ecc(OCFS2_RAW_SB(ost->ost_fs->fs_super))) {
> +		ret = ocfs2_block_check_validate(
> +				(char *)di, fs->fs_blocksize, &di->i_check);
> +		if ((ret) && prompt(ost, PN, PR_INVALID_SUPER_ECC,
> +					"Superblock has invalid ECC. Fix?")) {
> +			ret = write_out_superblock(ost);
> +			if (ret)
> +				com_err(whoami, ret,
> +					"while writing superblock\n");
> +		} else /* Just in case hamming fixed anything, copy back */
> +			memcpy(fs->fs_super, di, fs->fs_blocksize);
> +	}
> +
>   	ost->ost_fs_generation = di->i_fs_generation;
>
>   	/* XXX do we want checking for different revisions of ocfs2? */
>
> +	ocfs2_free(&blk);
>   	return ret;
>   }
>
> diff --git a/fsck.ocfs2/fsck.ocfs2.checks.8.in b/fsck.ocfs2/fsck.ocfs2.checks.8.in
> index e706ea5..e4f75f1 100644
> --- a/fsck.ocfs2/fsck.ocfs2.checks.8.in
> +++ b/fsck.ocfs2/fsck.ocfs2.checks.8.in
> @@ -1137,6 +1137,15 @@ index entry will cause lookups on this name to fail.
>
>   Answering yes will rebuild the directory index, restoring the missing entry.
>
> +.SS "INVALID_SUPER_ECC"
> +The superblock has an incorrect Error Correcting Code (ECC). ECC is capable
> +of correcting corruption of up to 1 bit per block. Corruption of more than
> +one bit cannot be corrected this way. In that case the filesystem reports
> +an error with the read operation.
> +
> +Answering yes will recalculate the ECC and write the superblock with the
> +calculated ECC.
> +
>   .SH "SEE ALSO"
>   .BR fsck.ocfs2(8)
>

Remember we have backup superblocks too. So we'll need to fix them
all. But before that, we should see if we can use one of the back-ups.

So read the superblock. If the ECC check fails, read all the backups.
Consider the backups usable only if all of them are 100% replicas of
each other. If even one is bad, ignore them all and assume the main
super is the best one available. Continue using that and, at the end,
refresh all the backups with the updated main super.

If all backups match, and their ECC is valid, then we still have to
consider a developer error due to which the backups were not updated.
So then we should compare the main super with a backup. If the only
differences are in fields we allow tunefs to update (numslots, volume
size, uuid, features, label), then we consider the main super to be the
best one available. Proceed with that and update the backups at the end.

If the main super has other differences, then use a backup.
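
A rough sketch of the selection logic described above, to make the
cases concrete. Everything here is hypothetical: struct sb_copy is a
simplified stand-in and not the on-disk ocfs2 superblock, the "other"
field represents every field tunefs does not touch, and ecc_ok is
assumed to be filled in by the caller after running the ECC check.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified superblock copy; NOT the real ocfs2 layout. */
struct sb_copy {
	/* Fields tunefs may legitimately change after backups were written. */
	unsigned int num_slots;
	unsigned long long volume_size;
	unsigned char uuid[16];
	unsigned int features;
	char label[64];
	/* Stand-in for every other superblock field. */
	unsigned int other;
	/* Result of the ECC check, assumed to be set by the caller. */
	bool ecc_ok;
};

enum sb_choice { SB_USE_MAIN, SB_USE_BACKUP };

/* True if two copies carry the same contents (ECC status aside). */
static bool sb_identical(const struct sb_copy *a, const struct sb_copy *b)
{
	return a->num_slots == b->num_slots &&
	       a->volume_size == b->volume_size &&
	       memcmp(a->uuid, b->uuid, sizeof(a->uuid)) == 0 &&
	       a->features == b->features &&
	       memcmp(a->label, b->label, sizeof(a->label)) == 0 &&
	       a->other == b->other;
}

/*
 * Decide which superblock to trust. Whenever SB_USE_MAIN is returned
 * after a failed ECC check, the caller is expected to refresh all the
 * backups from the main super at the end of the run.
 */
static enum sb_choice choose_super(const struct sb_copy *main_sb,
				   const struct sb_copy *backups, size_t n)
{
	size_t i;

	/* Main super passes its ECC check: nothing to decide. */
	if (main_sb->ecc_ok)
		return SB_USE_MAIN;

	/* No backups to consult. */
	if (n == 0)
		return SB_USE_MAIN;

	/* Backups are usable only if every one is valid and identical. */
	for (i = 0; i < n; i++) {
		if (!backups[i].ecc_ok || !sb_identical(&backups[0], &backups[i]))
			return SB_USE_MAIN;
	}

	/*
	 * Backups agree. Differences limited to the tunefs-updatable fields
	 * suggest stale backups, so keep the main super; any other
	 * difference means the main super is the suspect one.
	 */
	if (main_sb->other != backups[0].other)
		return SB_USE_BACKUP;

	return SB_USE_MAIN;
}
```

A caller would fill in one sb_copy for the main super and one per
backup it managed to read, then act on the returned choice.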

And can you call it SUPERBLOCK_ECC_INVALID?

We'll need to add INODE_ECC_INVALID, EXTENT_ECC_INVALID, etc. also.


