[Ocfs2-tools-devel] [patch 3/7] Adds slot remove mechanism in tunefs.ocfs2.

Joel Becker Joel.Becker at oracle.com
Tue Jun 12 17:59:59 PDT 2007


On Tue, Jun 12, 2007 at 02:44:02PM +0800, tao.ma at oracle.com wrote:
> +/*
> + * This function will iterate the chain_rec and do the following modifications:
> + * 1. modify  Sub Alloc Slot in extent block/inodes accordingly.
> + * 2. change the GROUP_PARENT according to its future owner.
> + * 3. link the chain to the new slot files.
> + */
> +static errcode_t move_chain_rec(ocfs2_filesys *fs, struct relink_ctxt *ctxt)

Tao,
	Looking at this function, there are a couple of things that
worry me.
	First, you walk an entire chain, changing all inodes or extent
blocks.  Then, you move each group.  Like so:

 1  for each group:
 2      for each set bit:
 3          modify suballoc slot
 4      push group onto stack
 5  while stack not empty:
 6      pop group from stack
 7      link group to new inode

The thing that worries me is that you change every inode or extent block
before you start moving the groups.  If you crash between lines 4 and 5,
every metadata block in the chain has an incorrect suballoc slot.
	Now, fsck should be able to fix this.  However, if you modified
the metadata blocks right before you move the group, you only have one
group incorrect at a time:

 1  for each group:
 2      push group onto stack
 3  while stack not empty:
 4      pop group from stack
 5      for each set bit:
 6         modify suballoc slot
 7      link group to new inode

If you crash between 2 & 3, nothing has changed.  If you crash between 6
& 7, only one group is incorrect.
	In this fashion, fsck would only have to fix one group of
metadata blocks, instead of N.  It's faster.

> +		/* Record the group in the relink_ctxt.
> +		 *
> +		 * We record the group in a reverse order, so the first group
> +		 * will be at the end of the group list. This is useful for
> +		 * fsck.ocfs2 when any error happens during the move of groups
> +		 * and we can safely move the group also.
> +		 */
> +		ret = ocfs2_malloc0(sizeof(struct moved_group), &group);
> +		if (ret)
> +			goto bail;
> +
> +		group->blkno = gd_blkno;
> +		group->next = ctxt->groups;
> +		ctxt->groups = group;
> +
> +		gd_blkno = gd->bg_next_group;

	Here's my second nitpick.  Why not save off the group block here? 

                ret = ocfs2_malloc_block(fs, &group->block);
                if (ret)
                          goto bail;
                memcpy(group->block, ctxt->gd_buf, fs->fs_blocksize);

Then you can avoid re-reading it in move_groups().  This should improve
the performance of chain relocation by 10-30%.

> +	}
> +
> +	/* move the chain to the new slots. */
> +	ret = move_groups(fs, ctxt);
> +
> +bail:
> +	group = ctxt->groups;
> +	while (group) {
> +		group_next = group->next;

                if (group->block)
                        ocfs2_free(&group->block);

> +		ocfs2_free(&group);
> +		group = group_next;
> +	}
> +	ctxt->groups = NULL;
> +	return ret;
> +}

Joel

-- 

"I'm drifting and drifting
 Just like a ship out on the sea.
 Cause I ain't got nobody, baby,
 In this world to care for me."

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker at oracle.com
Phone: (650) 506-8127



More information about the Ocfs2-tools-devel mailing list