[Ocfs2-tools-devel] [PATCH 2/4] defrag.ocfs2: Pass 1: Defrag individual files and directories

Joel Becker Joel.Becker at oracle.com
Sun Jul 18 03:11:26 PDT 2010


On Tue, May 11, 2010 at 11:02:34PM -0500, Goldwyn Rodrigues wrote:
> Defragging directory -
> Allocate an extent, and copy dirents to the new extent, skipping
> holes and empty dirents. For each dirent, the dirent length
> is recalculated to optimize on space.

	On to defragging the directory.

> +static int copy_dirents(ocfs2_filesys *fs,
> +		struct ocfs2_extent_rec *rec,
> +		int tree_depth, uint32_t ccount, uint64_t ref_blkno,
> +		int ref_recno, void *private)
> +{

	To be honest, I'd much rather see you use ocfs2_link() than
hand-copy the dirents.  We already have audited code for editing these
things.  I have a proposed way to code it that I will outline below.

> +errcode_t defrag_dir(struct defrag_state *dst, struct ocfs2_dinode *di)
> +{
> +	struct defrag_dir_context dc;
> +	uint64_t tmpblkno;
> +	errcode_t ret;
> +	int offset = 0, bs = dst->dst_fs->fs_blocksize;
> +
> +	/* XXX: Ignore refcounted dir for now */
> +	if (di->i_dyn_features & (OCFS2_INLINE_DATA_FL|OCFS2_HAS_REFCOUNT_FL))
> +		return 0;

	Directories can't be refcounted.  A directory with
OCFS2_HAS_REFCOUNT_FL set would be a corrupt filesystem.  Just check for
OCFS2_INLINE_DATA_FL.

> +	/*Initialize dc */
> +	memset(&dc, 0, sizeof(struct defrag_dir_context));
> +	dc.dst = dst;
> +	dc.prev_offset = -1;
> +	dc.old_inode = di;
> +
> +	ret = ocfs2_malloc_block(dst->dst_fs->fs_io, &dc.w_buf);
> +	if (ret) {
> +		com_err(whoami, ret, "while allocating memory\n");
> +		goto out;
> +	}
> +	memset(dc.w_buf, 0, bs);
> +
> +	ret = ocfs2_new_inode(dst->dst_fs, &tmpblkno, di->i_mode);
> +	if (ret) {
> +		com_err(whoami, ret, "while creating inode\n");
> +		goto out;
> +	}
> +
> +	ret = ocfs2_read_cached_inode(dst->dst_fs, tmpblkno, &dc.new_inode);
> +	if (ret) {
> +		com_err(whoami, ret, "while reading cached inode\n");
> +		goto out;
> +	}
> +	/* XXX Hackish - reversing what ocfs2_init_inode did to the cached
> +	   inode */
> +	dc.new_inode->ci_inode->i_dyn_features &= ~OCFS2_INLINE_DATA_FL;
> +	ocfs2_dinode_new_extent_list(dst->dst_fs, dc.new_inode->ci_inode);

	This isn't actually hackish.  It's exactly what you should do.
However, you might want to write out the inode at this point, so other
functions can read it.
	Ok, here's how I think you should copy the dirent data.  I think
you should do a two-pass loop:

First pass:
	1) Walk the directory.
	  a) Save a list of all the (name, blockno) you find.
	  b) Save the total of all the space needed.  This you can get
	     by adding up DIR_REC_LEN(dirent->name_len) for each
	     directory entry.  Remember to add the space needed for
	     trailers if they are enabled.  When done, you'll be able to
	     calculate the number of dirblocks you minimally need.  Add
	     a few more dirblocks (10?) for slop if your directory is
	     large.
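
The block count from step (b) can be sketched roughly as follows.  This
is an illustration only: the SKETCH_ names and sketch_dirblocks_needed()
are hypothetical, the 12-byte header and 4-byte rounding are assumptions
about the dirent layout that DIR_REC_LEN() encodes, and the trailer size
is left as a parameter rather than a guessed constant.  It packs entries
in order and never lets one cross a block boundary, so the result can be
a little higher than total bytes divided by block size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed dirent layout: 12-byte fixed header, then the name, with the
 * record length rounded up to a multiple of 4 bytes. */
#define SKETCH_DIR_MEMBER_LEN	12u
#define SKETCH_DIR_PAD		4u
#define SKETCH_DIR_REC_LEN(name_len) \
	(((name_len) + SKETCH_DIR_MEMBER_LEN + SKETCH_DIR_PAD - 1u) & \
	 ~(SKETCH_DIR_PAD - 1u))

/*
 * Return the number of dirblocks needed to hold `count` entries whose
 * name lengths are in name_lens[], plus `slop` spare blocks.
 * trailer_len is the per-block space reserved for a trailer (0 if
 * trailers are disabled).  Returns 0 if there are no entries or if an
 * entry cannot fit in a single block.
 */
static uint64_t sketch_dirblocks_needed(const uint8_t *name_lens,
					size_t count, uint32_t blocksize,
					uint32_t trailer_len, uint64_t slop)
{
	uint32_t usable, used = 0;
	uint64_t blocks = 0;
	size_t i;

	assert(name_lens != NULL || count == 0);
	if (trailer_len >= blocksize)
		return 0;
	usable = blocksize - trailer_len;

	for (i = 0; i < count; i++) {
		uint32_t rec = SKETCH_DIR_REC_LEN((uint32_t)name_lens[i]);

		if (rec > usable)
			return 0;
		/* Start a new block when this record would not fit. */
		if (blocks == 0 || used + rec > usable) {
			blocks++;
			used = 0;
		}
		used += rec;
	}
	return blocks ? blocks + slop : 0;
}
```

The caller would fill name_lens[] while walking the directory in the
first pass and pass a nonzero slop only for large directories.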

Now allocate your new defrag dir and grow it to have enough dirblocks.
You can then call ocfs2_init_dir().

Second pass:
	1) For each name in the list you saved off
	  a) link that name into the new directory with ocfs2_link()
	  b) remove it from the old directory with ocfs2_unlink()
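
The shape of that second pass might look like the sketch below.  To keep
it self-contained, the link and unlink steps are passed in as callbacks
with a made-up signature (sketch_dir_op); in defrag.ocfs2 they would be
thin wrappers around ocfs2_link() and ocfs2_unlink().  All sketch_ names
are hypothetical.  Skipping "." and ".." is my assumption, on the theory
that the freshly initialized directory already has them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One (name, blockno) pair saved during the first pass. */
struct sketch_saved_dirent {
	const char *name;
	uint64_t blkno;
};

/* Hypothetical callback shape standing in for the library's link and
 * unlink calls; this is NOT the libocfs2 signature.  0 means success. */
typedef int (*sketch_dir_op)(void *ctx, uint64_t dir, const char *name,
			     uint64_t blkno);

/* Move every saved entry from old_dir to new_dir.  Each name is linked
 * into the new directory before it is unlinked from the old one, so it
 * is always present in at least one of them.  Stops at the first error
 * and reports how many entries were fully moved. */
static int sketch_move_dirents(void *ctx, uint64_t old_dir, uint64_t new_dir,
			       const struct sketch_saved_dirent *ents,
			       size_t count, sketch_dir_op link_fn,
			       sketch_dir_op unlink_fn, size_t *moved)
{
	size_t i;
	int ret = 0;

	assert(moved != NULL);
	*moved = 0;
	for (i = 0; i < count; i++) {
		/* Assumption: "." and ".." already exist in new_dir. */
		if (!strcmp(ents[i].name, ".") || !strcmp(ents[i].name, ".."))
			continue;
		ret = link_fn(ctx, new_dir, ents[i].name, ents[i].blkno);
		if (ret)
			break;
		ret = unlink_fn(ctx, old_dir, ents[i].name, ents[i].blkno);
		if (ret)
			break;
		(*moved)++;
	}
	return ret;
}

/* Stub callbacks that only count calls, for trying out the loop. */
struct sketch_counts {
	size_t links;
	size_t unlinks;
};

static int sketch_stub_link(void *ctx, uint64_t dir, const char *name,
			    uint64_t blkno)
{
	(void)dir; (void)name; (void)blkno;
	((struct sketch_counts *)ctx)->links++;
	return 0;
}

static int sketch_stub_unlink(void *ctx, uint64_t dir, const char *name,
			      uint64_t blkno)
{
	(void)dir; (void)name; (void)blkno;
	((struct sketch_counts *)ctx)->unlinks++;
	return 0;
}
```

On error the caller knows from *moved how far the move got and can
decide whether to retry or back out.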

Now all the entries are in the new directory, and you never had to code
the dir copy by hand.  The old directory is now empty, so you can
truncate it.  Move the extents back from the new directory, and fix up
the '.' record so it points at the original inode again.
	"But Joel," you ask, "Won't it be really slow if every
ocfs2_link() and ocfs2_unlink() call writes the changes to disk?"  Yes,
it would, except that I think defrag.ocfs2 should run with
OCFS2_FLAG_BUFFERED.  There's no reason to run in O_DIRECT.  Let the
page cache handle your performance.
	File defrag review tomorrow or so.

Joel

-- 

"Ninety feet between bases is perhaps as close as man has ever come
 to perfection."
	- Red Smith

Joel Becker
Consulting Software Developer
Oracle
E-mail: joel.becker at oracle.com
Phone: (650) 506-8127


