[Ocfs2-tools-devel] [PATCH 07/12] dx_dirs v7: more library support for directory indexing

Coly Li coly.li at suse.de
Thu Feb 4 09:15:49 PST 2010



On 2010年02月03日 07:54, Mark Fasheh Wrote:
> On Sat, Jan 30, 2010 at 02:18:26AM +0800, Coly Li wrote:
>> This patch adds more library support for indexed dirs,
>> - dx_root alloc/delete
>> - dx_leaf alloc/delete
>> - dx_root read/write
>> - dx_leaf read/write
>> - indexed tree insert/truncate
>> - dx_root extent tree operations
>>
>> With this patch, indexed dirs support in fsck.ocfs2 is possible.
>>
>> Signed-off-by: Coly Li <coly.li at suse.de>
>> Cc: Mark Fasheh <mfasheh at suse.com>
>> ---
>>
>> diff --git a/include/ocfs2/ocfs2.h b/include/ocfs2/ocfs2.h
>> index b586bf2..1c2c69a 100644
>> --- a/include/ocfs2/ocfs2.h
>> +++ b/include/ocfs2/ocfs2.h
>> @@ -1382,5 +1396,12 @@ errcode_t ocfs2_extent_iterate_xattr(ocfs2_filesys *fs,
>>  				     void *priv_data,
>>  				     int *changed);
>>  errcode_t ocfs2_delete_xattr_block(ocfs2_filesys *fs, uint64_t blkno);
>> +errcode_t ocfs2_dir_indexed_tree_truncate(ocfs2_filesys *fs,
>> +					struct ocfs2_dx_root_block *dx_root);
>> +errcode_t ocfs2_write_dx_root(ocfs2_filesys *fs, uint64_t block, char *buf);
>> +int ocfs2_is_dir_trailer(ocfs2_filesys *fs, struct ocfs2_dinode *di, unsigned long de_off);
> 
> Duplicate definition, noted below.
> 

Fixed, thanks.

> 
>> diff --git a/libocfs2/alloc.c b/libocfs2/alloc.c
>> index 18c6e0e..4144539 100644
>> --- a/libocfs2/alloc.c
>> +++ b/libocfs2/alloc.c
>> @@ -492,6 +492,100 @@ out:
>>  	return ret;
>>  }
>>
>> +/* only initiate part of dx_root:
>> + *   dr_subllaoc_slot
>> + *   dr_sbualloc_bit
>> + *   dr_fs_generation
>> + *   dr_blkno
>> + */
>> +static void init_dx_root(ocfs2_filesys *fs,
>> +			struct ocfs2_dx_root_block *dx_root,
>> +			int slot, uint64_t gd_blkno, uint64_t dr_blkno)
>> +{
>> +
>> +	memset(dx_root, 0, fs->fs_blocksize);
>> +	strcpy((char *)dx_root->dr_signature, OCFS2_DX_ROOT_SIGNATURE);
>> +	dx_root->dr_suballoc_slot = slot;
>> +	dx_root->dr_suballoc_bit = (uint16_t)(dr_blkno - gd_blkno);
>> +	dx_root->dr_fs_generation = fs->fs_super->i_fs_generation;
>> +	dx_root->dr_blkno = dr_blkno;
> 
> Why do you wait until build_dx_root to initialize dr_flags to
> OCFS2_DX_FLAG_INLINE?
> 

Hmm, I wanted this flag to be set explicitly by the caller. Maybe set it inside init_dx_root can make less confusion.
Fixed.

>> @@ -636,6 +730,7 @@ out:
>>  	return ret;
>>  }
>>
>> +
> 
> Added an extra newline here. No big deal, but thought I'd point it out.
> 

Fixed.

> 
>>  #ifdef DEBUG_EXE
>>  #include <stdio.h>
>>
>> diff --git a/libocfs2/dir_indexed.c b/libocfs2/dir_indexed.c
[snip]
>> +errcode_t ocfs2_dx_dir_truncate(ocfs2_filesys *fs,
>> +			uint64_t dir)
>> +{
>> +	struct ocfs2_dx_root_block *dx_root;
>> +	char *dx_root_buf = NULL, *di_buf = NULL;
>> +	struct ocfs2_dinode *di;
>> +	errcode_t ret = 0;
>> +
>> +	ret = ocfs2_malloc_block(fs->fs_io, &di_buf);
>> +	if (ret)
>> +		goto out;
>> +	ret = ocfs2_read_inode(fs, dir, di_buf);
>> +	if (ret)
>> +		goto out;
>> +	di = (struct ocfs2_dinode *)di_buf;
>> +
> 
> This is a function intended to be called by users of libocfs2, right? We
> should probably check S_ISDIR() here too then in case the user passes the
> wrong inode # down.
> 

Yes, I should make sure it's directory inode first of all. Fixed.

> 
>> +	/* we have to trust i_dyn_features */
>> +	if ((!ocfs2_dir_indexed(di)) ||
>> +	    (di->i_dyn_features & OCFS2_INLINE_DATA_FL))
>> +		goto out;
>> +
>> +	ret = ocfs2_malloc_block(fs->fs_io, &dx_root_buf);
>> +	if (ret)
>> +		goto out;
>> +	ret = ocfs2_read_dx_root(fs, (uint64_t)di->i_dx_root, dx_root_buf);
>> +	if (ret)
>> +		goto out;
>> +	dx_root = (struct ocfs2_dx_root_block *)dx_root_buf;
>> +
>> +	if (dx_root->dr_flags & OCFS2_DX_FLAG_INLINE)
>> +		goto remove_index;
>> +
>> +	ret = ocfs2_dir_indexed_tree_truncate(fs, dx_root);
>> +
>> +remove_index:
>> +	ret = ocfs2_delete_dx_root(fs, dx_root->dr_blkno);
>> +
>> +	di->i_dyn_features &= ~OCFS2_INDEXED_DIR_FL;
>> +	di->i_dx_root = 0;
>> +
>> +	ret = ocfs2_write_inode(fs, di->i_blkno, (char *)di);
> 
> Return values from ocfs2_dir_indexed_tree_truncate(),
> ocfs2_delete_dx_root() and ocfs2_write_inode() are being ignored here. If
> there's a good reason, we should have a comment.
> 

I idea was, even the indexed tree truncate failed, we still wanted to delete the dx_root.
Therefore, I didn't check the return value.

> We could also re-order the operations so that errors in the tree truncate
> and root delete aren't fatal. Remove any notion of the index from the inode
> block first and then write it - if you fail the write, we can exit
> gracefully. If any of the other steps (truncate, delete_dx_root) fail after
> the inode has been updated, we *could* ignore those errors as non-fatal.
> 

Yes, I re-order this part, and add necessary comments for the code.

[snip]
>> +static int dir_trailer_func(ocfs2_filesys *fs,
>> +				uint64_t blkno,
>> +				uint64_t bcount,
>> +				uint16_t ext_flags,
>> +				void *priv_data)
>> +{
[snip]
>> +	max_rec_len = ocfs2_find_max_rec_len(fs, blk);
>> +	trailer = ocfs2_dir_trailer_from_block(fs, blk);
>> +	trailer->db_free_rec_len = max_rec_len;
>> +	ocfs2_init_dir_trailer(fs, di, blkno, blk);
> 
> Doesn't ocfs2_init_dir_trailer() overwrite the db_free_rec_len set you just
> did? Should you move the init a line or two up?
> 

Oops, Fixed now :-)

> 
> Also, I think we're missing a step in this function - you need to change the
> rec_len of the dirent which comes before the trailer. Otherwise it will
> still point to the end of the block, which means the trailer might later be
> overwritten by a dirent insert.
> 

IMHO, this can/should be done in ocfs2_init_dir_trailer().
A open-question is, if the trailer will overwrite part of the name of the previous dirent, how to resolve this issue ?
Should we re-insert this dirent to somewhere else ?

[snip]
>> +static void ocfs2_dx_dir_leaf_insert_tail(struct ocfs2_dx_leaf *dx_leaf,
>> +				struct ocfs2_dx_entry *dx_new_entry)
>> +{
>> +	int i;
>> +
>> +	i = dx_leaf->dl_list.de_num_used;
>> +	dx_leaf->dl_list.de_entries[i] = *dx_new_entry;
>> +
>> +	dx_leaf->dl_list.de_num_used += 1;
>> +}
>> +
>> +/* XXX should add bail part */
> 
> What does this comment mean?
> 
Here I mean, once an error happen, the function should not just simply return, because the on-disk data is changed.
There should be some code to make the on-disk data recovered to the content before ocfs2_dx_dir_leaf_insert_tail() got
called.

But no, I should not use 'bail part' in the comments here. Anyway, sometimes I feel it's quite difficult to make it like,

ret = some_operation_touchs_disk();
if (ret) {
	recovery everything to unchanged status
	goto out;
}

I guess this is a goal I should try best to.

> 
>> +static errcode_t ocfs2_expand_inline_dx_root(ocfs2_filesys *fs,
>> +					struct ocfs2_dx_root_block *dx_root)
>> +{
[snip]
>> +
>> +	for (i = 0; i < entry_list->de_num_used; i++) {
>> +		dx_entry = &entry_list->de_entries[i];
>> +		j = __ocfs2_dx_dir_hash_idx(fs, dx_entry->dx_minor_hash);
>> +		target_leaf = (struct ocfs2_dx_leaf *)dx_leaves[j];
>> +
>> +		ocfs2_dx_dir_leaf_insert_tail(target_leaf, dx_entry);
>> +		ret = ocfs2_write_dx_leaf(fs, target_leaf->dl_blkno,
>> +					target_leaf);
>> +		if (ret)
>> +			goto out;
> 
> This operation would be faster if we moved the writes out of this look.
> Otherwise, we're doing a disk I/O for every entry...
> 

Fixed.

> 
>> +	}
>> +
>> +	dx_root->dr_flags &= ~OCFS2_DX_FLAG_INLINE;
>> +	memset(&dx_root->dr_list, 0, fs->fs_blocksize -
>> +		offsetof(struct ocfs2_dx_root_block, dr_list));
>> +	dx_root->dr_list.l_count =
>> +		ocfs2_extent_recs_per_dx_root(fs->fs_blocksize);
>> +
>> +	/* This should never fail considering we start with an empty
>> +	 * dx_root */
>> +	ocfs2_init_dx_root_extent_tree(&et, fs, (char *)dx_root, dx_root->dr_blkno);
>> +	ret = ocfs2_tree_insert_extent(fs, &et, 0, start_blkno, 1, 0);
>> +	if (ret)
>> +		goto out;
>> +
>> +out:
>> +	return ret;
>> +}
>> +
> 
> Rest of the patch looks reasonable to me. One comment - I don't see where
> the new dx functions (like insert entry, remove entry, etc) are being
> plugged into the user-exported api for doing these things. Specifically, I'm
> thinking ocfs2_link(), ocfs2_unlink(), ocfs2_init_dir(), ocfs2_lookup(),
> ocfs2_lookup_system_inode(), etc. Basically, if the fs has indexed
> directories, I want the existing tools to automatically use them without
> much (if any) modification.
> 
> Generally, this probably just means that you add code to check whether the
> directory is indexed and call the dx_* function if need be.

Get it. I am working on this fix, will post out a v8 patch very soon.

Thanks for your review.

-- 
Coly Li
SuSE Labs



More information about the Ocfs2-tools-devel mailing list