[Ocfs2-tools-devel] [PATCH 07/12] dx_dirs v7: more library support for directory indexing
Coly Li
coly.li at suse.de
Thu Feb 4 09:15:49 PST 2010
On 2010年02月03日 07:54, Mark Fasheh Wrote:
> On Sat, Jan 30, 2010 at 02:18:26AM +0800, Coly Li wrote:
>> This patch adds more library support for indexed dirs,
>> - dx_root alloc/delete
>> - dx_leaf alloc/delete
>> - dx_root read/write
>> - dx_leaf read/write
>> - indexed tree insert/truncate
>> - dx_root extent tree operations
>>
>> With this patch, indexed dirs support in fsck.ocfs2 is possible.
>>
>> Signed-off-by: Coly Li <coly.li at suse.de>
>> Cc: Mark Fasheh <mfasheh at suse.com>
>> ---
>>
>> diff --git a/include/ocfs2/ocfs2.h b/include/ocfs2/ocfs2.h
>> index b586bf2..1c2c69a 100644
>> --- a/include/ocfs2/ocfs2.h
>> +++ b/include/ocfs2/ocfs2.h
>> @@ -1382,5 +1396,12 @@ errcode_t ocfs2_extent_iterate_xattr(ocfs2_filesys *fs,
>> void *priv_data,
>> int *changed);
>> errcode_t ocfs2_delete_xattr_block(ocfs2_filesys *fs, uint64_t blkno);
>> +errcode_t ocfs2_dir_indexed_tree_truncate(ocfs2_filesys *fs,
>> + struct ocfs2_dx_root_block *dx_root);
>> +errcode_t ocfs2_write_dx_root(ocfs2_filesys *fs, uint64_t block, char *buf);
>> +int ocfs2_is_dir_trailer(ocfs2_filesys *fs, struct ocfs2_dinode *di, unsigned long de_off);
>
> Duplicate definition, noted below.
>
Fixed, thanks.
>
>> diff --git a/libocfs2/alloc.c b/libocfs2/alloc.c
>> index 18c6e0e..4144539 100644
>> --- a/libocfs2/alloc.c
>> +++ b/libocfs2/alloc.c
>> @@ -492,6 +492,100 @@ out:
>> return ret;
>> }
>>
>> +/* only initiate part of dx_root:
>> + * dr_subllaoc_slot
>> + * dr_sbualloc_bit
>> + * dr_fs_generation
>> + * dr_blkno
>> + */
>> +static void init_dx_root(ocfs2_filesys *fs,
>> + struct ocfs2_dx_root_block *dx_root,
>> + int slot, uint64_t gd_blkno, uint64_t dr_blkno)
>> +{
>> +
>> + memset(dx_root, 0, fs->fs_blocksize);
>> + strcpy((char *)dx_root->dr_signature, OCFS2_DX_ROOT_SIGNATURE);
>> + dx_root->dr_suballoc_slot = slot;
>> + dx_root->dr_suballoc_bit = (uint16_t)(dr_blkno - gd_blkno);
>> + dx_root->dr_fs_generation = fs->fs_super->i_fs_generation;
>> + dx_root->dr_blkno = dr_blkno;
>
> Why do you wait until build_dx_root to initialize dr_flags to
> OCFS2_DX_FLAG_INLINE?
>
Hmm, I wanted this flag to be set explicitly by the caller. Maybe set it inside init_dx_root can make less confusion.
Fixed.
>> @@ -636,6 +730,7 @@ out:
>> return ret;
>> }
>>
>> +
>
> Added an extra newline here. No big deal, but thought I'd point it out.
>
Fixed.
>
>> #ifdef DEBUG_EXE
>> #include <stdio.h>
>>
>> diff --git a/libocfs2/dir_indexed.c b/libocfs2/dir_indexed.c
[snip]
>> +errcode_t ocfs2_dx_dir_truncate(ocfs2_filesys *fs,
>> + uint64_t dir)
>> +{
>> + struct ocfs2_dx_root_block *dx_root;
>> + char *dx_root_buf = NULL, *di_buf = NULL;
>> + struct ocfs2_dinode *di;
>> + errcode_t ret = 0;
>> +
>> + ret = ocfs2_malloc_block(fs->fs_io, &di_buf);
>> + if (ret)
>> + goto out;
>> + ret = ocfs2_read_inode(fs, dir, di_buf);
>> + if (ret)
>> + goto out;
>> + di = (struct ocfs2_dinode *)di_buf;
>> +
>
> This is a function intended to be called by users of libocfs2, right? We
> should probably check S_ISDIR() here too then in case the user passes the
> wrong inode # down.
>
Yes, I should make sure it's directory inode first of all. Fixed.
>
>> + /* we have to trust i_dyn_features */
>> + if ((!ocfs2_dir_indexed(di)) ||
>> + (di->i_dyn_features & OCFS2_INLINE_DATA_FL))
>> + goto out;
>> +
>> + ret = ocfs2_malloc_block(fs->fs_io, &dx_root_buf);
>> + if (ret)
>> + goto out;
>> + ret = ocfs2_read_dx_root(fs, (uint64_t)di->i_dx_root, dx_root_buf);
>> + if (ret)
>> + goto out;
>> + dx_root = (struct ocfs2_dx_root_block *)dx_root_buf;
>> +
>> + if (dx_root->dr_flags & OCFS2_DX_FLAG_INLINE)
>> + goto remove_index;
>> +
>> + ret = ocfs2_dir_indexed_tree_truncate(fs, dx_root);
>> +
>> +remove_index:
>> + ret = ocfs2_delete_dx_root(fs, dx_root->dr_blkno);
>> +
>> + di->i_dyn_features &= ~OCFS2_INDEXED_DIR_FL;
>> + di->i_dx_root = 0;
>> +
>> + ret = ocfs2_write_inode(fs, di->i_blkno, (char *)di);
>
> Return values from ocfs2_dir_indexed_tree_truncate(),
> ocfs2_delete_dx_root() and ocfs2_write_inode() are being ignored here. If
> there's a good reason, we should have a comment.
>
I idea was, even the indexed tree truncate failed, we still wanted to delete the dx_root.
Therefore, I didn't check the return value.
> We could also re-order the operations so that errors in the tree truncate
> and root delete aren't fatal. Remove any notion of the index from the inode
> block first and then write it - if you fail the write, we can exit
> gracefully. If any of the other steps (truncate, delete_dx_root) fail after
> the inode has been updated, we *could* ignore those errors as non-fatal.
>
Yes, I re-order this part, and add necessary comments for the code.
[snip]
>> +static int dir_trailer_func(ocfs2_filesys *fs,
>> + uint64_t blkno,
>> + uint64_t bcount,
>> + uint16_t ext_flags,
>> + void *priv_data)
>> +{
[snip]
>> + max_rec_len = ocfs2_find_max_rec_len(fs, blk);
>> + trailer = ocfs2_dir_trailer_from_block(fs, blk);
>> + trailer->db_free_rec_len = max_rec_len;
>> + ocfs2_init_dir_trailer(fs, di, blkno, blk);
>
> Doesn't ocfs2_init_dir_trailer() overwrite the db_free_rec_len set you just
> did? Should you move the init a line or two up?
>
Oops, Fixed now :-)
>
> Also, I think we're missing a step in this function - you need to change the
> rec_len of the dirent which comes before the trailer. Otherwise it will
> still point to the end of the block, which means the trailer might later be
> overwritten by a dirent insert.
>
IMHO, this can/should be done in ocfs2_init_dir_trailer().
A open-question is, if the trailer will overwrite part of the name of the previous dirent, how to resolve this issue ?
Should we re-insert this dirent to somewhere else ?
[snip]
>> +static void ocfs2_dx_dir_leaf_insert_tail(struct ocfs2_dx_leaf *dx_leaf,
>> + struct ocfs2_dx_entry *dx_new_entry)
>> +{
>> + int i;
>> +
>> + i = dx_leaf->dl_list.de_num_used;
>> + dx_leaf->dl_list.de_entries[i] = *dx_new_entry;
>> +
>> + dx_leaf->dl_list.de_num_used += 1;
>> +}
>> +
>> +/* XXX should add bail part */
>
> What does this comment mean?
>
Here I mean, once an error happen, the function should not just simply return, because the on-disk data is changed.
There should be some code to make the on-disk data recovered to the content before ocfs2_dx_dir_leaf_insert_tail() got
called.
But no, I should not use 'bail part' in the comments here. Anyway, sometimes I feel it's quite difficult to make it like,
ret = some_operation_touchs_disk();
if (ret) {
recovery everything to unchanged status
goto out;
}
I guess this is a goal I should try best to.
>
>> +static errcode_t ocfs2_expand_inline_dx_root(ocfs2_filesys *fs,
>> + struct ocfs2_dx_root_block *dx_root)
>> +{
[snip]
>> +
>> + for (i = 0; i < entry_list->de_num_used; i++) {
>> + dx_entry = &entry_list->de_entries[i];
>> + j = __ocfs2_dx_dir_hash_idx(fs, dx_entry->dx_minor_hash);
>> + target_leaf = (struct ocfs2_dx_leaf *)dx_leaves[j];
>> +
>> + ocfs2_dx_dir_leaf_insert_tail(target_leaf, dx_entry);
>> + ret = ocfs2_write_dx_leaf(fs, target_leaf->dl_blkno,
>> + target_leaf);
>> + if (ret)
>> + goto out;
>
> This operation would be faster if we moved the writes out of this look.
> Otherwise, we're doing a disk I/O for every entry...
>
Fixed.
>
>> + }
>> +
>> + dx_root->dr_flags &= ~OCFS2_DX_FLAG_INLINE;
>> + memset(&dx_root->dr_list, 0, fs->fs_blocksize -
>> + offsetof(struct ocfs2_dx_root_block, dr_list));
>> + dx_root->dr_list.l_count =
>> + ocfs2_extent_recs_per_dx_root(fs->fs_blocksize);
>> +
>> + /* This should never fail considering we start with an empty
>> + * dx_root */
>> + ocfs2_init_dx_root_extent_tree(&et, fs, (char *)dx_root, dx_root->dr_blkno);
>> + ret = ocfs2_tree_insert_extent(fs, &et, 0, start_blkno, 1, 0);
>> + if (ret)
>> + goto out;
>> +
>> +out:
>> + return ret;
>> +}
>> +
>
> Rest of the patch looks reasonable to me. One comment - I don't see where
> the new dx functions (like insert entry, remove entry, etc) are being
> plugged into the user-exported api for doing these things. Specifically, I'm
> thinking ocfs2_link(), ocfs2_unlink(), ocfs2_init_dir(), ocfs2_lookup(),
> ocfs2_lookup_system_inode(), etc. Basically, if the fs has indexed
> directories, I want the existing tools to automatically use them without
> much (if any) modification.
>
> Generally, this probably just means that you add code to check whether the
> directory is indexed and call the dx_* function if need be.
Get it. I am working on this fix, will post out a v8 patch very soon.
Thanks for your review.
--
Coly Li
SuSE Labs
More information about the Ocfs2-tools-devel
mailing list