[Ocfs2-tools-devel] [PATCH 07/15] dx_dirs v11: more library support for directory indexing
Tao Ma
tao.ma at oracle.com
Sun Apr 25 18:50:47 PDT 2010
Hi coly,
Coly Li wrote:
>
> On 04/23/2010 05:18 PM, Tao Ma Wrote:
>> <snip>
>>> +errcode_t ocfs2_new_dx_root(ocfs2_filesys *fs,
>>> + struct ocfs2_dinode *di,
>>> + uint64_t *dr_blkno)
>>> +{
>>> + errcode_t ret;
>>> + char *buf = NULL;
>>> + uint64_t gd_blkno;
>>> + struct ocfs2_dx_root_block *dx_root;
>>> + int slot;
>>> +
>>> + ret = ocfs2_malloc_block(fs->fs_io, &buf);
>>> + if (ret)
>>> + goto out;
>>> +
>>> + slot = di->i_suballoc_slot;
>>> + ret = ocfs2_load_allocator(fs, EXTENT_ALLOC_SYSTEM_INODE,
>>> + slot, &fs->fs_eb_allocs[slot]);
>> why you want to use the suballoc_slot of the inode? the i_suballoc_slot
>> means the slot for inode_alloc while you load the allocators from
>> extent_alloc. I pointed it out because here is a bug. For "/", it is
>> allocated from global_alloc, so i_suballoc_slot is INVALID_SLOT, so
>> there is no corresponding extent alloc, and &fs->fs_eb_allocs[slot] will
>> overflow.
>
> The reason is, I assum/guess, most of time, the node (machine) operating on dx_root is probably the node (machine)
> operating on inode. Using the same alloc slot, may have less performance cost for future accessing if inode and dx_root
> belongs to same alloc slot.
ok, btw, I also noticed that the bug is fixed by another Mark's patch
e618ad9a6cafae5351f87ae0601d3b16ec9af96a.
So I have no objection here.
>
>> <snip>
>>> +static int dir_trailer_func(ocfs2_filesys *fs,
>>> + uint64_t blkno,
>>> + uint64_t bcount,
>>> + uint16_t ext_flags,
>>> + void *priv_data)
>>> +{
>>> + struct trailer_ctxt *ctxt = (struct trailer_ctxt *)priv_data;
>>> + struct ocfs2_dinode *di = ctxt->di;
>>> + struct ocfs2_dx_root_block *dx_root = ctxt->dx_root;
>>> + struct ocfs2_dir_block_trailer *trailer;
>>> + int max_rec_len = 0;
>>> + errcode_t ret = 0;
>>> + char *blk = NULL;
>>> +
>>> + ret = ocfs2_malloc_block(fs->fs_io, &blk);
>>> + if (ret)
>>> + goto out;
>>> +
>>> + /* here we don't trust trailer, cannot use
>>> + * ocfs2_read_dir_block() */
>>> + ret = ocfs2_read_blocks(fs, blkno, 1, blk);
>>> + if (ret)
>>> + goto out;
>>> + ret = ocfs2_check_dir_trailer_space(fs, di, blkno, blk);
>>> + if (ret)
>>> + goto out;
>>> + ocfs2_init_dir_trailer(fs, di, blkno, blk);
>>> + max_rec_len = ocfs2_find_max_rec_len(fs, blk);
>>> + trailer = ocfs2_dir_trailer_from_block(fs, blk);
>>> + trailer->db_free_rec_len = max_rec_len;
>>> +
>>> + if (max_rec_len) {
>>> + trailer->db_free_next = dx_root->dr_free_blk;
>>> + dx_root->dr_free_blk = blkno;
>>> + }
>>> +
>>> + /* comput trailer->db_check here, after writes out,
>>> + * trailer is trustable */
>>> + ret = ocfs2_write_dir_block(fs, di, blkno, blk);
>> I am just curious about one thing. We have added the trailer to the end
>> of the dir block. Do we need to update the rec_len of the last record of
>> the dir block? Or am I missing something here?
>
> The rec_len is not updated here.
> So far, the code assumes: if kernel or user space tool creates a new directory, and trailer is supported, kernel or the
> user space tool should make sure the rec_len is correct.
>
> In user space tools, only tunefs.ocfs2 operates on trailer stuffs when enable or disable trailer related features
> (metaecc, indexed-dirs). tunefs_install_dir_trailer() will make sure, all valid dir entries will not overlapped with the
> trailer area. If the overlap happens, tunefs_install_dir_trailer()->run_dirblocks()->fixup_dirblock() finally updates
> the rec_len,
>
> static errcode_t fixup_dirblock(ocfs2_filesys *fs,
> [snip]
> /*
> * Now that we've moved any dirents out of the way, we need to
> * fix up db_last and install the trailer.
> */
> offset = ((char *)db->db_last) - db->db_buf;
> [snip]
>
> Therefore, I trust rec_len is correct.
ok, thanks for the explanation.
>
>
>>> +static errcode_t ocfs2_read_dx_leaves(ocfs2_filesys *fs,
>>> + uint64_t start,
>>> + int num,
>>> + struct ocfs2_dx_leaf **dx_leaves)
>>> +{
>>> + errcode_t ret;
>>> + int i;
>>> + struct ocfs2_dx_leaf *dx_leaf;
>>> + for (i = 0; i < num; i++) {
>>> + assert(!dx_leaves[i]);
>>> + ret = ocfs2_malloc_block(fs->fs_io, (char **)&dx_leaf);
>>> + if (ret)
>>> + goto bail;
>>> + ret = ocfs2_read_dx_leaf(fs, start + i, (char *)dx_leaf);
>>> + if (ret)
>>> + goto bail;
>>> + dx_leaves[i] = dx_leaf;
>>> + }
>>> + goto out;
>>> +
>>> +bail:
>>> + for (; i >= 0; i--) {
>>> + if (dx_leaves[i])
>>> + ocfs2_free(&dx_leaves[i]);
>> We need to set dx_leaves[i] to NULL here. Otherwise the callers of
>> ocfs2_read_dx_leaves will double free them. At least I see the caller
>> ocfs2_dx_dir_rebalance call ocfs2_dx_dir_free_leaves which free
>> dx_leaves[i] again.
>
> ocfs2_free() frees dx_leaves[i] and sets dx_leaves[i] to NULL as well. So we don't need to worry here.
Oh, I never know ocfs2_free is so powerful. ;)
Regards,
Tao
More information about the Ocfs2-tools-devel
mailing list