[Ocfs2-tools-devel] [PATCH 07/15] dx_dirs v11: more library support for directory indexing

Tao Ma tao.ma at oracle.com
Sun Apr 25 18:50:47 PDT 2010


Hi coly,
Coly Li wrote:
> 
> On 04/23/2010 05:18 PM, Tao Ma Wrote:
>> <snip>
>>> +errcode_t ocfs2_new_dx_root(ocfs2_filesys *fs,
>>> +                struct ocfs2_dinode *di,
>>> +                uint64_t *dr_blkno)
>>> +{
>>> +    errcode_t ret;
>>> +    char *buf = NULL;
>>> +    uint64_t gd_blkno;
>>> +    struct ocfs2_dx_root_block *dx_root;
>>> +    int slot;
>>> +
>>> +    ret = ocfs2_malloc_block(fs->fs_io, &buf);
>>> +    if (ret)
>>> +        goto out;
>>> +
>>> +    slot = di->i_suballoc_slot;
>>> +    ret = ocfs2_load_allocator(fs, EXTENT_ALLOC_SYSTEM_INODE,
>>> +                slot, &fs->fs_eb_allocs[slot]);
>> why you want to use the suballoc_slot of the inode? the i_suballoc_slot
>> means the slot for inode_alloc while you load the allocators from
>> extent_alloc. I pointed it out because here is a bug. For "/", it is
>> allocated from global_alloc, so i_suballoc_slot is INVALID_SLOT, so
>> there is no corresponding extent alloc, and &fs->fs_eb_allocs[slot] will
>> overflow.
> 
> The reason is, I assum/guess, most of time, the node (machine) operating on dx_root is probably the node (machine)
> operating on inode. Using the same alloc slot, may have less performance cost for future accessing if inode and dx_root
> belongs to same alloc slot.
ok, btw, I also noticed that the bug is fixed by another Mark's patch
e618ad9a6cafae5351f87ae0601d3b16ec9af96a.
So I have no objection here.
> 
>> <snip>
>>> +static int dir_trailer_func(ocfs2_filesys *fs,
>>> +                uint64_t blkno,
>>> +                uint64_t bcount,
>>> +                uint16_t ext_flags,
>>> +                void *priv_data)
>>> +{
>>> +    struct trailer_ctxt *ctxt = (struct trailer_ctxt *)priv_data;
>>> +    struct ocfs2_dinode *di = ctxt->di;
>>> +    struct ocfs2_dx_root_block *dx_root = ctxt->dx_root;
>>> +    struct ocfs2_dir_block_trailer *trailer;
>>> +    int max_rec_len = 0;
>>> +    errcode_t ret = 0;
>>> +    char *blk = NULL;
>>> +
>>> +    ret = ocfs2_malloc_block(fs->fs_io, &blk);
>>> +    if (ret)
>>> +        goto out;
>>> +
>>> +    /* here we don't trust trailer, cannot use
>>> +     * ocfs2_read_dir_block() */
>>> +    ret = ocfs2_read_blocks(fs, blkno, 1, blk);
>>> +    if (ret)
>>> +        goto out;
>>> +    ret = ocfs2_check_dir_trailer_space(fs, di, blkno, blk);
>>> +    if (ret)
>>> +        goto out;
>>> +    ocfs2_init_dir_trailer(fs, di, blkno, blk);
>>> +    max_rec_len = ocfs2_find_max_rec_len(fs, blk);
>>> +    trailer = ocfs2_dir_trailer_from_block(fs, blk);
>>> +    trailer->db_free_rec_len = max_rec_len;
>>> +
>>> +    if (max_rec_len) {
>>> +        trailer->db_free_next = dx_root->dr_free_blk;
>>> +        dx_root->dr_free_blk = blkno;
>>> +    }
>>> +
>>> +    /* comput trailer->db_check here, after writes out,
>>> +     * trailer is trustable */
>>> +    ret = ocfs2_write_dir_block(fs, di, blkno, blk);
>> I am just curious about one thing. We have added the trailer to the end
>> of the dir block. Do we need to update the rec_len of the last record of
>> the dir block? Or am I missing something here?
> 
> The rec_len is not updated here.
> So far, the code assumes: if kernel or user space tool creates a new directory, and trailer is supported, kernel or the
> user space tool should make sure the rec_len is correct.
> 
> In user space tools, only tunefs.ocfs2 operates on trailer stuffs when enable or disable trailer related features
> (metaecc, indexed-dirs). tunefs_install_dir_trailer() will make sure, all valid dir entries will not overlapped with the
> trailer area. If the overlap happens, tunefs_install_dir_trailer()->run_dirblocks()->fixup_dirblock() finally updates
> the rec_len,
> 
> static errcode_t fixup_dirblock(ocfs2_filesys *fs,
> [snip]
>         /*
>          * Now that we've moved any dirents out of the way, we need to
>          * fix up db_last and install the trailer.
>          */
>         offset = ((char *)db->db_last) - db->db_buf;
> [snip]
> 
> Therefore, I trust rec_len is correct.
ok, thanks for the explanation.
> 
> 
>>> +static errcode_t ocfs2_read_dx_leaves(ocfs2_filesys *fs,
>>> +                uint64_t start,
>>> +                int num,
>>> +                struct ocfs2_dx_leaf **dx_leaves)
>>> +{
>>> +    errcode_t ret;
>>> +    int i;
>>> +    struct ocfs2_dx_leaf *dx_leaf;
>>> +    for (i = 0; i < num; i++) {
>>> +        assert(!dx_leaves[i]);
>>> +        ret = ocfs2_malloc_block(fs->fs_io, (char **)&dx_leaf);
>>> +        if (ret)
>>> +            goto bail;
>>> +        ret = ocfs2_read_dx_leaf(fs, start + i, (char *)dx_leaf);
>>> +        if (ret)
>>> +            goto bail;
>>> +        dx_leaves[i] = dx_leaf;
>>> +    }
>>> +    goto out;
>>> +
>>> +bail:
>>> +    for (; i >= 0; i--) {
>>> +        if (dx_leaves[i])
>>> +            ocfs2_free(&dx_leaves[i]);
>> We need to set dx_leaves[i] to NULL here. Otherwise the callers of
>> ocfs2_read_dx_leaves will double free them. At least I see the caller
>> ocfs2_dx_dir_rebalance call ocfs2_dx_dir_free_leaves which free
>> dx_leaves[i] again.
> 
> ocfs2_free() frees dx_leaves[i] and sets dx_leaves[i] to NULL as well. So we don't need to worry here.
Oh, I never know ocfs2_free is so powerful. ;)

Regards,
Tao



More information about the Ocfs2-tools-devel mailing list