[Ocfs2-tools-devel] [PATCH 07/15] dx_dirs v11: more library support for directory indexing
Coly Li
coly.li at suse.de
Sun Apr 25 11:03:16 PDT 2010
On 04/23/2010 05:18 PM, Tao Ma Wrote:
> <snip>
>> +errcode_t ocfs2_new_dx_root(ocfs2_filesys *fs,
>> + struct ocfs2_dinode *di,
>> + uint64_t *dr_blkno)
>> +{
>> + errcode_t ret;
>> + char *buf = NULL;
>> + uint64_t gd_blkno;
>> + struct ocfs2_dx_root_block *dx_root;
>> + int slot;
>> +
>> + ret = ocfs2_malloc_block(fs->fs_io, &buf);
>> + if (ret)
>> + goto out;
>> +
>> + slot = di->i_suballoc_slot;
>> + ret = ocfs2_load_allocator(fs, EXTENT_ALLOC_SYSTEM_INODE,
>> + slot, &fs->fs_eb_allocs[slot]);
> why you want to use the suballoc_slot of the inode? the i_suballoc_slot
> means the slot for inode_alloc while you load the allocators from
> extent_alloc. I pointed it out because here is a bug. For "/", it is
> allocated from global_alloc, so i_suballoc_slot is INVALID_SLOT, so
> there is no corresponding extent alloc, and &fs->fs_eb_allocs[slot] will
> overflow.
The reason is, I assum/guess, most of time, the node (machine) operating on dx_root is probably the node (machine)
operating on inode. Using the same alloc slot, may have less performance cost for future accessing if inode and dx_root
belongs to same alloc slot.
> <snip>
>> +static int dir_trailer_func(ocfs2_filesys *fs,
>> + uint64_t blkno,
>> + uint64_t bcount,
>> + uint16_t ext_flags,
>> + void *priv_data)
>> +{
>> + struct trailer_ctxt *ctxt = (struct trailer_ctxt *)priv_data;
>> + struct ocfs2_dinode *di = ctxt->di;
>> + struct ocfs2_dx_root_block *dx_root = ctxt->dx_root;
>> + struct ocfs2_dir_block_trailer *trailer;
>> + int max_rec_len = 0;
>> + errcode_t ret = 0;
>> + char *blk = NULL;
>> +
>> + ret = ocfs2_malloc_block(fs->fs_io, &blk);
>> + if (ret)
>> + goto out;
>> +
>> + /* here we don't trust trailer, cannot use
>> + * ocfs2_read_dir_block() */
>> + ret = ocfs2_read_blocks(fs, blkno, 1, blk);
>> + if (ret)
>> + goto out;
>> + ret = ocfs2_check_dir_trailer_space(fs, di, blkno, blk);
>> + if (ret)
>> + goto out;
>> + ocfs2_init_dir_trailer(fs, di, blkno, blk);
>> + max_rec_len = ocfs2_find_max_rec_len(fs, blk);
>> + trailer = ocfs2_dir_trailer_from_block(fs, blk);
>> + trailer->db_free_rec_len = max_rec_len;
>> +
>> + if (max_rec_len) {
>> + trailer->db_free_next = dx_root->dr_free_blk;
>> + dx_root->dr_free_blk = blkno;
>> + }
>> +
>> + /* comput trailer->db_check here, after writes out,
>> + * trailer is trustable */
>> + ret = ocfs2_write_dir_block(fs, di, blkno, blk);
> I am just curious about one thing. We have added the trailer to the end
> of the dir block. Do we need to update the rec_len of the last record of
> the dir block? Or am I missing something here?
The rec_len is not updated here.
So far, the code assumes: if kernel or user space tool creates a new directory, and trailer is supported, kernel or the
user space tool should make sure the rec_len is correct.
In user space tools, only tunefs.ocfs2 operates on trailer stuffs when enable or disable trailer related features
(metaecc, indexed-dirs). tunefs_install_dir_trailer() will make sure, all valid dir entries will not overlapped with the
trailer area. If the overlap happens, tunefs_install_dir_trailer()->run_dirblocks()->fixup_dirblock() finally updates
the rec_len,
static errcode_t fixup_dirblock(ocfs2_filesys *fs,
[snip]
/*
* Now that we've moved any dirents out of the way, we need to
* fix up db_last and install the trailer.
*/
offset = ((char *)db->db_last) - db->db_buf;
[snip]
Therefore, I trust rec_len is correct.
>> +static errcode_t ocfs2_read_dx_leaves(ocfs2_filesys *fs,
>> + uint64_t start,
>> + int num,
>> + struct ocfs2_dx_leaf **dx_leaves)
>> +{
>> + errcode_t ret;
>> + int i;
>> + struct ocfs2_dx_leaf *dx_leaf;
>> + for (i = 0; i < num; i++) {
>> + assert(!dx_leaves[i]);
>> + ret = ocfs2_malloc_block(fs->fs_io, (char **)&dx_leaf);
>> + if (ret)
>> + goto bail;
>> + ret = ocfs2_read_dx_leaf(fs, start + i, (char *)dx_leaf);
>> + if (ret)
>> + goto bail;
>> + dx_leaves[i] = dx_leaf;
>> + }
>> + goto out;
>> +
>> +bail:
>> + for (; i >= 0; i--) {
>> + if (dx_leaves[i])
>> + ocfs2_free(&dx_leaves[i]);
> We need to set dx_leaves[i] to NULL here. Otherwise the callers of
> ocfs2_read_dx_leaves will double free them. At least I see the caller
> ocfs2_dx_dir_rebalance call ocfs2_dx_dir_free_leaves which free
> dx_leaves[i] again.
ocfs2_free() frees dx_leaves[i] and sets dx_leaves[i] to NULL as well. So we don't need to worry here.
Tao,
Thanks for your review, for this big one :-)
--
Coly Li
SuSE Labs
More information about the Ocfs2-tools-devel
mailing list