[Ocfs2-tools-devel] questions for dx_dirs

Coly Li coly.li at suse.de
Wed Sep 30 17:59:15 PDT 2009



Mark Fasheh Wrote:
> On Wed, Sep 30, 2009 at 06:13:16PM +0800, Coly Li wrote:
>> 1) unindexed btree
>> The unindexed tree from 'dx_dirs' is a btree of extents, organizes data content
>> of the directory file. In leaf node (directory file blocks), the first record in
>> the extent is struct ocfs2_dir_block_hdr, the rest records after struct
>> ocfs2_dir_block_hdr are all struct ocfs2_dir_entry.
> 
> Almost - the only thing you got wrong is that we have a struct
> ocfs2_dir_block_trailer at the end of every *block* in the unindexed tree
> (you said "extent"). Otherwise, as you noted it looks exactly like the
> standard dirent tree that we share with ext*.
> 
> So instead of per-cluster or per-extent, this is a per-block structure and
> instead of the beginning of the block, we placed it at the end (which made
> more sense because then dirents like '.' and '..' can keep their well-known
> locations).
> 

So ocfs2_dir_block_hdr is also a per-block structure (not per-cluster, not
per-extent) ?

And in every block of the unindexed tree, the first record is
ocfs2_dir_block_hdr, and the last record is ocfs2_dir_block_trailer, the rested
records between these 2 records are ocfs2_dir_entry ?

> 
> 
>> 2) in indexed btree, does idx_free_space[OCFS2_IDIR_FREE_SPACE_HEADS] maintain
>> free lists cross all leaf blocks of the btree ? or for each extent, there should
>> be a set of free lists ?
> 
> So there's two things I want to note here:
> 
> Firstly - the free space list helps us search in the *unindexed* tree for a
> block with enough free space to hold a dirent. This search is done to speed
> up directory insertion - otherwise we'd have to do a linear search of the
> unindexed tree for a block to hold our new dirent. Technically, worst-case
> is still a linear search but we believe that the common case will be much
> faster.
> 
> Obviously anything that manipulates dirents (unlink, some rename, etc) needs
> to keep the list up to date.
> 

Make sense.

> 
> Secondly, the free space list implementation is far simpler than what the
> design doc wanted. Instead of an array of list heads
> (idx_free_space[OCFS2_IDIR_FREE_SPACE_HEADS]) currently we only maintain a
> single list (dr_free_blk). This made the code much less complicated. If an
> unindexed block that has space enough for any dirent it's put on the list.
> We reserved the space for an array just in case a future version needs a
> more complex search.
> 
> 

Oh, thanks for pointing out this :-)

> Just to recap regarding the actual disk structures:
> 
> struct ocfs2_dx_root_block -> dr_free_blk
> Pointer to the start of the list of free unindexed blocks
> 
> struct ocfs2_dir_block_trailer -> db_free_next
> Pointer to next record on list. 0 is end of list.
> 
> 

My understanding was,
1) ocfs2_dx_root_block->dr_free_blk maintains free blocks (all ocfs2_dir_entry
records are free)
2) ocfs2_dir_block_trailer->db_free_next maintains the free space inside an
unindexed tree block (the free space comes after an previous dentry gets removed).

After checking the kernel code, I don't find anything support 2). Do I
mis-understand something ?


>> 3) in unindexed tree, use {db_free_prev, db_free_next} to maintain free slot
>> within an extent block; in indexed tree, use ocfs2_dir_block_trailer to maintain
>> the free slots within an ocfs2_idir_dirent ?
> 
> In the unindexed tree, we use the db_free_next field in the per-block
> structure of ocfs2_dir_block_trailer to maintain a list of free blocks.
> 

What I am confused is here :-)

When a dentry gets removed from the directory file, a record is freed within an
unindex tree block. There should be something to maintain this kind of free
space, so next time when allocating space to new dentry, this free space might
be re-used.

> The indexed tree does not require or have any list of free blocks - each
> traversal of the tree is a fast search (since it's indexed) and if the
> target block has no free records, it is simply split into two blocks.

Right now, I am checking the code to find out
1) how the indexed tree is stored on disk.
2) how to do the traversal in the indexed tree.
3) how to insert a new record into the indexed tree.
Still on the way to get the answer.

Thanks for your replying :-)
-- 
Coly Li
SuSE Labs



More information about the Ocfs2-tools-devel mailing list