[Ocfs2-tools-devel] questions for dx_dirs

Mark Fasheh mfasheh at suse.com
Wed Sep 30 11:29:44 PDT 2009


On Wed, Sep 30, 2009 at 06:13:16PM +0800, Coly Li wrote:
> Hi,
> 
> These days, I am reading the dx_dirs patches, there are some questions I want to
> confirm. If you know the answer, please do not hesitate to reply ;-))

Just FYI that in general terms, the implementation follows the design doc
but there were a few small differences that seem to be hanging out up. I'll
try to outline them now.

Also, FYI the beginning of an implementation is available in the dx_dirs
branch of:

git://oss.oracle.com/home/sourcebo/git/ocfs2-tools.git

or you can view it here:

http://oss.oracle.com/git/?p=ocfs2-tools.git;a=shortlog;h=dx_dirs



> 1) unindexed btree
> The unindexed tree from 'dx_dirs' is a btree of extents, organizes data content
> of the directory file. In leaf node (directory file blocks), the first record in
> the extent is struct ocfs2_dir_block_hdr, the rest records after struct
> ocfs2_dir_block_hdr are all struct ocfs2_dir_entry.

Almost - the only thing you got wrong is that we have a struct
ocfs2_dir_block_trailer at the end of every *block* in the unindexed tree
(you said "extent"). Otherwise, as you noted it looks exactly like the
standard dirent tree that we share with ext*.

So instead of per-cluster or per-extent, this is a per-block structure and
instead of the beginning of the block, we placed it at the end (which made
more sense because then dirents like '.' and '..' can keep their well-known
locations).



> 2) in indexed btree, does idx_free_space[OCFS2_IDIR_FREE_SPACE_HEADS] maintain
> free lists cross all leaf blocks of the btree ? or for each extent, there should
> be a set of free lists ?

So there's two things I want to note here:


Firstly - the free space list helps us search in the *unindexed* tree for a
block with enough free space to hold a dirent. This search is done to speed
up directory insertion - otherwise we'd have to do a linear search of the
unindexed tree for a block to hold our new dirent. Technically, worst-case
is still a linear search but we believe that the common case will be much
faster.

Obviously anything that manipulates dirents (unlink, some rename, etc) needs
to keep the list up to date.


Secondly, the free space list implementation is far simpler than what the
design doc wanted. Instead of an array of list heads
(idx_free_space[OCFS2_IDIR_FREE_SPACE_HEADS]) currently we only maintain a
single list (dr_free_blk). This made the code much less complicated. If an
unindexed block that has space enough for any dirent it's put on the list.
We reserved the space for an array just in case a future version needs a
more complex search.


Just to recap regarding the actual disk structures:

struct ocfs2_dx_root_block -> dr_free_blk
Pointer to the start of the list of free unindexed blocks

struct ocfs2_dir_block_trailer -> db_free_next
Pointer to next record on list. 0 is end of list.


> 3) in unindexed tree, use {db_free_prev, db_free_next} to maintain free slot
> within an extent block; in indexed tree, use ocfs2_dir_block_trailer to maintain
> the free slots within an ocfs2_idir_dirent ?

In the unindexed tree, we use the db_free_next field in the per-block
structure of ocfs2_dir_block_trailer to maintain a list of free blocks.

The indexed tree does not require or have any list of free blocks - each
traversal of the tree is a fast search (since it's indexed) and if the
target block has no free records, it is simply split into two blocks.
	--Mark

--
Mark Fasheh



More information about the Ocfs2-tools-devel mailing list