[Ocfs2-devel] [PATCH 3/3] ocfs2: Optimize inode group allocation by recording last used group.

Tao Ma tao.ma at oracle.com
Tue Jan 6 16:49:24 PST 2009



Mark Fasheh wrote:
> On Fri, Nov 28, 2008 at 06:58:45AM +0800, Tao Ma wrote:
>> In ocfs2, the block group search looks for the "emptiest" group
>> to allocate from. So if the allocator has many equally (or almost
>> equally) empty groups, new block groups will tend to get spread
>> out amongst them.
>>
>> We add osb_last_alloc_group in ocfs2_super to record the last used
>> allocation group so that next time we can allocate an inode group
>> directly from it.
>> For more details, please see
>> http://oss.oracle.com/osswiki/OCFS2/DesignDocs/InodeAllocationStrategy.
>>
>> Signed-off-by: Tao Ma <tao.ma at oracle.com>
>> ---
>>  fs/ocfs2/ocfs2.h    |    3 +++
>>  fs/ocfs2/suballoc.c |   18 +++++++++++++++++-
>>  fs/ocfs2/super.c    |    1 +
>>  3 files changed, 21 insertions(+), 1 deletions(-)
>>
>> diff --git a/fs/ocfs2/ocfs2.h b/fs/ocfs2/ocfs2.h
>> index 5c77798..a99b53e 100644
>> --- a/fs/ocfs2/ocfs2.h
>> +++ b/fs/ocfs2/ocfs2.h
>> @@ -335,6 +335,9 @@ struct ocfs2_super
>>  	struct ocfs2_node_map		osb_recovering_orphan_dirs;
>>  	unsigned int			*osb_orphan_wipes;
>>  	wait_queue_head_t		osb_wipe_event;
>> +
>> +	/* the group we used to allocate inodes. */
>> +	u64				osb_last_alloc_group;
> 
> Can you give this a name which is more specific to the inode allocators?
> That way we can just add another u64 later for the metadata allocators,
> should we decide to do the same for them.
How about osb_inode_alloc_group?
> 
> 
>>  };
>>  
>>  #define OCFS2_SB(sb)	    ((struct ocfs2_super *)(sb)->s_fs_info)
>> diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c
>> index 98e32b2..3ab1135 100644
>> --- a/fs/ocfs2/suballoc.c
>> +++ b/fs/ocfs2/suballoc.c
>> @@ -49,6 +49,7 @@
>>  #define NOT_ALLOC_NEW_GROUP		0
>>  #define ALLOC_NEW_GROUP			0x1
>>  #define ALLOC_NEW_GROUP_FROM_GLOBAL	0x2
>> +#define ALLOC_USE_RECORD_GROUP		0x4
>>  
>>  #define OCFS2_MAX_INODES_TO_STEAL	1024
>>  
>> @@ -411,6 +412,11 @@ static int ocfs2_block_group_alloc(struct ocfs2_super *osb,
>>  		goto bail;
>>  	}
>>  
>> +	if (flags & ALLOC_USE_RECORD_GROUP && osb->osb_last_alloc_group) {
>> +		mlog(0, "use old allocation group %llu\n",
>> +		     (unsigned long long)osb->osb_last_alloc_group);
>> +		ac->ac_last_group = osb->osb_last_alloc_group;
>> +	}
>>  	status = ocfs2_claim_clusters(osb,
>>  				      handle,
>>  				      ac,
>> @@ -485,6 +491,15 @@ static int ocfs2_block_group_alloc(struct ocfs2_super *osb,
>>  	alloc_inode->i_blocks = ocfs2_inode_sector_count(alloc_inode);
>>  
>>  	status = 0;
>> +
>> +	if (flags & ALLOC_USE_RECORD_GROUP) {
>> +		spin_lock(&osb->osb_lock);
>> +		osb->osb_last_alloc_group = ac->ac_last_group;
>> +		spin_unlock(&osb->osb_lock);
>> +		mlog(0, "after reservation, new allocation group is "
>> +		     "%llu\n", (unsigned long long)osb->osb_last_alloc_group);
>> +	}
>> +
> 
> How about we instead add a u64 * argument to ocfs2_block_group_alloc() and
> ocfs2_reserve_suballoc_bits(). Inside ocfs2_block_group_alloc, we'd test for
> it to be non-null before using it. That way, any other allocator code which
> wants to use this mechanism only has to modify a function parameter. If
> you'd rather test for the flag instead of non-null, that's ok too so long as
> the end result is the same - new users just have to modify their parameters to
> ocfs2_reserve_suballoc_bits().
Yeah, your suggestion makes ocfs2_reserve_suballoc_bits more generic. Thanks.
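
For reference, the pointer-argument approach suggested above could look roughly like the following. This is only a user-space sketch under stated assumptions, not the actual patch: struct alloc_context, claim_clusters() and block_group_alloc() are hypothetical stand-ins for the ocfs2 structures and functions, the group number 4096 is arbitrary, and locking around the shared hint is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the allocation context; not the real ocfs2 type. */
struct alloc_context {
	uint64_t ac_last_group;	/* group hint consumed by the cluster allocator */
};

/*
 * Hypothetical stand-in for the cluster allocator: it honours the hint in
 * ac_last_group if one is set, otherwise it picks a group itself (4096 is an
 * arbitrary value for illustration) and records it in ac_last_group.
 */
static int claim_clusters(struct alloc_context *ac)
{
	if (ac->ac_last_group == 0)
		ac->ac_last_group = 4096;
	return 0;
}

/*
 * Sketch of the suggested interface: a NULL last_alloc_group means "do not
 * use or record a hint", so a new user only has to pass a different pointer.
 */
static int block_group_alloc(struct alloc_context *ac,
			     uint64_t *last_alloc_group)
{
	int status;

	assert(ac != NULL);

	/* Seed the allocator with the recorded group, if there is one. */
	if (last_alloc_group && *last_alloc_group)
		ac->ac_last_group = *last_alloc_group;

	status = claim_clusters(ac);
	if (status < 0)
		return status;

	/* Remember where we allocated so the next caller starts there. */
	if (last_alloc_group)
		*last_alloc_group = ac->ac_last_group;

	return 0;
}
```

With this shape the inode allocator would pass the address of its own field in ocfs2_super, and a metadata allocator could later pass the address of a separate u64 without any change to the function body.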
> 
> 
>> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
>> index bc43138..3593759 100644
>> --- a/fs/ocfs2/super.c
>> +++ b/fs/ocfs2/super.c
>> @@ -843,6 +843,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>>  	osb->osb_commit_interval = parsed_options.commit_interval;
>>  	osb->local_alloc_default_bits = ocfs2_megabytes_to_clusters(sb, parsed_options.localalloc_opt);
>>  	osb->local_alloc_bits = osb->local_alloc_default_bits;
>> +	osb->osb_last_alloc_group = 0;
> 
> Won't this already be zero'd for us by ocfs2_initialize_super()?
Yeah, it will be. Thanks for pointing that out.

Regards,
Tao


