[Ocfs2-devel] [PATCH v2] ocfs2: Cache system inodes of other slots.

Tao Ma tao.ma at oracle.com
Sun Aug 15 22:14:48 PDT 2010


Hi wengang,

On 08/16/2010 12:55 PM, Wengang Wang wrote:
> Hi tao,
>
> On 10-08-16 10:31, Tao Ma wrote:
>> During orphan scan, if we are slot 0, and we are replaying
>> orphan_dir:0001, the general process is that for every file
>> in this dir:
>> 1. we will iget orphan_dir:0001, since there is no inode for it.
>>     we will have to create an inode and read it from the disk.
>> 2. do the normal work, such as delete_inode and remove it from
>>     the dir if it is allowed.
>> 3. call iput orphan_dir:0001 when we are done. In this case,
>>     since we have no dcache for this inode, i_count will
>>     reach 0, and VFS will have to call clear_inode and in
>>     ocfs2_clear_inode we will checkpoint the inode which will let
>>     ocfs2_cmt and journald begin to work.
>> 4. We loop back to 1 for the next file.
>>
>> So you see, actually for every deleted file, we have to read the
>> orphan dir from the disk and checkpoint the journal. It is very
>> time consuming and causes a lot of journal checkpoint I/O.
>> A better solution is that we can have another reference for these
>> inodes in ocfs2_super. So if there is no other race among
>> nodes (which would make dlmglue checkpoint the inode), for step 3,
>> clear_inode won't be called and for step 1, we may only need to
>> read the inode for the 1st time. This is a big win for us.
>>
>> So this patch will try to cache system inodes of other slots so
>> that we will have one more reference for these inodes and avoid
>> the extra inode read and journal checkpoint.
>>
>> Signed-off-by: Tao Ma <tao.ma at oracle.com>
>> -					   u32 slot)
>> +static struct inode **get_local_system_inode(struct ocfs2_super *osb,
>> +					     int type,
>> +					     u32 slot)
>>   {
>> -	return slot == osb->slot_num || is_global_system_inode(type);
>> +	int index;
>> +
>> +	BUG_ON(slot == OCFS2_INVALID_SLOT);
>> +	BUG_ON(type < OCFS2_FIRST_LOCAL_SYSTEM_INODE ||
>> +	       type > OCFS2_LAST_LOCAL_SYSTEM_INODE);
>> +
>> +	if (unlikely(!osb->local_system_inodes)) {
>> +		osb->local_system_inodes = kzalloc(sizeof(struct inode *) *
>> +						   NUM_LOCAL_SYSTEM_INODES *
>> +						   osb->max_slots,
>> +						   GFP_NOFS);
>> +		if (!osb->local_system_inodes) {
>> +			mlog_errno(-ENOMEM);
>> +			/*
>> +			 * return NULL here so that ocfs2_get_system_file_inode
>> +			 * will try to create an inode and use it. We will try
>> +			 * to initialize local_system_inodes next time.
>> +			 */
>> +			return NULL;
>> +		}
>> +	}
>> +
>
> Here, it's possible that get_local_system_inode() runs in parallel.
> Since setting local_system_inodes is not protected, there could be a
> memory leak: two callers may both allocate, and one array is lost.
You are right. I will update it. Thanks.
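
For illustration only, here is a minimal user-space sketch of one way to
close that race. It is not the updated patch: struct super_sketch,
get_local_array() and the value of NUM_LOCAL_SYSTEM_INODES are
hypothetical stand-ins, and a pthread mutex plays the role a spinlock
would play in the kernel. The idea is to allocate outside the lock, then
publish the array under the lock and free our copy if another caller
got there first.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical value, chosen only for this sketch. */
#define NUM_LOCAL_SYSTEM_INODES 4

struct inode;				/* opaque stand-in for the kernel type */

/* Hypothetical stand-in for the relevant fields of ocfs2_super. */
struct super_sketch {
	pthread_mutex_t lock;		/* plays the role of a spinlock */
	unsigned int max_slots;
	struct inode **local_system_inodes;
};

static struct inode **get_local_array(struct super_sketch *sb)
{
	struct inode **arr, **cur;

	/* Fast path: the array has already been published. */
	pthread_mutex_lock(&sb->lock);
	cur = sb->local_system_inodes;
	pthread_mutex_unlock(&sb->lock);
	if (cur)
		return cur;

	/* Allocate without holding the lock, since allocation may block. */
	arr = calloc((size_t)NUM_LOCAL_SYSTEM_INODES * sb->max_slots,
		     sizeof(*arr));
	if (!arr)
		return NULL;		/* caller falls back to the uncached path */

	pthread_mutex_lock(&sb->lock);
	if (sb->local_system_inodes)
		free(arr);		/* lost the race: drop our copy, no leak */
	else
		sb->local_system_inodes = arr;
	cur = sb->local_system_inodes;
	pthread_mutex_unlock(&sb->lock);

	return cur;
}
```

With this shape, every caller ends up with the same array no matter how
many of them race on the first call, and at most one allocation
survives.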

Regards,
Tao


