[Ocfs2-devel] [PATCH] ocfs2: improve recovery performance
Joseph Qi
joseph.qi at huawei.com
Fri Jun 17 01:32:40 PDT 2016
On 2016/6/17 15:50, Junxiao Bi wrote:
> Hi Joseph,
>
> On 06/17/2016 03:44 PM, Joseph Qi wrote:
>> Hi Junxiao,
>>
>> On 2016/6/17 14:10, Junxiao Bi wrote:
>>> Journal replay runs when recovering a dead node. To avoid the
>>> impact of stale cache, all blocks of the dead node's journal
>>> inode were reloaded from disk, which hurts performance.
>>> Checking whether a block is cached before reloading it improves
>>> performance considerably. In my test environment, recovery time
>>> dropped from 120s to 1s.
>>>
>>> Signed-off-by: Junxiao Bi <junxiao.bi at oracle.com>
>>> ---
>>> fs/ocfs2/journal.c | 41 ++++++++++++++++++++++-------------------
>>> 1 file changed, 22 insertions(+), 19 deletions(-)
>>>
>>> diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
>>> index e607419cdfa4..8b808afd5f82 100644
>>> --- a/fs/ocfs2/journal.c
>>> +++ b/fs/ocfs2/journal.c
>>> @@ -1159,10 +1159,8 @@ static int ocfs2_force_read_journal(struct inode *inode)
>>>  	int status = 0;
>>>  	int i;
>>>  	u64 v_blkno, p_blkno, p_blocks, num_blocks;
>>> -#define CONCURRENT_JOURNAL_FILL 32ULL
>>> -	struct buffer_head *bhs[CONCURRENT_JOURNAL_FILL];
>>> -
>>> -	memset(bhs, 0, sizeof(struct buffer_head *) * CONCURRENT_JOURNAL_FILL);
>>> +	struct buffer_head *bhs[1] = {NULL};
>> Since we no longer need batch loading, how about making the logic like:
>>
>> struct buffer_head *bh = NULL;
>> ...
>> ocfs2_read_blocks_sync(osb, p_blkno, 1, &bh);
> This array is used because ocfs2_read_blocks_sync() takes it as its
> last parameter.
I see. Then we can pass &bh, as ocfs2_read_locked_inode() does.
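
For illustration, here is a rough userspace sketch of what I mean. It is
not kernel code and is untested against the real tree: everything with a
mock_ or MOCK_ prefix is a hypothetical stand-in for the kernel helper of
the similar name, and the "cache" is just a static array. The loop rereads
only blocks that are already cached, keeps a single bh, and hands &bh to
the array-taking reader. It also advances p_blkno in the for header so
that the cached and uncached paths both move on to the next block.

```c
/* Userspace mock only: mock_* names are assumptions, not kernel APIs. */
#include <assert.h>
#include <stddef.h>

#define MOCK_NBLOCKS 8

struct mock_bh {
	unsigned long long blkno;
	int refcount;
	int cached;	/* nonzero if the block sits in the mock buffer cache */
	int reads;	/* how many times it was reread from "disk" */
};

static struct mock_bh mock_cache[MOCK_NBLOCKS];

/* Stand-in for __find_get_block(): return a referenced bh only if cached. */
static struct mock_bh *mock_find_get_block(unsigned long long blkno)
{
	if (blkno >= MOCK_NBLOCKS || !mock_cache[blkno].cached)
		return NULL;
	mock_cache[blkno].refcount++;
	return &mock_cache[blkno];
}

/* Stand-in for brelse(): drop one reference, NULL is allowed. */
static void mock_brelse(struct mock_bh *bh)
{
	if (bh) {
		assert(bh->refcount > 0);
		bh->refcount--;
	}
}

/* Stand-in for ocfs2_read_blocks_sync(): takes an array of nr bh pointers. */
static int mock_read_blocks_sync(unsigned long long blkno, unsigned int nr,
				 struct mock_bh *bhs[])
{
	unsigned int i;

	for (i = 0; i < nr; i++) {
		if (blkno + i >= MOCK_NBLOCKS)
			return -1;
		bhs[i] = &mock_cache[blkno + i];
		bhs[i]->blkno = blkno + i;
		bhs[i]->cached = 1;
		bhs[i]->refcount++;
		bhs[i]->reads++;
	}
	return 0;
}

/* Reread only the blocks that are already cached, one bh at a time. */
static int mock_force_read(unsigned long long p_blkno,
			   unsigned long long p_blocks)
{
	struct mock_bh *bh = NULL;
	unsigned long long i;
	int status = 0;

	for (i = 0; i < p_blocks; i++, p_blkno++) {
		bh = mock_find_get_block(p_blkno);
		if (!bh)
			continue;	/* block not cached, nothing stale */
		mock_brelse(bh);
		bh = NULL;

		/* A pointer to one bh works as a one-element array. */
		status = mock_read_blocks_sync(p_blkno, 1, &bh);
		if (status < 0)
			break;
		mock_brelse(bh);
		bh = NULL;
	}
	return status;
}
```

With blocks 2 and 5 marked cached, mock_force_read(0, MOCK_NBLOCKS) rereads
only those two blocks and leaves every reference count at zero.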
Thanks,
Joseph
>
> Thanks,
> Junxiao.
>>
>> Thanks,
>> Joseph
>>
>>> +	struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
>>>
>>>  	num_blocks = ocfs2_blocks_for_bytes(inode->i_sb, i_size_read(inode));
>>>  	v_blkno = 0;
>>> @@ -1174,29 +1172,34 @@ static int ocfs2_force_read_journal(struct inode *inode)
>>>  			goto bail;
>>>  		}
>>>
>>> -		if (p_blocks > CONCURRENT_JOURNAL_FILL)
>>> -			p_blocks = CONCURRENT_JOURNAL_FILL;
>>> +		for (i = 0; i < p_blocks; i++) {
>>> +			bhs[0] = __find_get_block(osb->sb->s_bdev, p_blkno,
>>> +					osb->sb->s_blocksize);
>>> +			/* block not cached. */
>>> +			if (!bhs[0]) {
>>> +				p_blkno++;
>>> +				continue;
>>> +			}
>>>
>>> -		/* We are reading journal data which should not
>>> -		 * be put in the uptodate cache */
>>> -		status = ocfs2_read_blocks_sync(OCFS2_SB(inode->i_sb),
>>> -						p_blkno, p_blocks, bhs);
>>> -		if (status < 0) {
>>> -			mlog_errno(status);
>>> -			goto bail;
>>> -		}
>>> +			brelse(bhs[0]);
>>> +			bhs[0] = NULL;
>>> +			/* We are reading journal data which should not
>>> +			 * be put in the uptodate cache.
>>> +			 */
>>> +			status = ocfs2_read_blocks_sync(osb, p_blkno, 1, bhs);
>>> +			if (status < 0) {
>>> +				mlog_errno(status);
>>> +				goto bail;
>>> +			}
>>>
>>> -		for(i = 0; i < p_blocks; i++) {
>>> -			brelse(bhs[i]);
>>> -			bhs[i] = NULL;
>>> +			brelse(bhs[0]);
>>> +			bhs[0] = NULL;
>>>  		}
>>>
>>>  		v_blkno += p_blocks;
>>>  	}
>>>
>>>  bail:
>>> -	for(i = 0; i < CONCURRENT_JOURNAL_FILL; i++)
>>> -		brelse(bhs[i]);
>>>  	return status;
>>>  }
>>>
>>>
>>
>>
>
>
> .
>