[Ocfs2-devel] [PATCH] ocfs2: submit another bio if current bio is full

Changwei Ge gechangwei at live.cn
Mon May 14 00:05:14 PDT 2018


Hi Jun,

IMO, o2hb_issue_node_write() just needs _one_ vec so there is no problem 
in it.

Calculating bio number might be get related code more complicated which 
I think is not necessary.

It doesn't solve any existed problem or improve performance or clean 
code. :(

Thanks,
Changwei

On 2018/5/14 14:26, piaojun wrote:
> Hi Changwei,
>
> I got your point, we should let the caller retry if bio is not enough,
> right? But some caller like o2hb_issue_node_write() won't retry once error
> happens, though the bio will always be enough. I think if we could
> calculate the number of bio we need before calling bio_add_page()?
>
> Thanks
> Jun
>
> On 2018/5/14 11:21, Changwei Ge wrote:
>> Hi Jun,
>>
>> Right now, I am afraid that the easiest and fasted way to fix this issue
>> is to revert your patch.
>>
>>   From comments before function bio_add_page(), we can see that it only
>> fails if either ::bi_vcnt == ::bi_max_vecs or it's a cloned bio.
>>
>>
>> So we can judge if bio is full from its return value is zero or not.
>>
>>
>> Thanks,
>>
>> Changwei
>>
>>
>> On 2018/5/10 9:13, Changwei Ge wrote:
>>>
>>> On 2018/5/10 8:24, piaojun wrote:
>>>> On 2018/5/9 20:01, Changwei Ge wrote:
>>>>> Hi Jun,
>>>>>
>>>>>
>>>>> On 2018/5/9 18:08, piaojun wrote:
>>>>>> Hi Changwei,
>>>>>>
>>>>>> On 2018/4/13 13:51, Changwei Ge wrote:
>>>>>>> If cluster scale exceeds 16 nodes, bio will be full and
>>>>>>> bio_add_page()
>>>>>>> returns 0 when adding pages to bio. Returning -EIO to
>>>>>>> o2hb_read_slots()
>>>>>>> from o2hb_setup_one_bio() will lead to losing chance to allocate more
>>>>>>> bios to present all heartbeat region.
>>>>>>>
>>>>>>> So o2hb_read_slots() fails.
>>>>>>>
>>>>>>> In my test, making fs fails in starting o2cb service.
>>>>>>>
>>>>>>> Attach error log:
>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 0, vec_len =
>>>>>>> 4096, vec_start = 0
>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 1, vec_len =
>>>>>>> 4096, vec_start = 0
>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 2, vec_len =
>>>>>>> 4096, vec_start = 0
>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 3, vec_len =
>>>>>>> 4096, vec_start = 0
>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 4, vec_len =
>>>>>>> 4096, vec_start = 0
>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 5, vec_len =
>>>>>>> 4096, vec_start = 0
>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 6, vec_len =
>>>>>>> 4096, vec_start = 0
>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 7, vec_len =
>>>>>>> 4096, vec_start = 0
>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 8, vec_len =
>>>>>>> 4096, vec_start = 0
>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 9, vec_len =
>>>>>>> 4096, vec_start = 0
>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 10, vec_len =
>>>>>>> 4096, vec_start = 0
>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 11, vec_len =
>>>>>>> 4096, vec_start = 0
>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 12, vec_len =
>>>>>>> 4096, vec_start = 0
>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 13, vec_len =
>>>>>>> 4096, vec_start = 0
>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 14, vec_len =
>>>>>>> 4096, vec_start = 0
>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 15, vec_len =
>>>>>>> 4096, vec_start = 0
>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:463 page 16, vec_len =
>>>>>>> 4096, vec_start = 0
>>>>>>> (mkfs.ocfs2,27479,2):o2hb_setup_one_bio:471 ERROR: Adding page[16]
>>>>>>> to bio failed, page ffffea0002d7ed40, len 0, vec_len 4096,
>>>>>>> vec_start 0, bi_sector 8192
>>>>>>> (mkfs.ocfs2,27479,2):o2hb_read_slots:500 ERROR: status = -5
>>>>>>> (mkfs.ocfs2,27479,2):o2hb_populate_slot_data:1911 ERROR: status = -5
>>>>>>> (mkfs.ocfs2,27479,2):o2hb_region_dev_write:2012 ERROR: status = -5
>>>>>>>
>>>>>>> Fixes: ba16ddfbeb9d ("ocfs2/o2hb: check len for bio_add_page() to
>>>>>>> avoid getting incorrect bio"
>>>>>>>
>>>>>>> Signed-off-by: Changwei Ge <ge.changwei at h3c.com>
>>>>>>> ---
>>>>>>>     fs/ocfs2/cluster/heartbeat.c | 8 ++++++--
>>>>>>>     1 file changed, 6 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/fs/ocfs2/cluster/heartbeat.c
>>>>>>> b/fs/ocfs2/cluster/heartbeat.c
>>>>>>> index 91a8889abf9b..2809e29d612d 100644
>>>>>>> --- a/fs/ocfs2/cluster/heartbeat.c
>>>>>>> +++ b/fs/ocfs2/cluster/heartbeat.c
>>>>>>> @@ -540,11 +540,12 @@ static struct bio *o2hb_setup_one_bio(struct
>>>>>>> o2hb_region *reg,
>>>>>>>         struct bio *bio;
>>>>>>>         struct page *page;
>>>>>>>     +#define O2HB_BIO_VECS 16
>>>>>>>         /* Testing has shown this allocation to take long enough under
>>>>>>>          * GFP_KERNEL that the local node can get fenced. It would be
>>>>>>>          * nicest if we could pre-allocate these bios and avoid this
>>>>>>>          * all together. */
>>>>>>> -    bio = bio_alloc(GFP_ATOMIC, 16);
>>>>>>> +    bio = bio_alloc(GFP_ATOMIC, O2HB_BIO_VECS);
>>>>>>>         if (!bio) {
>>>>>>>             mlog(ML_ERROR, "Could not alloc slots BIO!\n");
>>>>>>>             bio = ERR_PTR(-ENOMEM);
>>>>>>> @@ -570,7 +571,10 @@ static struct bio *o2hb_setup_one_bio(struct
>>>>>>> o2hb_region *reg,
>>>>>>>                  current_page, vec_len, vec_start);
>>>>>> Should we check the validity of 'current_page' before
>>>>>> bio_add_page()? And
>>>>>> that will prevent error happen. Others looks OK.
>>>>> If I understand correctly, you mean we should check current page  is
>>>>> NULL or not?
>>>>> If so I think there is no need since o2hb should guarantee that it has
>>>>> already reserved enough pages for disk heartbeat read/write behalf.
>>>> I mean we could check if 'current_page' equals O2HB_BIO_VECS before
>>>> bio_add_page() to avoid NULL pointer referrence.
>>> Yes, that might work.
>>> I find another problem within this patch.
>>> I will post v2 patch later to fix them all with consideration about
>>> your suggestion.
>>>
>>> Thanks,
>>> Changwei
>>>
>>>> thanks,
>>>> Jun
>>>>
>>>>> Thanks,
>>>>> Changwei
>>>>>> thanks,
>>>>>> Jun
>>>>>>>             len = bio_add_page(bio, page, vec_len, vec_start);
>>>>>>> -        if (len != vec_len) {
>>>>>>> +        if (len == 0 && current_page == O2HB_BIO_VECS) {
>>>>>>> +            /* bio is full now. */
>>>>>>> +            goto bail;
>>>>>>> +        } else if (len != vec_len) {
>>>>>>>                 mlog(ML_ERROR, "Adding page[%d] to bio failed, "
>>>>>>>                      "page %p, len %d, vec_len %u, vec_start %u, "
>>>>>>>                      "bi_sector %llu\n", current_page, page, len,
>>>>>>>
>>>>>> _______________________________________________
>>>>>> Ocfs2-devel mailing list
>>>>>> Ocfs2-devel at oss.oracle.com
>>>>>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel



More information about the Ocfs2-devel mailing list