[Ocfs2-users] Auto reboot when running fio benchmarking

Nguyen Xuan. Hai hai.nguyenxuan at toshiba-tsdv.com
Tue Dec 15 01:20:46 PST 2015


Hi Eric,

1. I am using o2cb cluster stack.
2. The scenarios that led to the reboot: random writes with a fixed file size.
This is an example of these scenarios:

[global]
directory=/mnt/fio4G
filename=fio_data
invalidate=1
ioengine=libaio
direct=1
;ramp_time=30
iodepth=1

[RandWR-512-ASync-Depth1-Thread1-NoGrp-4G-Fix]
new_group
rw=randwrite
bs=512
size=4g
numjobs=1
group_reporting

[RandWR-4k-ASync-Depth1-Thread1-NoGrp-4G-Fix]
new_group
rw=randwrite
bs=4k
size=4g
numjobs=1
group_reporting

[RandWR-64k-ASync-Depth1-Thread1-NoGrp-4G-Fix]
new_group
rw=randwrite
bs=64k
size=4g
numjobs=1
group_reporting

[RandWR-1m-ASync-Depth1-Thread1-NoGrp-4G-Fix]
new_group
rw=randwrite
bs=1m
size=4g
numjobs=1
group_reporting

3. I've attached the log files (kernel log, system log, message log). 
Please take a look.
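
In case it helps with reproducing the problem, here is a minimal sketch of how the job sections under item 2 could be regenerated from a list of block sizes. It is not part of our actual test setup: the function names (job_section, build_job_file) are hypothetical, and the option values are simply copied from the job file above.

```python
# Hypothetical generator for the random-write job sections listed under item 2.
# The option values are copied from the job file in this mail; the function
# names are illustrative only.

GLOBAL_OPTIONS = [
    "directory=/mnt/fio4G",
    "filename=fio_data",
    "invalidate=1",
    "ioengine=libaio",
    "direct=1",
    ";ramp_time=30",  # kept commented out, as in the original job file
    "iodepth=1",
]


def job_section(bs, size, numjobs):
    """Return one fio job section as text, named like the sections in this mail."""
    # Single-job sections are named "NoGrp", multi-job sections "Grp".
    grouping = "NoGrp" if numjobs == 1 else "Grp"
    name = f"RandWR-{bs}-ASync-Depth1-Thread{numjobs}-{grouping}-{size.upper()}-Fix"
    lines = [
        f"[{name}]",
        "new_group",
        "rw=randwrite",
        f"bs={bs}",
        f"size={size}",
        f"numjobs={numjobs}",
        "group_reporting",
    ]
    return "\n".join(lines)


def build_job_file(block_sizes, size="4g", numjobs=1):
    """Return a complete job file: the [global] section plus one section per block size."""
    parts = ["\n".join(["[global]"] + GLOBAL_OPTIONS)]
    parts += [job_section(bs, size, numjobs) for bs in block_sizes]
    return "\n\n".join(parts) + "\n"


if __name__ == "__main__":
    # Prints the same four sections as the example above.
    print(build_job_file(["512", "4k", "64k", "1m"]), end="")
```

Calling build_job_file(["512"], size="1g", numjobs=4) would give the 4-thread variant of a section. fio also has a --section option for running one section of a job file at a time, which might help narrow down which block size triggers the reboot.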

Thank you so much,
Hai Nguyen


On 12/15/2015 4:09 PM, Eric Ren wrote:
> Hi,
>
> On Thu, Dec 03, 2015 at 03:19:52PM +0700, Nguyen Xuan. Hai wrote:
>> Hi all,
>>
>> I'm benchmarking an OCFS2 file system on LVM using the fio tool.
>> In some scenarios, the computer reboots automatically a few minutes
>> after the run starts. These scenarios are related to the
>> OCFS2 file system only (there is no problem with ext3).
>>
>> We tried to fix this by adding the "--debug" option to the fio command
>> (example: fio RandWR-ASync-IOdepth1-FixFileSize
>> --output=RandWR-ASync-IOdepth1-FixFileSize.out --debug=io).
>> With it, some scenarios run successfully without rebooting, but
>> others still fail.
> Sorry for the late reply. Could you provide more information? Such as:
> 1. Which cluster stack were you using, o2cb or pcmk? If pcmk, an ocfs2 RA monitor timeout
>     will trigger fencing, i.e. a reboot. I have not experienced rebooting when using o2cb, and am
>     wondering if o2cb has a similar fencing mechanism. A kernel panic may also cause a
>     reboot sometimes.
> 2. What scenarios led to the reboot?
> 3. All logs: kernel logs, and pacemaker logs if pcmk.
>
> Thanks,
> Eric
>> We tried upgrading the Linux kernel from 3.4.34 to 3.10.65 (after
>> referring to this link: https://oss.oracle.com/pipermail/ocfs2-users/2014-February/006130.html).
>> Some scenarios then run successfully without rebooting, but
>> others still fail.
>>
>> This is the content of the file "RandWR-ASync-IOdepth1-FixFileSize":
>> [global]
>> directory=/mnt/fio4G
>> filename=fio_data
>> invalidate=1
>> ioengine=libaio
>> direct=1
>> ;ramp_time=30
>> iodepth=1
>>
>> [RandWR-512-ASync-Depth1-Thread1-NoGrp-4G-Fix]
>> new_group
>> rw=randwrite
>> bs=512
>> size=4g
>> numjobs=1
>> group_reporting
>>
>> [RandWR-4k-ASync-Depth1-Thread1-NoGrp-4G-Fix]
>> new_group
>> rw=randwrite
>> bs=4k
>> size=4g
>> numjobs=1
>> group_reporting
>>
>> [RandWR-64k-ASync-Depth1-Thread1-NoGrp-4G-Fix]
>> new_group
>> rw=randwrite
>> bs=64k
>> size=4g
>> numjobs=1
>> group_reporting
>>
>> [RandWR-1m-ASync-Depth1-Thread1-NoGrp-4G-Fix]
>> new_group
>> rw=randwrite
>> bs=1m
>> size=4g
>> numjobs=1
>> group_reporting
>>
>> [RandWR-512-ASync-Depth1-Thread4-Grp-1G-Fix]
>> new_group
>> rw=randwrite
>> bs=512
>> size=1g
>> numjobs=4
>> group_reporting
>>
>> [RandWR-4k-ASync-Depth1-Thread4-Grp-1G-Fix]
>> new_group
>> rw=randwrite
>> bs=4k
>> size=1g
>> numjobs=4
>> group_reporting
>>
>> [RandWR-64k-ASync-Depth1-Thread4-Grp-1G-Fix]
>> new_group
>> rw=randwrite
>> bs=64k
>> size=1g
>> numjobs=4
>> group_reporting
>>
>> [RandWR-1m-ASync-Depth1-Thread4-Grp-1G-Fix]
>> new_group
>> rw=randwrite
>> bs=1m
>> size=1g
>> numjobs=4
>> group_reporting
>>
>> Could you help me find out the reason?
>>
>> Thanks and Best regards,
>>
>> -- 
>> =====================================================================
>> Nguyen Xuan Hai (Mr)
>>
>> Toshiba Software Development (Vietnam) Co.,Ltd
>>
>> =====================================================================
>>
>> -- 
>> This mail was scanned by BitDefender
>> For more information please visit http://www.bitdefender.com
>> _______________________________________________
>> Ocfs2-users mailing list
>> Ocfs2-users at oss.oracle.com
>> https://oss.oracle.com/mailman/listinfo/ocfs2-users
>

-- 
=====================================================================
Nguyen Xuan Hai (Mr)

Toshiba Software Development (Vietnam) Co.,Ltd

=====================================================================

-------------- next part --------------
A non-text attachment was scrubbed...
Name: log_benchmarking.rar
Type: application/octet-stream
Size: 146677 bytes
Desc: not available
Url : http://oss.oracle.com/pipermail/ocfs2-users/attachments/20151215/69cfcf86/attachment-0001.obj 