[Ocfs2-users] Auto reboot when running fio benchmarking

Joseph Qi joseph.qi at huawei.com
Tue Jan 12 20:15:20 PST 2016


Hi Nguyen,
If the problem can be resolved by enlarging O2CB_HEARTBEAT_THRESHOLD,
it is probably caused by the heavy load of your fio performance test.
OCFS2 must write its disk heartbeat periodically to say "I'm alive". Once the
heartbeat times out, the node is fenced, and the default fencing behavior is
to restart the machine.
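
For reference, a minimal sketch of the knob involved (the timeout formula is the
one usually documented for o2cb; please verify the values against your own
/etc/default/o2cb):

# /etc/default/o2cb (excerpt)
# o2cb declares a node dead after (O2CB_HEARTBEAT_THRESHOLD - 1) * 2 seconds
# without a heartbeat write, so 31 -> 60s (the default) and 601 -> 1200s.
O2CB_HEARTBEAT_THRESHOLD=601

# the o2cb service normally has to be restarted for the new value to take effect:
service o2cb restart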

Thanks,
Joseph

On 2016/1/13 11:45, Nguyen Xuan. Hai wrote:
> Hi,
> 
> Thank you for your support.
> I resolved the problem by increasing the value of O2CB_HEARTBEAT_THRESHOLD 
> in the file "/etc/default/o2cb".
> By default, O2CB_HEARTBEAT_THRESHOLD = 31. I changed it to 601 and the 
> benchmark now completes without rebooting.
> But I am wondering what the root cause is.
> 
> Brs,
> Hai Nguyen
> 
> On 12/16/2015 12:15 PM, Eric Ren wrote:
>> Hi,
>>
>> On Wed, Dec 16, 2015 at 10:47:05AM +0700, Nguyen Xuan. Hai wrote:
>>> Hi Eric,
>>>
>>> I greatly appreciate your support.
>>>
>>> About md0 (the raid0 device), I think it is not the cause, because currently
>>> I do not use md0 to run the benchmark. I am using a normal disk partition
>>> with LVM and an OCFS2 file system. Additionally, I tried running the benchmark
>>> on other machines (which have no raid device) but the
>>> phenomenon is the same.
>> You cannot use another problem to prove something ;-) Please take a look at cLVM,
>> and refer to [1],[2].
>>
>> Use case matters! OCFS2 can be used in several different scenarios.
>> As for your case, running ocfs2 on an lvm partition is well supported with pcmk
>> (a rough sketch follows the links below). With o2cb, theoretically, you could set up
>> cLVM manually and then run ocfs2 on it, but that may be difficult; I have not seen
>> any documentation about this usage.
>>
>> [1] https://www.suse.com/documentation/sle_ha/book_sleha/data/sec_ha_clvm_config.html
>> [2] https://www.suse.com/documentation/sle_ha/book_sleha/data/cha_ha_config_example.html
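>>
>> For instance, here is a rough, unverified sketch of the usual pcmk layering (the
>> resource agent names and the device/mount paths are assumptions in the style of
>> [1]/[2], not taken from your setup):
>>
>> crm configure primitive dlm ocf:pacemaker:controld op monitor interval=60s
>> crm configure primitive clvm ocf:lvm2:clvmd op monitor interval=60s
>> crm configure primitive ocfs2-fs ocf:heartbeat:Filesystem \
>>     params device=/dev/vg_cluster/lv_test directory=/mnt/fio4G fstype=ocfs2 \
>>     op monitor interval=20s
>> crm configure group base-group dlm clvm ocfs2-fs
>> crm configure clone base-clone base-group meta interleave=true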
>>
>> Thanks,
>> Eric
>>> Thanks,
>>> Hai Nguyen
>>>
>>> On 12/16/2015 10:26 AM, Eric Ren wrote:
>>>> Hi,
>>>>
>>>> On Tue, Dec 15, 2015 at 04:20:46PM +0700, Nguyen Xuan. Hai wrote:
>>>>> Hi Eric,
>>>>>
>>>>> 1. I am using o2cb cluster stack.
>>>> I'm not familiar with o2cb, so I've CCed Joseph; maybe he can
>>>> give some help.
>>>>> 2. The scenarios that led to the reboot: random writes with a fixed file
>>>>> size. This is one example of these scenarios:
>>>>>
>>>>> [global]
>>>>> directory=/mnt/fio4G
>>>>> filename=fio_data
>>>>> invalidate=1
>>>>> ioengine=libaio
>>>>> direct=1
>>>>> ;ramp_time=30
>>>>> iodepth=1
>>>>>
>>>>> [RandWR-512-ASync-Depth1-Thread1-NoGrp-4G-Fix]
>>>>> new_group
>>>>> rw=randwrite
>>>>> bs=512
>>>>> size=4g
>>>>> numjobs=1
>>>>> group_reporting
>>>>>
>>>>> [RandWR-4k-ASync-Depth1-Thread1-NoGrp-4G-Fix]
>>>>> new_group
>>>>> rw=randwrite
>>>>> bs=4k
>>>>> size=4g
>>>>> numjobs=1
>>>>> group_reporting
>>>>>
>>>>> [RandWR-64k-ASync-Depth1-Thread1-NoGrp-4G-Fix]
>>>>> new_group
>>>>> rw=randwrite
>>>>> bs=64k
>>>>> size=4g
>>>>> numjobs=1
>>>>> group_reporting
>>>>>
>>>>> [RandWR-1m-ASync-Depth1-Thread1-NoGrp-4G-Fix]
>>>>> new_group
>>>>> rw=randwrite
>>>>> bs=1m
>>>>> size=4g
>>>>> numjobs=1
>>>>> group_reporting
>>>> Well, we tested ocfs2 with iozone and fio on the pcmk stack, and random
>>>> writes are OK. Actually, I'm interested in what you expect ocfs2 to do for you ;-)
>>>>> 3. I've attached the log files (kernel log, system log, message
>>>>> log). Please take a look.
>>>> Thanks. I only scanned the kernel log, and picked up the unusual messages here:
>>>> ---
>>>>    8470 Dec  9 03:25:06 skerlet kernel: [    3.835795] md: md0 stopped.
>>>>    8471 Dec  9 03:25:06 skerlet kernel: [    3.836408] md: bind<sda7>
>>>>    8472 Dec  9 03:25:06 skerlet kernel: [    3.837104] md: bind<sdb7>
>>>>    8473 Dec  9 03:25:06 skerlet kernel: [    3.837742] md: raid0 personality registered for level 0
>>>> ...
>>>>    8531 Dec  9 03:25:06 skerlet kernel: [   11.507721] OCFS2 Node Manager 1.5.0
>>>>    8532 Dec  9 03:25:06 skerlet kernel: [   11.550409] OCFS2 DLM 1.5.0
>>>>    8533 Dec  9 03:25:06 skerlet kernel: [   11.554791] ocfs2: Registered cluster interface o2cb
>>>>    8534 Dec  9 03:25:06 skerlet kernel: [   11.565655] OCFS2 DLMFS 1.5.0
>>>>    8535 Dec  9 03:25:06 skerlet kernel: [   11.565778] OCFS2 User DLM kernel interface loaded
>>>>    8536 Dec  9 03:25:06 skerlet kernel: [   12.205284] fuse init (API version 7.18)
>>>>    8537 Dec  9 03:25:06 skerlet kernel: [   13.198880] RPC: Registered named UNIX socket transport module.
>>>>    8538 Dec  9 03:25:06 skerlet kernel: [   13.198883] RPC: Registered udp transport module.
>>>>    8539 Dec  9 03:25:06 skerlet kernel: [   13.198884] RPC: Registered tcp transport module.
>>>>    8540 Dec  9 03:25:06 skerlet kernel: [   13.198885] RPC: Registered tcp NFSv4.1 backchannel transport module.
>>>>    8541 Dec  9 03:25:06 skerlet kernel: [   13.473735] Installing knfsd (copyright (C) 1996 okir at monad.swb.de).
>>>>    8542 Dec  9 03:25:07 skerlet kernel: [   13.626296] svc: failed to register lockdv1 RPC service (errno 97).
>>>>    8543 Dec  9 03:25:07 skerlet kernel: [   13.626382] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
>>>>    8544 Dec  9 03:25:07 skerlet kernel: [   13.643385] NFSD: starting 90-second grace period
>>>>    8545 Dec  9 03:25:08 skerlet kernel: [   14.760868] r8169 0000:02:00.0: eth0: link up
>>>>    8546 Dec  9 03:25:08 skerlet kernel: [   14.761229] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
>>>>    8547 Dec  9 03:25:10 skerlet kernel: [   17.022745] sshd (1835): /proc/1835/oom_adj is deprecated, please use /proc/1835/oom_score_adj instead.
>>>>    8548 Dec  9 03:25:18 skerlet kernel: [   25.293365] eth0: no IPv6 routers present
>>>>    8549 Dec  9 03:26:03 skerlet kernel: [   70.179116] OCFS2 1.5.0
>>>>    8550 Dec  9 03:26:03 skerlet kernel: [   70.259140] o2dlm: Joining domain 7CA3345642C24049B6CE5DE419528F3C ( 1 ) 1 nodes
>>>>    8551 Dec  9 03:26:03 skerlet kernel: [   70.259697] ocfs2: Slot 0 on device (253,0) was already allocated to this node!
>>>>    8552 Dec  9 03:26:03 skerlet kernel: [   70.262011] ocfs2: File system on device (253,0) was not unmounted cleanly, recovering it.
>>>>    8553 Dec  9 03:26:03 skerlet kernel: [   70.275936] ocfs2: Mounting device (253,0) on (node 1, slot 0) with ordered data mode.
>>>>    8554 Dec  9 03:37:27 skerlet kernel: [  752.501739] o2dlm: Leaving domain 7CA3345642C24049B6CE5DE419528F3C
>>>>    8555 Dec  9 03:37:28 skerlet kernel: [  753.803611] ocfs2: Unmounting device (253,0) on (node 1)
>>>>    8556 Dec  9 03:38:08 skerlet kernel: [  792.852741] (o2hb-1BF749BE01,2185,0):o2hb_check_own_slot:590 ERROR: Another node is heartbeating on device (md0): expected(1:0x0, 0x313045444f4e49), ondisk(59:0x0, 0x313045444f4e49)
>>>>    8557 Dec  9 03:38:16 skerlet kernel: [  800.860197] o2dlm: Joining domain 1BF749BE01CC449189BB493D549B8D31 ( 1 ) 1 nodes
>>>>    8558 Dec  9 03:38:16 skerlet kernel: [  800.862930] JBD2: no valid journal superblock found
>>>>    8559 Dec  9 03:38:16 skerlet kernel: [  800.862936] (mount.ocfs2,2184,0):ocfs2_journal_wipe:1045 ERROR: status = -22
>>>>    8560 Dec  9 03:38:16 skerlet kernel: [  800.862940] (mount.ocfs2,2184,0):ocfs2_check_volume:2465 ERROR: status = -22
>>>>    8561 Dec  9 03:38:16 skerlet kernel: [  800.862943] (mount.ocfs2,2184,0):ocfs2_check_volume:2527 ERROR: status = -22
>>>>    8562 Dec  9 03:38:16 skerlet kernel: [  800.862946] (mount.ocfs2,2184,0):ocfs2_mount_volume:1903 ERROR: status = -22
>>>>    8563 Dec  9 03:38:20 skerlet kernel: [  804.894651] o2dlm: Leaving domain 1BF749BE01CC449189BB493D549B8D31
>>>>    8564 Dec  9 03:38:20 skerlet kernel: [  804.894865] ocfs2: Unmounting device (9,0) on (node 1)
>>>>    8565 Dec  9 03:38:20 skerlet kernel: [  804.894871] (mount.ocfs2,2184,2):ocfs2_fill_super:1230 ERROR: status = -22
>>>>    8566 Dec  9 03:38:37 skerlet kernel: [  822.242693] o2dlm: Joining domain 1BF749BE01CC449189BB493D549B8D31 ( 1 ) 1 nodes
>>>>    8567 Dec  9 03:38:41 skerlet kernel: [  826.278128] o2dlm: Leaving domain 1BF749BE01CC449189BB493D549B8D31
>>>>    8568 Dec  9 03:45:55 skerlet kernel: [ 1259.696919] o2dlm: Joining domain BD60713F60434B938DD07321526907DE ( 1 ) 1 nodes
>>>>    8569 Dec  9 03:45:55 skerlet kernel: [ 1259.707796] JBD2: Ignoring recovery information on journal
>>>>    8570 Dec  9 03:45:55 skerlet kernel: [ 1259.715268] ocfs2: Mounting device (9,0) on (node 1, slot 0) with ordered data mode.
>>>>    8571 Dec  9 03:48:10 skerlet kernel: [ 1394.248484] o2dlm: Joining domain 7CA3345642C24049B6CE5DE419528F3C ( 1 ) 1 nodes
>>>>    8572 Dec  9 03:48:10 skerlet kernel: [ 1394.257546] ocfs2: Mounting device (253,0) on (node 1, slot 0) with ordered data mode.
>>>>    8573 Dec  9 05:47:53 skerlet kernel: [ 8560.370451] o2dlm: Leaving domain 7CA3345642C24049B6CE5DE419528F3C
>>>>    8574 Dec  9 05:47:54 skerlet kernel: [ 8561.615904] ocfs2: Unmounting device (253,0) on (node 1)
>>>>    8575 Dec  9 05:58:55 skerlet kernel: [ 9221.153369] INFO: task flush-9:0:2319 blocked for more than 120 seconds.
>>>>    8576 Dec  9 05:58:55 skerlet kernel: [ 9221.153373] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>> ...backtrace...
>>>> ---
>>>> 1. Was device (253,0) the md0 device? If so, "md: md0 stopped" may be the cause.
>>>> 2. I don't think md raid0 can be used as a shared disk for ocfs2. If you run mkfs.ocfs2
>>>>      with the local option, it should be fine (see the sketch after this list). IOW, if you
>>>>      want ocfs2 to show its cluster ability, you shouldn't use md raid0 as the shared disk;
>>>>      if you do, it's no surprise that the write case fails, because native md has no
>>>>      clustering ability. Have a try, and let us know.
>>>> 3. AFAIK, a cluster md feature (belonging to md) is coming soon to do that.
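>>>>
>>>> As a minimal sketch of point 2 (the device name and label are only examples),
>>>> formatting the md device for single-node use avoids the cluster heartbeat entirely:
>>>>
>>>>      mkfs.ocfs2 -M local -L fio-local /dev/md0
>>>>      mount -t ocfs2 /dev/md0 /mnt/fio4G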
>>>>
>>>> Thanks,
>>>> Eric
>>>>
>>>>> Thank you so much,
>>>>> Hai Nguyen
>>>>>
>>>>>
>>>>> On 12/15/2015 4:09 PM, Eric Ren wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On Thu, Dec 03, 2015 at 03:19:52PM +0700, Nguyen Xuan. Hai wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I'm benchmarking an OCFS2 file system on LVM using the fio tool.
>>>>>>> There are some scenarios where, after running for a few minutes, the
>>>>>>> computer reboots automatically. These scenarios only occur on the
>>>>>>> OCFS2 file system (there is no problem with ext3).
>>>>>>>
>>>>>>> We tried adding the option "--debug" to the fio command (for example:
>>>>>>> ./fio RandWR-ASync-IOdepth1-FixFileSize --output=RandWR-ASync-IOdepth1-FixFileSize.out --debug=io).
>>>>>>> Some scenarios then run successfully without rebooting, but some
>>>>>>> scenarios still fail.
>>>>>> Sorry for the late reply. Could you provide more information, such as:
>>>>>> 1. Which cluster stack are you using, o2cb or pcmk (a quick way to check is
>>>>>>      sketched after this list)? If pcmk, an ocfs2 RA monitor timeout will
>>>>>>      trigger fencing, i.e. a reboot. I have not experienced rebooting when
>>>>>>      using o2cb, and am wondering if o2cb has a similar fencing mechanism.
>>>>>>      A kernel panic may also cause a reboot sometimes.
>>>>>> 2. What scenarios led to the reboot?
>>>>>> 3. All logs: kernel logs, and pacemaker logs if pcmk.
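>>>>>>
>>>>>> A quick, rough way to check which stack a node is running (the paths are the
>>>>>> usual ones; please verify on your distribution):
>>>>>>
>>>>>>      cat /sys/fs/ocfs2/cluster_stack       # prints "o2cb" for the classic stack
>>>>>>      /etc/init.d/o2cb status               # classic o2cb configuration and state
>>>>>>      ps -e | egrep 'corosync|pacemaker'    # pcmk stack daemons, if any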
>>>>>>
>>>>>> Thanks,
>>>>>> Eric
>>>>>>> We tried upgrading the Linux kernel from 3.4.34 to 3.10.65 (after
>>>>>>> referring to https://oss.oracle.com/pipermail/ocfs2-users/2014-February/006130.html).
>>>>>>> Some scenarios then run successfully without rebooting, but some
>>>>>>> scenarios still fail.
>>>>>>>
>>>>>>> This is content of file "RandWR-ASync-IOdepth1-FixFileSize":
>>>>>>> [global]
>>>>>>> directory=/mnt/fio4G
>>>>>>> filename=fio_data
>>>>>>> invalidate=1
>>>>>>> ioengine=libaio
>>>>>>> direct=1
>>>>>>> ;ramp_time=30
>>>>>>> iodepth=1
>>>>>>>
>>>>>>> [RandWR-512-ASync-Depth1-Thread1-NoGrp-4G-Fix]
>>>>>>> new_group
>>>>>>> rw=randwrite
>>>>>>> bs=512
>>>>>>> size=4g
>>>>>>> numjobs=1
>>>>>>> group_reporting
>>>>>>>
>>>>>>> [RandWR-4k-ASync-Depth1-Thread1-NoGrp-4G-Fix]
>>>>>>> new_group
>>>>>>> rw=randwrite
>>>>>>> bs=4k
>>>>>>> size=4g
>>>>>>> numjobs=1
>>>>>>> group_reporting
>>>>>>>
>>>>>>> [RandWR-64k-ASync-Depth1-Thread1-NoGrp-4G-Fix]
>>>>>>> new_group
>>>>>>> rw=randwrite
>>>>>>> bs=64k
>>>>>>> size=4g
>>>>>>> numjobs=1
>>>>>>> group_reporting
>>>>>>>
>>>>>>> [RandWR-1m-ASync-Depth1-Thread1-NoGrp-4G-Fix]
>>>>>>> new_group
>>>>>>> rw=randwrite
>>>>>>> bs=1m
>>>>>>> size=4g
>>>>>>> numjobs=1
>>>>>>> group_reporting
>>>>>>>
>>>>>>> [RandWR-512-ASync-Depth1-Thread4-Grp-1G-Fix]
>>>>>>> new_group
>>>>>>> rw=randwrite
>>>>>>> bs=512
>>>>>>> size=1g
>>>>>>> numjobs=4
>>>>>>> group_reporting
>>>>>>>
>>>>>>> [RandWR-4k-ASync-Depth1-Thread4-Grp-1G-Fix]
>>>>>>> new_group
>>>>>>> rw=randwrite
>>>>>>> bs=4k
>>>>>>> size=1g
>>>>>>> numjobs=4
>>>>>>> group_reporting
>>>>>>>
>>>>>>> [RandWR-64k-ASync-Depth1-Thread4-Grp-1G-Fix]
>>>>>>> new_group
>>>>>>> rw=randwrite
>>>>>>> bs=64k
>>>>>>> size=1g
>>>>>>> numjobs=4
>>>>>>> group_reporting
>>>>>>>
>>>>>>> [RandWR-1m-ASync-Depth1-Thread4-Grp-1G-Fix]
>>>>>>> new_group
>>>>>>> rw=randwrite
>>>>>>> bs=1m
>>>>>>> size=1g
>>>>>>> numjobs=4
>>>>>>> group_reporting
>>>>>>>
>>>>>>> Could you help me find out the reason?
>>>>>>>
>>>>>>> Thanks and Best regards,
>>>>>>>
>>>>>>> -- 
>>>>>>> =====================================================================
>>>>>>> Nguyen Xuan Hai (Mr)
>>>>>>>
>>>>>>> Toshiba Software Development (Vietnam) Co.,Ltd
>>>>>>>
>>>>>>> =====================================================================
>>>>>>>
>>>>>
>>> -- 
>>> =====================================================================
>>> Nguyen Xuan Hai (Mr)
>>>
>>> Toshiba Software Development (Vietnam) Co.,Ltd
>>> 13th Floor, VIT Tower, 519 Kim Ma street, Ba Dinh District, Hanoi, Vietnam
>>> tel:    +84-4-2220 8801 (Ext. 187)
>>> e-mail: hai.nguyenxuan at toshiba-tsdv.com
>>> =====================================================================
>>