[Ocfs2-users] AoE+ocfs2 = Heartbeat write timeout to device

b52 at entrap.de b52 at entrap.de
Sat Mar 8 15:32:14 PST 2008


> Sunil,
>
>     Can I configure this heartbeat to use high-priority (realtime)
> scheduling?
>
>      If I simply increase the timeout, it could still time out in heavy
> I/O situations, like several different threads queuing large amounts of
> writes. The kernel should know this is a high-priority write so that
> it is put at the head of the queue.
>
> Regards,
> Luis
>
> Sunil Mushran <Sunil.Mushran at oracle.com> wrote:
> The older 12 sec default timeout was too low. It has been bumped
> up to 60 secs. The FAQ has details on this.

I had already searched for that error, and it seems I misunderstood the
timeout section of the FAQ. But changing the timeouts is a workaround, not
a solution. The heartbeat I/O should be prioritized.

Anyway it is a workaround and I will give it a try, thank you very much.
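
For reference, here is how I now read the numbers. This is only a rough
sketch: the function names are my own, the 2-second heartbeat interval and
the "(threshold - 1) * 2 seconds" rule are my reading of the FAQ entry on
O2CB_HEARTBEAT_THRESHOLD, and the transfer estimate assumes an ideal link
with no AoE/Ethernet overhead and no caching, so real numbers will be worse.

```python
import math

# Assumption (from my reading of the FAQ): o2hb writes one heartbeat
# every 2 seconds, and the timeout is (threshold - 1) intervals.
HB_INTERVAL_MS = 2000


def heartbeat_timeout_ms(threshold: int) -> int:
    """Timeout in milliseconds implied by a given O2CB_HEARTBEAT_THRESHOLD."""
    return (threshold - 1) * HB_INTERVAL_MS


def threshold_for_timeout(timeout_s: float) -> int:
    """Smallest threshold that gives at least timeout_s seconds."""
    return math.ceil(timeout_s * 1000 / HB_INTERVAL_MS) + 1


def transfer_seconds(size_bytes: int, link_bits_per_s: float) -> float:
    """Lower bound on the time to push size_bytes over the link."""
    return size_bytes * 8 / link_bits_per_s


if __name__ == "__main__":
    # Threshold 7 matches the 12000 ms in the error message.
    print(heartbeat_timeout_ms(7))                 # 12000
    # The 60 sec timeout Sunil mentions corresponds to threshold 31.
    print(threshold_for_timeout(60))               # 31

    mib = 2 ** 20
    fast_ethernet = 100e6                          # 100 Mbit/s
    # dd bs=1M count=100 versus count=1000 on an ideal 100 Mbit link.
    print(round(transfer_seconds(100 * mib, fast_ethernet), 1))   # 8.4
    print(round(transfer_seconds(1000 * mib, fast_ethernet), 1))  # 83.9
```

If that is right, it would fit what I see: the small write can drain in
under 12 seconds, while the large one keeps the link busy for well over a
minute, so a heartbeat write queued behind it can miss the 12 sec deadline.
Note that the same estimate is also above 60 seconds.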

Cheers,
Holger

>
> b52 at entrap.de wrote:
>> Hi,
>>
>> I have a problem involving 100Mbit Ethernet, AoE and ocfs2. I set up 2
>> boxes connected via 100Mbit Ethernet to their ATA-over-Ethernet storage.
>> The ocfs2 filesystem resides on such an AoE partition. If I produce high
>> throughput to that ocfs2 partition on one node, it reboots after some
>> seconds.
>>
>> I use dd for testing, like dd if=/dev/zero of=test bs=1M count=1000
>> If I write 100 MB of data to the disk, everything is fine. If I write
>> 1 GB of data to the disk, the node reboots after some seconds and prints
>> the following error:
>>
>> (9,0):o2hb_write_timeout:167 ERROR: Heartbeat write timeout to device
>> etherd/e402.0 after 12000 milliseconds
>> (9,0):o2hb_stop_all_regions:1865 ERROR: stopping heartbeat on all active
>> regions.
>>
>> This can't be caused by lost heartbeat packets. I set up a separate
>> network for the heartbeat to track down this problem.
>>
>> I know that 100Mbit Ethernet is a bottleneck, but this should not
>> cause the system to reboot, right? Even if I could switch to Gigabit
>> Ethernet, it may become the bottleneck in the future.
>>
>> Has anyone experienced this already? Do you know how to solve this issue?
>> Please help, I need to do some tests.
>> Your help is really appreciated.
>>
>> Cheers,
>> Holger
>>
>>
>> _______________________________________________
>> Ocfs2-users mailing list
>> Ocfs2-users at oss.oracle.com
>> http://oss.oracle.com/mailman/listinfo/ocfs2-users
>>