[Ocfs2-users] Private Interconnect and self fencing
Sunil Mushran
Sunil.Mushran at oracle.com
Fri Jul 28 13:39:54 CDT 2006
Do you have a netdump server configured? If so, it will have the details
of the heartbeat (hb) timeout.
Jeffery P. Humes wrote:
> I have set it to 30 seconds, and the same thing still happens.
>
> (15,1):o2hb_write_timeout:164 ERROR: Heartbeat write timeout to device
> etherd/e0.1p1 after 30000 milliseconds
> (15,1):o2hb_stop_all_regions:1789 ERROR: stopping heartbeat on all
> active regions.
> Kernel panic - not syncing: ocfs2 is very sorry to be fencing this
> system by panicing
>
> [<c01233de>] panic+0x3e/0x174
> [<f8cc826a>] o2quo_disk_timeout+0x0/0x2 [ocfs2_nodemanager]
> [<c01313f8>] run_workqueue+0x7f/0xba
> [<f8cc6b15>] o2hb_write_timeout+0x0/0x65 [ocfs2_nodemanager]
> [<c0131be5>] worker_thread+0x0/0x117
> [<c0131ccb>] worker_thread+0xe6/0x117
> [<c011daa9>] default_wake_function+0x0/0xc
> [<c01344fd>] kthread+0x9d/0xc9
> [<c0134460>] kthread+0x0/0xc9
> [<c0102005>] kernel_thread_helper+0x5/0xb
>
> -JPH
>
>
> Sunil Mushran wrote:
>> The 12 sec default is low. Bump it up to 30 secs or even higher. The
>> FAQ has the details.
>> Note that the higher you set it, the longer the brown-out time.
>>
>> Jeffery P. Humes wrote:
>>> I have an OCFS2 filesystem on a Coraid AoE device.
>>> It mounts fine, but under heavy I/O the server self-fences, claiming a
>>> write timeout:
>>>
>>> (16,2):o2hb_write_timeout:164 ERROR: Heartbeat write timeout to
>>> device etherd/e0.1p1 after 12000 milliseconds
>>> (16,2):o2hb_stop_all_regions:1789 ERROR: stopping heartbeat on all
>>> active regions.
>>> Kernel panic - not syncing: ocfs2 is very sorry to be fencing this
>>> system by panicing
>>>
>>> Is it correct that OCFS2 expects its only heartbeat to be on disk,
>>> on the same disk that I am writing to?
>>>
>>> Is there any way, as with other clustering setups, to set up a
>>> different heartbeat or even multiple heartbeats, for example on a
>>> crossover link between servers or on a private interface?
>>> It seems that putting it only on a disk that may be under heavy I/O
>>> is going to cause problems.
>>>
>>> Any advice on setting up the heartbeats would be greatly appreciated.
>>>
>>> Thanks,
>>>
>>> -JPH
>>>
>>>
>>> _______________________________________________
>>> Ocfs2-users mailing list
>>> Ocfs2-users at oss.oracle.com
>>> http://oss.oracle.com/mailman/listinfo/ocfs2-users
>>>
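
For anyone working out the numbers in this thread: the 12000 and 30000
millisecond values in the logs above match the relation, as I recall it from
the OCFS2 FAQ of that era, timeout = (threshold - 1) * 2 seconds, where the
threshold is the O2CB_HEARTBEAT_THRESHOLD setting. The sketch below is only an
illustration under that assumption. The 2-second heartbeat interval and all
function names are mine, not part of any OCFS2 tool, so check the FAQ for your
version before relying on it.

```python
import math
import re

# Assumption: the disk heartbeat fires every 2 seconds, and the node fences
# after (threshold - 1) missed iterations. Verify against the OCFS2 FAQ.
HEARTBEAT_INTERVAL_SECS = 2


def threshold_for_timeout(seconds, interval=HEARTBEAT_INTERVAL_SECS):
    """Smallest threshold whose timeout is at least `seconds` (hypothetical helper)."""
    return math.ceil(seconds / interval) + 1


def timeout_for_threshold(threshold, interval=HEARTBEAT_INTERVAL_SECS):
    """Timeout in seconds implied by a given threshold (hypothetical helper)."""
    return (threshold - 1) * interval


# Matches the o2hb message quoted in this thread once line wraps are rejoined.
_TIMEOUT_RE = re.compile(
    r"Heartbeat write timeout to device (\S+) after (\d+) milliseconds"
)


def parse_write_timeout(line):
    """Return (device, timeout_ms) from an o2hb_write_timeout line, or None."""
    match = _TIMEOUT_RE.search(line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


if __name__ == "__main__":
    for secs in (12, 30, 60):
        print(secs, "secs -> threshold", threshold_for_timeout(secs))
```

Under that assumption, the 12 sec default corresponds to a threshold of 7 and
a 30 sec timeout to a threshold of 16. The parser is there only to pull the
device and timeout out of a log line when comparing several fencing events.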