[Ocfs2-users] Private Interconnect and self fencing

Sunil Mushran Sunil.Mushran at oracle.com
Fri Jul 28 13:39:54 CDT 2006


Do you have a netdump server configured? If so, it will have the details
of the heartbeat timeout.

Jeffery P. Humes wrote:
> I have set it to 30 seconds, and the same thing still happens.
>
> (15,1):o2hb_write_timeout:164 ERROR: Heartbeat write timeout to device 
> etherd/e0.1p1 after 30000 milliseconds
> panic+0x3e/0x174    (15,1):o2hb_stop_all_regions:1789 ERROR: stopping 
> heartbeat on all active regions.
> Kernel panic - not syncing: ocfs2 is very sorry to be fencing this 
> system by panicing
>
> [<c01233de>]
> [<f8cc826a>] o2quo_disk_timeout+0x0/0x2 [ocfs2_nodemanager]
> [<c01313f8>] run_workqueue+0x7f/0xba
> [<f8cc6b15>] o2hb_write_timeout+0x0/0x65 [ocfs2_nodemanager]
> [<c0131be5>] worker_thread+0x0/0x117
> [<c0131ccb>] worker_thread+0xe6/0x117
> [<c011daa9>] default_wake_function+0x0/0xc
> [<c01344fd>] kthread+0x9d/0xc9
> [<c0134460>] kthread+0x0/0xc9
> [<c0102005>] kernel_thread_helper+0x5/0xb
>
> -JPH
>
>
> Sunil Mushran wrote:
>> The 12 sec default is low. Bump it up to 30 secs or even higher. FAQ 
>> has the details.
>> The higher you set it to, the longer the brown-out time.
>>
>> Jeffery P. Humes wrote:
>>> I have an OCFS2 filesystem on a coraid AOE device.
>>> It mounts fine, but with heavy I/O the server self fences claiming a 
>>> write timeout:
>>>
>>> (16,2):o2hb_write_timeout:164 ERROR: Heartbeat write timeout to 
>>> device etherd/e0.1p1 after 12000 milliseconds
>>> (16,2):o2hb_stop_all_regions:1789 ERROR: stopping heartbeat on all 
>>> active regions.
>>> Kernel panic - not syncing: ocfs2 is very sorry to be fencing this 
>>> system by panicing
>>>
>>> Is my understanding correct that OCFS2 expects its only heartbeat to 
>>> be on disk, on the same disk that I am writing to?
>>>
>>> Is there any way, as with other clustering setups, to set up a 
>>> different heartbeat, or even multiple heartbeats? For example, on a 
>>> crossover cable between servers, or on a private interface?
>>> Putting the heartbeat only on the disk, which may be under heavy I/O, 
>>> seems likely to cause problems.
>>>
>>> Any advice on setting up the heartbeats would be greatly appreciated.
>>>
>>> Thanks,
>>>
>>> -JPH
>>>
>>>
>>> _______________________________________________
>>> Ocfs2-users mailing list
>>> Ocfs2-users at oss.oracle.com
>>> http://oss.oracle.com/mailman/listinfo/ocfs2-users
>>>   
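
For anyone tuning this later: the 12000 ms and 30000 ms figures in the logs above should follow from the o2cb heartbeat threshold, which the FAQ describes as (timeout in seconds / 2) + 1. The sketch below works through that arithmetic. It assumes the 2-second heartbeat interval from the FAQ; the function names are made up for illustration, and the variable name O2CB_HEARTBEAT_THRESHOLD should be checked against the o2cb configuration on your own release.

```python
# Sketch: relate the o2cb heartbeat threshold to the disk heartbeat timeout.
# Assumption: one heartbeat write every 2 seconds, as described in the FAQ.
HEARTBEAT_INTERVAL_MS = 2000


def timeout_ms(threshold: int) -> int:
    """Write timeout in milliseconds implied by a heartbeat threshold."""
    if threshold < 2:
        raise ValueError("threshold must be at least 2")
    return (threshold - 1) * HEARTBEAT_INTERVAL_MS


def threshold_for(timeout_secs: int) -> int:
    """Smallest threshold whose timeout is at least timeout_secs."""
    if timeout_secs <= 0:
        raise ValueError("timeout must be positive")
    interval_secs = HEARTBEAT_INTERVAL_MS // 1000
    # Ceiling division, so an odd number of seconds is rounded up.
    return -(-timeout_secs // interval_secs) + 1


if __name__ == "__main__":
    for secs in (12, 30, 60):
        t = threshold_for(secs)
        print(f"{secs:>3} s -> O2CB_HEARTBEAT_THRESHOLD={t} ({timeout_ms(t)} ms)")
```

Under these assumptions the 12 second default corresponds to a threshold of 7 and the 30 second setting to a threshold of 16. As noted above, a larger value also lengthens the brown-out time.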
